http://dx.doi.org/10.3837/tiis.2020.07.013

A new Ensemble Clustering Algorithm using a Reconstructed Mapping Coefficient  

Cao, Tuoqia (Institute of Information Science, Beijing Jiaotong University)
Chang, Dongxia (Institute of Information Science, Beijing Jiaotong University)
Zhao, Yao (Institute of Information Science, Beijing Jiaotong University)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.14, no.7, 2020, pp. 2957-2980
Abstract
Ensemble clustering integrates multiple basic partitions to obtain a more accurate clustering result than any single partition. However, the transformation from the original space to the integrated space is inevitably incomplete. In this paper, a novel ensemble clustering algorithm using a newly reconstructed mapping coefficient (ECRMC) is proposed. In the algorithm, a reconstructed mapping coefficient between objects and micro-clusters is designed, based on the principle of increasing information entropy, to enhance the effective information. This reduces the information loss in the transformation from micro-clusters to the original space. The correlation between micro-clusters is then calculated using the Spearman coefficient. As a result, the revised co-association graph between objects can be built more accurately, because the supplementary information helps ensure the completeness of the whole conversion process. Experimental results demonstrate that the ECRMC clustering algorithm is effective and feasible and achieves high clustering performance.
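The abstract names three standard building blocks: a co-association relation between objects, information entropy, and the Spearman coefficient. The following is a minimal sketch of those generic ingredients only, written in plain Python. It is not the ECRMC algorithm: the reconstructed mapping coefficient and the micro-cluster construction are not specified in this abstract, and all function names and the toy partitions are hypothetical.

```python
from math import log2


def co_association(partitions):
    """Classic co-association matrix: the fraction of basic partitions
    in which each pair of objects is assigned to the same cluster."""
    n = len(partitions[0])
    m = len(partitions)
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    return [[c / m for c in row] for row in counts]


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the cluster-size distribution of one partition."""
    n = len(labels)
    sizes = {}
    for label in labels:
        sizes[label] = sizes.get(label, 0) + 1
    return -sum((s / n) * log2(s / n) for s in sizes.values())


def average_ranks(values):
    """Ranks starting at 1; tied values receive the average of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = avg
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman coefficient: the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined for constant input; 0.0 is a convention chosen here
    return cov / (vx * vy) ** 0.5


# Toy data (hypothetical): three basic partitions of four objects.
partitions = [[0, 0, 1, 1], [0, 0, 0, 1], [0, 1, 1, 1]]
ca = co_association(partitions)
entropy_first = shannon_entropy(partitions[0])
# Assumption for illustration: compare the co-association profiles of objects 0 and 1.
rho = spearman(ca[0], ca[1])
```

Here the Spearman coefficient is applied to rows of the co-association matrix purely for illustration; the abstract states that the paper applies it to micro-clusters, whose representation is not described here.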
Keywords
Ensemble clustering; supplementary information; reconstructed mapping coefficient; information entropy; Spearman coefficient