http://dx.doi.org/10.3837/tiis.2018.06.010

A new clustering algorithm based on the connected region generation  

Feng, Liuwei (Institute of Information Science, Beijing Jiaotong University)
Chang, Dongxia (Institute of Information Science, Beijing Jiaotong University)
Zhao, Yao (Institute of Information Science, Beijing Jiaotong University)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.12, no.6, 2018, pp. 2619-2643
Abstract
This paper proposes a new clustering algorithm based on connected region generation (CRG-clustering). It is an effective and robust approach that clusters data according to the connectivity between points and their neighbors. A connected region generating (CRG) algorithm first produces a set of connected regions and an isolated point set. Each connected region corresponds to a homogeneous cluster, which in theory ensures the separability of an arbitrary data set. A region expansion strategy and a consensus criterion then handle the points in the isolated point set. Experimental results on synthetic and real-world datasets show that the proposed algorithm performs well and is insensitive to noise.
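The idea of splitting a data set into connected regions plus an isolated point set can be illustrated with a minimal sketch. It is not the CRG algorithm itself, whose details (importance index, seed points, consensus criterion) are not given in this abstract. The sketch assumes that two points are "connected" when each is among the other's k nearest neighbors, and that any group smaller than `min_size` is treated as isolated. The names `knn_indices`, `connected_regions`, `k`, and `min_size` are hypothetical.

```python
from math import dist


def knn_indices(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: dist(p, points[j]))
        neighbors.append(set(order[:k]))
    return neighbors


def connected_regions(points, k=2, min_size=3):
    """Group points linked by mutual k-NN edges (an assumed connectivity rule).

    Returns (regions, isolated): regions is a list of index lists, and
    isolated is a sorted list of indices in groups smaller than min_size.
    """
    nbrs = knn_indices(points, k)
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in nbrs[i]:
            if i in nbrs[j]:  # keep only mutual-neighbor links
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)

    regions = [g for g in groups.values() if len(g) >= min_size]
    isolated = sorted(i for g in groups.values() if len(g) < min_size for i in g)
    return regions, isolated


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1),
           (10, 10), (10, 11), (11, 10), (11, 11),
           (5, 30)]
    regions, isolated = connected_regions(pts)
    print(regions, isolated)
```

On the two small blobs and the single far-away point in the usage example, the blobs come out as two regions and the far point lands in the isolated set. The second stage described in the abstract, which handles the isolated points through region expansion and a consensus criterion, is not sketched here.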
Keywords
connected region; nearest neighbors; importance index; seed point; region expansion