
An Ensemble Clustering Algorithm based on a Prior Knowledge  

Ko, Song (Department of Computer Engineering, Chung-Ang University)
Kim, Dae-Won (Department of Computer Engineering, Chung-Ang University)
Abstract
Although prior knowledge can improve clustering performance, the benefit depends on how that knowledge is used. In particular, when prior knowledge is used to construct the initial centroids of the clusters, the similarities among the labeled objects must be taken into account. Even when some objects in the prior knowledge share the same label, objects with low similarity to one another should be separated. Separating them prevents an initial centroid from being built from colliding, dissimilar objects. The separated prior knowledge can then be used in various ways, for example to produce a variety of initializations. To apply association rules, the proposed method generates a sufficiently large number of clusters, and the initial centroids of those clusters are constructed from the separated prior knowledge. The ensemble of the resulting clusterings outperforms clustering in which the prior knowledge is not separated.
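The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the distance threshold used to separate same-label seeds, the toy data, and the majority-vote co-association step (a stand-in for the association-rule step, which the abstract does not specify) are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def components(n, linked):
    """Group indices 0..n-1 into connected components of the `linked` relation."""
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(n), 2):
        if linked(i, j):
            parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())


def mean(points):
    return tuple(sum(c) / len(points) for c in zip(*points))


def split_seeds(seeds, threshold):
    """Split same-label seeds into subgroups chained by distance <= threshold
    (hypothetical separation rule) and return one centroid per subgroup."""
    by_label = {}
    for point, label in seeds:
        by_label.setdefault(label, []).append(point)
    centroids = []
    for pts in by_label.values():
        close = lambda i, j: dist(pts[i], pts[j]) <= threshold
        for group in components(len(pts), close):
            centroids.append(mean([pts[i] for i in group]))
    return centroids


def kmeans(data, centroids, max_iter=50):
    """Plain k-means started from the given centroids; returns cluster indices."""
    centroids = list(centroids)
    assign = None
    for _ in range(max_iter):
        new = [min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
               for p in data]
        if new == assign:
            break
        assign = new
        for k in range(len(centroids)):
            members = [p for p, a in zip(data, assign) if a == k]
            if members:  # an empty cluster keeps its old centroid
                centroids[k] = mean(members)
    return assign


def ensemble(data, seeds, thresholds):
    """Run one seeded k-means per threshold, then join objects that share a
    cluster in more than half of the runs (co-association majority vote)."""
    n = len(data)
    votes = [[0] * n for _ in range(n)]
    for t in thresholds:
        assign = kmeans(data, split_seeds(seeds, t))
        for i, j in combinations(range(n), 2):
            if assign[i] == assign[j]:
                votes[i][j] += 1
    majority = len(thresholds) / 2
    return components(n, lambda i, j: votes[i][j] > majority)


# Toy data: label "A" covers two distant blobs, with label "B" lying between them.
data = [(0, 0), (0, 1), (1, 0), (10, 0), (10, 1), (11, 0), (5, 0), (5, 1), (6, 0)]
seeds = [((0, 0), "A"), ((10, 0), "A"), ((5, 0), "B")]
clusters = ensemble(data, seeds, thresholds=[2.0, 3.0, 4.0])
```

In this toy setting, averaging the two "A" seeds without separation would place the "A" centroid at the same position as the "B" seed, which is the kind of collision the abstract warns about. Separating the seeds gives each blob its own initial centroid.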
Keywords
clustering; semi-supervised; ensemble method; association rule;