A New Similarity Measure for Categorical Attribute-Based Clustering |
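Kim, Min (Cognitive Robotics Center, Korea Institute of Science and Technology)
Jeon, Joo-Hyuk (Department of Computer Science, KAIST) Woo, Kyung-Gu (SW Advanced Research Lab, Samsung Advanced Institute of Technology, Samsung Electronics) Kim, Myoung-Ho (Department of Computer Science, KAIST) |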
1 | Z. Huang, A fast clustering algorithm to cluster very large categorical data sets in data mining. Proceedings of SIGMOD Workshop on Research Issues on Data Mining and Knowledge Discovery, pp.1-8, 1997. |
2 | F. Cao, J. Liang and L. Bai, A new initialization method for categorical data clustering, Expert Systems with Applications, vol.36, no.7, pp.10223-10228, 2009. |
3 | M. Al-Razgan, C. Domeniconi and D. Barbara, Random Subspace Ensembles for Clustering Categorical Data. Studies in Computational Intelligence, Springer, 2008. |
4 | B. Broda and M. Piasecki, Experiments in Clustering Documents for Automatic Acquisition of Lexical Semantic Networks for Polish, Proceedings of the 16th International Conference Intelligent Information Systems, pp.203-202, 2008. |
5 | A. M. Fahim, G. Saake, A. M. Salem, F. A. Torkey, and M. A. Ramadan, k-Means for Spherical Clusters with Large Variance in Sizes, Proceedings of World Academy of Science, Engineering and Technology, vol.35, pp.177-182, 2008. |
6 | K. Qin, M. Xu, Y. Du, and S. Yue, Cloud Model and Hierarchical Clustering Based Spatial Data Mining Method and Application, Proceedings of the International Archives of the Photogrammetry, Remote Sensing and Spatial information Sciences, vol.37, pp.241-246, 2008. |
7 | D. H. Fisher, Knowledge acquisition via incremental conceptual clustering. Machine Learning, vol.2, no.2, pp.139-172, 1987. |
8 | M. Gluck and J. Corter, Information, Uncertainty, and the Utility of Categories. Proceedings of Seventh Annual Conference of Cognitive Science Society, pp.283-287, 1985. |
9 | Z. Huang and M.K. Ng, A fuzzy k-modes algorithm for clustering categorical data. IEEE Transactions on Fuzzy Systems, vol.7, no.4, pp.446-452, 1999. |
10 | K.B. McKusick and K. Thompson, COBWEB/3: A portable implementation, Report FIA-90-6-18-2, NASA, Ames Research Center, 1990. |
11 | Y. Reich and S.J. Fenves, The formation and use of abstract concepts in design. Concept Formation: Knowledge and Experience in Unsupervised Learning, Morgan Kaufmann, 1991. |
12 | T. Cover, J. Thomas, Elements of information theory, Wiley InterScience, 1991. |
13 | G. Biswas, J. Weinberg, and C. Li, ITERATE: A conceptual clustering scheme for knowledge discovery in databases. Artificial Intelligence in the Petroleum Industry, B. Braunschweig and R. Day eds., pp.111-139, 1995. |
14 | P. Andritsos, P. Tsaparas, R.J. Miller and K.C. Sevcik, LIMBO: Scalable clustering of categorical data. Proceedings of the 9th International Conference on Extending DataBase Technology (EDBT), 2004. |
15 | D. Barbara, Y. Li and J. Couto, COOLCAT: an entropy-based algorithm for categorical clustering. Proceedings of ACM Conf. on Information and Knowledge Mgt. (CIKM), pp.582-589, 2002. |
16 | D. Hochbaum and D. Shmoys, A best possible heuristic for the k-center problem. Mathematics of Operations Research, vol.10, no.2, pp.180-184, 1985. |
17 | C. J. Merz and P. Murphy, UCI Repository of Machine Learning Databases, 1996. Available from: |
18 | C. Ding, X. He, H. Zha, and H. D. Simon, Adaptive dimension reduction for clustering high dimensional data. Proceedings of Second IEEE International Conference on Data Mining, pp. 147-154, 2002. |
19 | L. Yu and H. Liu, Feature selection for high-dimensional data: a fast correlation-based filter solution. Proceedings of the Twentieth International Conference on Machine Learning, pp.856-863, 2003. |
20 | S. Raychaudhuri, P. D. Sutphin, J. T. Chang, and R. B. Altman, Basic microarray analysis: grouping and feature reduction. Trends in Biotechnology, vol.19, no.5, pp.189-193, 2001. |
21 | J. MacQueen, Some methods for classification and analysis of multivariate observations. Proceedings of the Fifth Berkeley Symp. on Math. Statist. and Prob., vol.1, pp.281-297, 1967. |
22 | S. Guha, R. Rastogi and K. Shim, ROCK: a robust clustering algorithm for categorical attributes. Proceedings of the 15th International Conference on Data Engineering, pp.512-521, 1999. |
23 | Z. Huang, Extensions to the k-means algorithm for clustering large data sets with categorical data. Data Mining and Knowledge Discovery, vol.2, no.3, pp.283-304, 1998. |
24 | L. Kaufman and P. Rousseeuw, Clustering by means of medoids. In Dodge, Y. (Ed.) Statistical Data Analysis based on the L1 Norm. pp.405-416, 1987. |
25 | Z. He, X. Xu and S. Deng, Squeezer: an efficient algorithm for clustering categorical data. Journal of Computer Science and Technology, vol.17, no.5, pp.611-624, 2002. |
26 | P. H. A. Sneath and R. R. Sokal, Numerical Taxonomy: The Principles and Practice of Numerical Classification, W. H. Freeman and Company, 1973. |
27 | J. Han and M. Kamber, Data Mining: Concepts and Techniques, 2nd ed., pp.383-444, Morgan Kaufmann, 2006. |
28 | A. Ahmad and L. Dey, A k-mean clustering algorithm for mixed numeric and categorical data, Data & Knowledge Engineering, vol.63, no.2, pp.503-527, 2007. |
29 | C. Stanfill and D. Waltz, Toward memory-based reasoning, Communications of the ACM, vol.29, no.12, pp.1213-1228, 1986. |