http://dx.doi.org/10.7236/JIIBC.2013.13.5.183

Clustering of Categorical Data using Rough Entropy

Park, Inkyoo (Dept. of Computer Science, Joongbu Univ.)
Publication Information
The Journal of the Institute of Internet, Broadcasting and Communication / v.13, no.5, 2013, pp. 183-188
Abstract
A variety of cluster analysis techniques are a prerequisite for grouping objects with similar characteristics in data mining. However, these algorithms have considerable difficulty dealing with categorical data in databases. The imprecise handling of uncertainty in categorical data during clustering stems from relying solely on the algebraic logic of rough sets, which degrades stability and effectiveness. This paper proposes an information-theoretic rough entropy (RE) that takes into account the dependency among attributes, and a technique called min-mean-mean roughness (MMMR) for selecting the clustering attribute. We analyze and compare the performance of the proposed technique against K-means, fuzzy techniques, and standard-deviation roughness methods on the ZOO dataset. The results verify the better performance of the proposed approach.
Keywords
Cluster analysis; Clustering; Rough Set; Rough Entropy; Uncertainty;
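
The abstract does not define RE or MMMR, so the following is only a minimal sketch of the classical rough-set building block that roughness-based attribute selection starts from: the Pawlak roughness of a set, 1 - |lower approximation| / |upper approximation|, averaged over the values of one categorical attribute with respect to another. It is not the paper's rough entropy or its MMMR criterion. The function names and the toy records are hypothetical and are not taken from the ZOO dataset.

```python
from collections import defaultdict


def partition(rows, attr):
    """Group row indices by their value of `attr` (equivalence classes)."""
    classes = defaultdict(set)
    for index, row in enumerate(rows):
        classes[row[attr]].add(index)
    return dict(classes)


def roughness(target, classes):
    """Pawlak roughness of `target`: 1 - |lower| / |upper|.

    lower = union of classes fully contained in target,
    upper = union of classes that intersect target.
    """
    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:
            lower |= block
        if block & target:
            upper |= block
    return 1.0 - len(lower) / len(upper) if upper else 0.0


def mean_roughness(rows, attr, given):
    """Average roughness of each value set of `attr` w.r.t. attribute `given`."""
    classes = partition(rows, given)
    scores = [roughness(t, classes) for t in partition(rows, attr).values()]
    return sum(scores) / len(scores)


# Hypothetical toy records (illustrative only, not the ZOO dataset).
animals = [
    {"hair": "yes", "milk": "yes", "legs": "4"},
    {"hair": "yes", "milk": "yes", "legs": "4"},
    {"hair": "no", "milk": "no", "legs": "2"},
    {"hair": "no", "milk": "no", "legs": "0"},
    {"hair": "no", "milk": "yes", "legs": "0"},
]

if __name__ == "__main__":
    for attr in ("hair", "milk", "legs"):
        for given in ("hair", "milk", "legs"):
            if attr != given:
                print(attr, "given", given, mean_roughness(animals, attr, given))
```

A lower mean roughness means the value sets of one attribute are described more crisply by the other. An attribute-selection rule would then aggregate these pairwise scores, and how the paper does so is not stated in the abstract.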