http://dx.doi.org/10.14400/JDC.2014.12.12.293

Clustering Algorithm for Data Mining using Posterior Probability-based Information Entropy  

Park, In-Kyoo (Dept. of Computer.Game, College of Engineering, Joongbu University)
Publication Information
Journal of Digital Convergence / v.12, no.12, 2014, pp. 293-301
Abstract
In this paper, we propose a new measure based on the confidence of the Bayesian posterior probability in order to reduce unimportant information in the clustering process. Because clustering performance depends on selecting the degree of importance of the attributes in a database, the concept of information entropy is combined with the posterior probability to measure attribute discernibility. Hence, the contribution of identical attribute values to the confidence of the proposed measure is considerably reduced by the natural logarithm. The posterior probability-based clustering algorithm therefore selects the minimum attribute reducts and improves the efficiency of clustering. Validation of the proposed algorithm against other algorithms shows its discernibility as well as its ability to handle uncertainty when clustering ACME categorical data.
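To make the idea of a posterior probability-based entropy concrete, here is a minimal Python sketch. The abstract does not state the exact formula, so everything in it is an assumption for illustration: the function names, the use of empirical posteriors P(target = v | given = u) estimated from counts, and the rule of picking the attribute with the lowest mean conditional entropy. It is not the algorithm proposed in the paper.

```python
import math
from collections import Counter, defaultdict

def conditional_entropy(rows, given, target):
    """H(target | given) with the natural log, from empirical posteriors.

    rows is a list of dicts mapping attribute name -> categorical value.
    (Hypothetical helper; not taken from the paper.)
    """
    n = len(rows)
    groups = defaultdict(Counter)
    for row in rows:
        groups[row[given]][row[target]] += 1

    h = 0.0
    for counts in groups.values():
        size = sum(counts.values())
        for c in counts.values():
            p = c / size  # posterior P(target = v | given = u)
            h -= (size / n) * p * math.log(p)
    return h

def mean_conditional_entropy(rows, given):
    """Average H(other | given) over all other attributes."""
    others = [a for a in rows[0] if a != given]
    return sum(conditional_entropy(rows, given, t) for t in others) / len(others)

def pick_splitting_attribute(rows):
    """Assumed selection rule: the attribute that leaves the least
    uncertainty about the remaining attributes."""
    return min(rows[0], key=lambda a: mean_conditional_entropy(rows, a))
```

In this sketch a posterior of 1 contributes log(1) = 0, so attribute values that fully determine another attribute add nothing to the entropy, while evenly split posteriors add the most.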
Keywords
Data Mining; Clustering; Bayesian Posterior Probability; Entropy; Rough Set;