http://dx.doi.org/10.5351/CKSS.2005.12.3.683

Cluster Analysis Using Principal Coordinates for Binary Data  

Chae, Seong-San (Department of Information and Statistics, Daejeon University)
Kim, Jeong-Il (Department of Information and Statistics, Daejeon University)
Publication Information
Communications for Statistical Applications and Methods / v.12, no.3, 2005, pp. 683-696
Abstract
This paper investigates the effect of applying principal coordinates before cluster analysis, using samples of multiple binary outcomes. The ability of a known clustering algorithm to retrieve clusters improves significantly when principal coordinates are used instead of distances transformed directly from four association coefficients for multiple binary variables.
Keywords
Agglomerative Clustering Algorithm; Principal Coordinates; Association Coefficients
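The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: the abstract does not name its four association coefficients or its clustering algorithm, so the simple matching coefficient, the transform d = sqrt(1 - s), average linkage, the Rand index as the retrieval measure, and the simulated two-group data are all assumptions chosen for illustration. The helper names (`simple_matching_distance`, `principal_coordinates`, `rand_index`) are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist, squareform


def simple_matching_distance(X):
    """Distance d = sqrt(1 - s), with s the simple matching coefficient (assumed)."""
    # For 0/1 data the Hamming proportion equals 1 - s.
    return squareform(np.sqrt(pdist(X, metric="hamming")))


def principal_coordinates(D, k):
    """Classical principal coordinates from a square distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ (D ** 2) @ J              # doubly centered squared distances
    vals, vecs = np.linalg.eigh(B)           # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]       # indices of the k largest
    vals, vecs = vals[order], vecs[:, order]
    # Scale eigenvectors by sqrt(eigenvalue); negative values are clipped to 0.
    return vecs * np.sqrt(np.clip(vals, 0.0, None))


def rand_index(a, b):
    """Proportion of object pairs on which two partitions agree."""
    a, b = np.asarray(a), np.asarray(b)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    iu = np.triu_indices(len(a), k=1)        # each unordered pair once
    return float(np.mean(same_a[iu] == same_b[iu]))


# Hypothetical data: two groups of 20 objects on 10 binary variables.
rng = np.random.default_rng(0)
X = np.vstack([rng.random((20, 10)) < 0.8,
               rng.random((20, 10)) < 0.2]).astype(int)
truth = np.repeat([1, 2], 20)

D = simple_matching_distance(X)
Y = principal_coordinates(D, k=2)

# Route 1: cluster directly on the transformed distances.
Z_direct = linkage(squareform(D, checks=False), method="average")
labels_direct = fcluster(Z_direct, t=2, criterion="maxclust")

# Route 2: cluster on Euclidean distances between principal coordinates.
Z_pcoa = linkage(Y, method="average")
labels_pcoa = fcluster(Z_pcoa, t=2, criterion="maxclust")

print("Rand index, direct distances:     ", rand_index(truth, labels_direct))
print("Rand index, principal coordinates:", rand_index(truth, labels_pcoa))
```

A single simulated data set like this says nothing about which route retrieves clusters better in general; the sketch only shows where principal coordinates enter the pipeline. Keeping only the first `k` coordinates is a design choice, and retaining all coordinates with positive eigenvalues reproduces the input distances whenever they are Euclidean, as they are for this particular transform.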