http://dx.doi.org/10.5351/KJAS.2005.18.3.661

Tree-structured Clustering for Continuous Data  

Huh Myung-Hoe (Dept. of Statistics, Korea University)
Yang Kyung-Sook (Brain Korea 21 The Education and Research Group for Korea Studies, Korea University)
Publication Information
The Korean Journal of Applied Statistics / v.18, no.3, 2005, pp. 661-671
Abstract
The aim of this study is to propose a clustering method, called tree-structured clustering, that recursively partitions continuous multivariate data based on an overall $R^2$ criterion with a practical node-splitting decision rule. The method produces easily interpretable, tree-type clustering rules with built-in variable selection. In numerical examples (Fisher's iris data and a Telecom case), we note several differences between tree-structured clustering and K-means clustering.
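The recursive partitioning described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not spell out the node-splitting decision rule, so the sketch assumes the usual definition of overall $R^2$ (one minus the within-cluster sum of squares divided by the total sum of squares) and substitutes a simple stopping rule, namely that a split is accepted only if it raises overall $R^2$ by at least `min_gain`. The names `tree_cluster`, `best_split`, `min_gain`, `min_size` and `max_leaves` are hypothetical.

```python
import numpy as np

def within_ss(X):
    """Sum of squared deviations from the column means (0 for an empty set)."""
    if len(X) == 0:
        return 0.0
    return float(((X - X.mean(axis=0)) ** 2).sum())

def best_split(X, idx, min_size):
    """Best axis-aligned split of rows idx: (ss_reduction, variable, threshold) or None."""
    best = None
    parent_ss = within_ss(X[idx])
    for j in range(X.shape[1]):
        values = np.unique(X[idx, j])
        # Candidate thresholds: midpoints between consecutive distinct values.
        for t in (values[:-1] + values[1:]) / 2.0:
            left = idx[X[idx, j] <= t]
            right = idx[X[idx, j] > t]
            if len(left) < min_size or len(right) < min_size:
                continue
            gain = parent_ss - within_ss(X[left]) - within_ss(X[right])
            if best is None or gain > best[0]:
                best = (gain, j, float(t))
    return best

def tree_cluster(X, min_gain=0.05, min_size=5, max_leaves=8):
    """Greedy tree clustering; stopping rule is an assumption, not the paper's rule."""
    X = np.asarray(X, dtype=float)
    total_ss = within_ss(X)
    leaves = [np.arange(len(X))]   # each leaf holds the row indices of one cluster
    rules = []                     # (variable, threshold, overall R^2 after the split)
    r2 = 0.0
    while len(leaves) < max_leaves:
        candidates = []
        for k, idx in enumerate(leaves):
            s = best_split(X, idx, min_size)
            if s is not None:
                candidates.append((s[0], k, s[1], s[2]))
        if not candidates:
            break
        gain, k, j, t = max(candidates)
        if gain / total_ss < min_gain:   # assumed stopping rule
            break
        idx = leaves.pop(k)
        leaves.append(idx[X[idx, j] <= t])
        leaves.append(idx[X[idx, j] > t])
        r2 += gain / total_ss
        rules.append((j, t, r2))
    labels = np.empty(len(X), dtype=int)
    for c, idx in enumerate(leaves):
        labels[idx] = c
    return labels, rules, r2

# Synthetic data: two groups separated on variable 0; variable 1 is pure noise.
rng = np.random.default_rng(0)
a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
b = rng.normal(loc=[10.0, 0.0], scale=0.5, size=(50, 2))
labels, rules, r2 = tree_cluster(np.vstack([a, b]))
print(rules, r2)
```

Each entry of `rules` reads as "split on variable `j` at threshold `t`", which is the kind of interpretable, tree-type clustering rule the abstract refers to. Variables that never appear in `rules` (here, the noise variable) are in effect left unselected.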
Keywords
Tree-structured clustering; Node splitting; Overall R-Square; K-means clustering; Variable selection