
K-means clustering using a center of gravity for grid-based sample  

Lee, Sun-Myung (Department of Statistics, Changwon National University)
Park, Hee-Chang (Department of Statistics, Changwon National University)
Publication Information
Journal of the Korean Data and Information Science Society, v.21, no.1, 2010, pp. 121-128
Abstract
K-means clustering is an iterative algorithm in which items are moved among a set of clusters until the desired partition is reached. It has been widely used in applications such as market research, pattern analysis and recognition, and image processing, and it can identify dense and sparse regions among data attributes or object attributes. However, the k-means algorithm can require many hours to produce the k clusters we want, because it is a primitive, exploratory procedure. In this paper we propose a new method of k-means clustering that uses the center of gravity of a grid-based sample. It is faster than traditional clustering methods while maintaining accuracy.
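The abstract does not spell out the procedure, so the following is only a minimal sketch of one plausible reading: bin the data into square grid cells, replace each occupied cell by the center of gravity (mean) of its points, and run a count-weighted k-means on those representatives instead of on the full data set. The function names, the `cell_size` parameter, the count weighting, and the random initialization are all assumptions made for illustration and are not taken from the paper.

```python
import math
import random
from collections import defaultdict


def grid_centroids(points, cell_size):
    """Bin points into square grid cells (assumed layout) and return
    each occupied cell's center of gravity together with its point count."""
    cells = defaultdict(list)
    for p in points:
        key = tuple(math.floor(x / cell_size) for x in p)
        cells[key].append(p)
    reps, weights = [], []
    for members in cells.values():
        n = len(members)
        reps.append(tuple(sum(c) / n for c in zip(*members)))
        weights.append(n)
    return reps, weights


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(p, centers):
    """Index of the center closest to point p."""
    return min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))


def weighted_kmeans(reps, weights, k, iters=100, seed=0):
    """Lloyd-style k-means on cell representatives, weighted by cell counts.
    Requires k <= len(reps). Initialization is a random sample (assumption)."""
    rng = random.Random(seed)
    centers = rng.sample(reps, k)
    dim = len(reps[0])
    for _ in range(iters):
        labels = [nearest(r, centers) for r in reps]
        new_centers = []
        for j in range(k):
            idx = [i for i, lab in enumerate(labels) if lab == j]
            if not idx:
                new_centers.append(centers[j])  # keep the center of an empty cluster
                continue
            total = sum(weights[i] for i in idx)
            new_centers.append(tuple(
                sum(weights[i] * reps[i][d] for i in idx) / total
                for d in range(dim)))
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return centers


# Toy data: two small groups far apart (values chosen for illustration only).
points = [(0.25, 0.25), (0.75, 0.75), (5.25, 5.25), (5.75, 5.75)]
reps, weights = grid_centroids(points, cell_size=1.0)
centers = weighted_kmeans(reps, weights, k=2)
# Final step: label the original points by their nearest center.
labels = [nearest(p, centers) for p in points]
```

Because each iteration touches one representative per occupied cell rather than every observation, the cost per iteration depends on the number of occupied cells, which is where a speed-up of the kind the abstract claims could come from under these assumptions.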
Keywords
Data mining; grid-based sampling; k-means clustering
Citations & Related Records
Times Cited By KSCI: 4
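1 박희창, 유지현, 이성용 (2003). Clustering algorithm by grid-based sampling. Journal of the Korean Data and Information Science Society, 14, 535-543.
2 박희창, 조광현 (2005). Modeling of environmental survey data using the K-means clustering method. Journal of the Korean Data and Information Science Society, 16, 557-566.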
3 Chu, S. C., Roddick, J. F. and Pan, J. S. (2002a). Efficient k-medoids algorithms using multi-centroids with multi-runs sampling scheme. Proceedings of International Workshop on Mining Data across Multiple Customer Touchpoints for CRM, 14-25.
4 Chu, S. C., Roddick, J. F. and Pan, J. S. (2002b). An incremental multi-centroid, multi-run sampling scheme for k-medoids-based algorithms-extended report. Proceedings of The Third International Conference on Data Mining Methods and Databases, 553-562.
5 Huang, Z. (1997a). Clustering large data sets with mixed numeric and categorical values. Proceedings of The First Pacific-Asia Conference on Knowledge Discovery and Data Mining, 21-34.
6 Huang, Z. (1997b). A fast clustering algorithm to cluster very large categorical data sets in data mining. Proceedings of ACM SIGMOD Workshop on Data Mining and Knowledge Discovery, 146-151.
7 Kaufman, L. and Rousseeuw, P. J. (1990). Finding groups in data: An introduction to cluster analysis, John Wiley and Sons.
8 Kim, D. W. and Chae, Y. G. (2005). More efficient k-modes clustering algorithm. Journal of the Korean Data and Information Science Society, 16, 549-556.
9 MacQueen, J. (1967). Some methods for classification and analysis of multivariate observations. Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, 281-297.
10 Ng, R. and Han, J. (1994). Efficient and effective clustering method for spatial data mining. Proceedings of International Conference of Very Large Data Bases, 144-155.
11 Park, H. C. and Lee, S. M. (2005). K-means clustering using grid-based representatives. Journal of the Korean Data and Information Science Society, 16, 759-768.