K-means Clustering using Grid-based Representatives

  • Published : 2005.11.30

Abstract

K-means clustering has been widely used in many applications, such as pattern analysis, data analysis, and market research. It can identify dense and sparse regions among data attributes or object attributes. However, the k-means algorithm can require many hours to obtain k clusters, because it is primitive and exploratory. In this paper we propose a new k-means clustering method that uses grid-based representative values (arithmetic mean and trimmed mean) of the sample. It is faster than traditional clustering methods while maintaining accuracy.
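The abstract does not spell out the procedure, so the following is only a minimal sketch of one plausible reading: the data are binned into grid cells, each cell is summarized by a representative value (arithmetic or trimmed mean), and k-means is then run on the representatives instead of on every point. The function names, the `cell_size` and `trim` parameters, and the choice to weight each representative by its cell count are assumptions made for illustration, not details taken from the paper.

```python
import random
from collections import defaultdict


def trimmed_mean(values, trim=0.1):
    """Mean after dropping the lowest and highest `trim` fraction of values.

    With trim=0.0 this is the ordinary arithmetic mean.
    """
    ordered = sorted(values)
    cut = int(len(ordered) * trim)
    kept = ordered[cut:len(ordered) - cut] or ordered  # fall back if all trimmed
    return sum(kept) / len(kept)


def grid_representatives(points, cell_size, trim=0.0):
    """Bin points into square grid cells (assumed layout).

    Returns a list of (representative, count) pairs, one per non-empty cell.
    """
    cells = defaultdict(list)
    for p in points:
        key = tuple(int(c // cell_size) for c in p)
        cells[key].append(p)
    reps = []
    for members in cells.values():
        # Per-coordinate representative value of the points in this cell.
        rep = tuple(trimmed_mean(axis, trim) for axis in zip(*members))
        reps.append((rep, len(members)))
    return reps


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def weighted_kmeans(reps, k, iters=50, seed=0):
    """Lloyd-style k-means on cell representatives, weighted by cell counts."""
    rng = random.Random(seed)
    centers = [rep for rep, _ in rng.sample(reps, k)]
    for _ in range(iters):
        # Assignment step: attach each representative to its nearest center.
        groups = [[] for _ in range(k)]
        for rep, weight in reps:
            nearest = min(range(k), key=lambda j: sq_dist(rep, centers[j]))
            groups[nearest].append((rep, weight))
        # Update step: weighted mean of the representatives in each group.
        new_centers = []
        for j, group in enumerate(groups):
            if not group:
                new_centers.append(centers[j])  # keep an empty cluster's center
                continue
            total = sum(w for _, w in group)
            dim = len(group[0][0])
            new_centers.append(
                tuple(sum(rep[d] * w for rep, w in group) / total
                      for d in range(dim))
            )
        if new_centers == centers:
            break
        centers = new_centers
    return centers
```

Under these assumptions, a call such as `weighted_kmeans(grid_representatives(points, cell_size=1.0, trim=0.1), k=3)` clusters one representative per occupied cell, so the cost of each iteration depends on the number of non-empty cells and not on the number of original points. A coarser grid gives a larger speedup at the price of a rougher approximation.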
