Clustering Algorithm for Sequences of Categorical Values

  • 오승준 (Department of Industrial Engineering, Hanyang University)
  • 김재련 (Department of Industrial Engineering, Hanyang University)
  • Published: 2003.03.01

Abstract

We study a clustering algorithm for sequences of categorical values. Clustering is a data mining problem that has received significant attention from the database community. Traditional clustering algorithms deal with numerical or categorical data points. However, many important databases store sequences of categorical data. In this paper, we introduce a new similarity measure and develop a hierarchical clustering algorithm. An experimental section shows the performance of the proposed approach.
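To make the setting concrete, the following is a minimal sketch of agglomerative (hierarchical) clustering over categorical sequences. The abstract does not define the paper's similarity measure, so this sketch assumes a stand-in: the length of the longest common subsequence divided by the length of the longer sequence. The function names (`lcs_length`, `similarity`, `cluster_similarity`, `agglomerate`), the average-link merge rule, and the sample data are all illustrative assumptions, not the authors' method.

```python
from itertools import combinations


def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Assumed stand-in measure: LCS length divided by the longer length."""
    if not a and not b:
        return 1.0
    return lcs_length(a, b) / max(len(a), len(b))


def cluster_similarity(c1, c2, seqs):
    """Average-link similarity between two clusters of sequence indices."""
    total = sum(similarity(seqs[i], seqs[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerate(seqs, k):
    """Merge the two most similar clusters until only k clusters remain."""
    clusters = [[i] for i in range(len(seqs))]
    while len(clusters) > k:
        # combinations yields p < q, so deleting q never shifts p.
        p, q = max(
            combinations(range(len(clusters)), 2),
            key=lambda pq: cluster_similarity(clusters[pq[0]], clusters[pq[1]], seqs),
        )
        clusters[p] = clusters[p] + clusters[q]
        del clusters[q]
    return clusters


# Hypothetical sample data: four short sequences of categorical symbols.
sequences = [list("ABCD"), list("ABCE"), list("XYZ"), list("XYZW")]
result = agglomerate(sequences, 2)
print(result)  # [[0, 1], [2, 3]]
```

The naive loop recomputes every pairwise cluster similarity at each merge, which is adequate for a small illustration but would need caching or a different linkage strategy for large databases.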
