
Cluster Feature Selection using Entropy Weighting and SVD  

Lee, Young-Seok (Dept. of Computer, Soongsil University)
Lee, Soo-Won (Dept. of Computer, Soongsil University)
Abstract
Clustering groups objects with similar properties into the same cluster. Singular value decomposition (SVD) is known as an efficient preprocessing method for clustering high-dimensional, sparse data such as e-commerce data sets, because it reduces dimensionality and eliminates noise. However, the data set transformed by SVD loses information about the original attributes, so it is hard to evaluate their worth. This research proposes a cluster feature selection method, called ENTROPY-SVD, that finds the important attributes of each cluster using entropy weighting and SVD. SVD exploits the latent structure in the associations between attributes and similar objects, and entropy weighting identifies the attributes that are highly dense in each cluster. This paper also proposes a model-based collaborative filtering recommendation system built on ENTROPY-SVD, called CFS-CF, and evaluates its efficiency and utility.
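The abstract does not state the weighting formula, so the following is only a rough sketch of what per-cluster entropy weighting could look like, not the authors' ENTROPY-SVD procedure. It assumes binary object-attribute rows and cluster labels that were already obtained elsewhere (for example, by clustering in an SVD-reduced space). The function name `cluster_feature_weights` and the specific weight, the attribute's density in a cluster scaled by one minus its normalized entropy across clusters, are illustrative assumptions borrowed from log-entropy term weighting.

```python
import math


def cluster_feature_weights(rows, labels):
    """Hypothetical per-cluster attribute weights (an assumption, not the paper's formula).

    rows   -- list of equal-length lists of 0/1 attribute indicators
    labels -- cluster id for each row, assumed to come from a prior clustering step
    Returns {cluster_id: [weight for each attribute]}.
    """
    clusters = sorted(set(labels))
    n_attrs = len(rows[0])

    # counts[c][j] = number of objects in cluster c that have attribute j
    counts = {c: [0] * n_attrs for c in clusters}
    sizes = {c: 0 for c in clusters}
    for row, c in zip(rows, labels):
        sizes[c] += 1
        for j, value in enumerate(row):
            counts[c][j] += value

    weights = {c: [0.0] * n_attrs for c in clusters}
    for j in range(n_attrs):
        # Density: fraction of each cluster's objects that have attribute j.
        density = {c: counts[c][j] / sizes[c] for c in clusters}
        total = sum(density.values())
        if total == 0:
            continue  # attribute never occurs; leave its weights at zero

        # Entropy of the attribute's density distribution over clusters.
        entropy = 0.0
        for c in clusters:
            q = density[c] / total
            if q > 0:
                entropy -= q * math.log(q)

        # Normalize to [0, 1]; an attribute spread evenly over clusters gets 1.
        normalized = entropy / math.log(len(clusters)) if len(clusters) > 1 else 0.0

        # Dense in this cluster and concentrated in few clusters -> high weight.
        for c in clusters:
            weights[c][j] = density[c] * (1.0 - normalized)
    return weights
```

Under these assumptions, an attribute that every object in one cluster has and no object elsewhere has receives the maximum weight for that cluster, while an attribute that is equally common in all clusters receives a weight near zero everywhere.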
Keywords
SVD; Feature Selection; Clustering; Singular Value Decomposition; Entropy Weighting;