http://dx.doi.org/10.3745/KIPSTD.2003.10D.2.277

Sparse Web Data Analysis Using MCMC Missing Value Imputation and PCA Plot-based SOM  

Jun, Sung-Hae (Department of Statistics, Cheongju University)
Oh, Kyung-Whan (Department of Computer Science, Sogang University)
Abstract
Knowledge discovery from the web has been studied extensively. Web log data, however, are difficult to use as training data for efficient predictive models. In this paper, we study a method that eliminates sparseness from web log data and then clusters web users. The sparseness is removed by imputing missing values through Bayesian inference with MCMC. Web users are then clustered using self-organizing maps based on a 3-D plot of principal components. Finally, using KDD Cup data, our experiments demonstrate the problem-solving process and evaluate its performance.
Keywords
Hybrid MCMC Missing Value Imputation; Self Organizing Maps; Principal Component based Plot
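
The three stages named in the abstract (MCMC imputation, a 3-D principal component projection, and SOM clustering) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, since the abstract does not give the model details: the data are synthetic rather than the KDD Cup logs, each column is modeled as an independent normal with a noninformative prior, PCA is computed by SVD, and the SOM uses a hypothetical 2x2 grid. The function names (mcmc_impute, pca_scores, train_som, assign_clusters) are not from the paper.

```python
import numpy as np

def mcmc_impute(X, n_iter=200, seed=0):
    """Data-augmentation Gibbs sampler; returns one completed data set.

    Assumption: independent normal model per column, noninformative prior.
    """
    rng = np.random.default_rng(seed)
    X = np.array(X, dtype=float)          # work on a copy
    miss = np.isnan(X)
    n = X.shape[0]
    # Initialize missing cells with the observed column means.
    col_mean = np.nanmean(X, axis=0)
    X[miss] = np.take(col_mean, np.where(miss)[1])
    for _ in range(n_iter):
        # P-step: draw (mu, sigma^2) given the currently completed data.
        mean = X.mean(axis=0)
        ss = ((X - mean) ** 2).sum(axis=0)
        sigma2 = ss / rng.chisquare(n - 1, size=X.shape[1])
        mu = rng.normal(mean, np.sqrt(sigma2 / n))
        # I-step: redraw every missing cell from N(mu, sigma^2).
        draws = rng.normal(mu, np.sqrt(sigma2), size=X.shape)
        X[miss] = draws[miss]
    return X

def pca_scores(X, k=3):
    """Project centered data onto the first k principal axes (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def train_som(Z, rows=2, cols=2, n_epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Online SOM with a Gaussian neighborhood and linearly decaying rates."""
    rng = np.random.default_rng(seed)
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    W = Z[rng.choice(len(Z), size=rows * cols, replace=True)].copy()
    n_steps, t = n_epochs * len(Z), 0
    for _ in range(n_epochs):
        for i in rng.permutation(len(Z)):
            frac = t / n_steps
            lr = lr0 * (1 - frac)
            sigma = sigma0 * (1 - frac) + 1e-3
            bmu = np.argmin(((W - Z[i]) ** 2).sum(axis=1))   # best-matching unit
            d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)       # grid distance to BMU
            h = np.exp(-d2 / (2 * sigma ** 2))
            W += lr * h[:, None] * (Z[i] - W)
            t += 1
    return W

def assign_clusters(Z, W):
    """Label each row of Z with the index of its nearest SOM unit."""
    return np.argmin(((Z[:, None, :] - W[None, :, :]) ** 2).sum(axis=2), axis=1)

# Synthetic stand-in for a sparse user-by-page matrix (NaN marks a missing cell).
demo_rng = np.random.default_rng(1)
X = demo_rng.normal(size=(30, 5))
X[demo_rng.random(X.shape) < 0.3] = np.nan

X_full = mcmc_impute(X)          # step 1: remove sparseness
Z = pca_scores(X_full, k=3)      # step 2: 3-D principal component coordinates
W = train_som(Z)                 # step 3: fit the SOM on the 3-D scores
labels = assign_clusters(Z, W)   # cluster index per user
```

The sketch keeps only the final Gibbs draw as a single imputation; a fuller treatment would typically combine several draws taken after a burn-in period.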