Reputation Analysis of Document Using Probabilistic Latent Semantic Analysis Based on Weighting Distinctions  

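Cho, Shi-Won (Department of Electrical Engineering, College of Engineering, Dongguk University)
Lee, Dong-Wook (Department of Electrical Engineering, College of Engineering, Dongguk University)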
Publication Information
The Transactions of The Korean Institute of Electrical Engineers / v.58, no.3, 2009, pp. 632-638
Abstract
Probabilistic Latent Semantic Analysis (PLSA) has many applications in information retrieval and filtering, natural language processing, machine learning from text, and related areas. In this paper, we propose an algorithm that uses a weighted PLSA model to find contextual phrases and opinions in documents. Traditional keyword search cannot find the semantic relations between phrases; overcoming this limitation requires techniques for automatically classifying those relations. Through experiments, we show that the proposed algorithm discovers semantic relations between phrases and represents them in a vector-space model. The proposed algorithm can support a variety of analyses, including document classification, online reputation analysis, and collaborative recommendation.
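The abstract does not spell out the weighting scheme, so the following is only a minimal sketch of how term weights might enter a standard PLSA fit by EM, not the authors' algorithm. It assumes the asymmetric formulation P(w|d) = sum_z P(w|z) P(z|d) and assumes that each raw count n(d, w) is simply multiplied by a per-term weight before the usual updates. The function name `plsa`, the `weights` parameter, and the toy data are all hypothetical.

```python
import math
import random


def _normalise(row):
    """Scale a non-negative row to sum to 1 (uniform if the row is all zeros)."""
    total = sum(row)
    if total == 0:  # e.g. an empty document
        return [1.0 / len(row)] * len(row)
    return [x / total for x in row]


def plsa(counts, n_topics, weights=None, n_iter=50, seed=0):
    """Fit asymmetric PLSA, P(w|d) = sum_z P(w|z) P(z|d), by EM.

    counts  : counts[d][w] = occurrences of term w in document d
    weights : optional per-term multipliers (an assumed weighting scheme)
    Returns (p_w_z, p_z_d, history), where history holds the weighted
    log-likelihood measured at the start of each iteration.
    """
    n_docs, n_terms = len(counts), len(counts[0])
    if weights is None:
        weights = [1.0] * n_terms
    # Weighted counts stand in for n(d, w) in the usual PLSA updates.
    n = [[counts[d][w] * weights[w] for w in range(n_terms)]
         for d in range(n_docs)]

    rng = random.Random(seed)
    p_w_z = [_normalise([rng.random() + 1e-3 for _ in range(n_terms)])
             for _ in range(n_topics)]            # p_w_z[z][w]
    p_z_d = [_normalise([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]              # p_z_d[d][z]
    history = []

    for _ in range(n_iter):
        new_w_z = [[0.0] * n_terms for _ in range(n_topics)]
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        log_lik = 0.0
        for d in range(n_docs):
            for w in range(n_terms):
                if n[d][w] == 0:
                    continue
                # E-step: unnormalised posterior P(z | d, w).
                joint = [p_w_z[z][w] * p_z_d[d][z] for z in range(n_topics)]
                norm = sum(joint)
                log_lik += n[d][w] * math.log(norm)
                for z in range(n_topics):
                    expected = n[d][w] * joint[z] / norm
                    new_w_z[z][w] += expected
                    new_z_d[d][z] += expected
        # M-step: renormalise the accumulated expected counts.
        p_w_z = [_normalise(row) for row in new_w_z]
        p_z_d = [_normalise(row) for row in new_z_d]
        history.append(log_lik)
    return p_w_z, p_z_d, history


if __name__ == "__main__":
    # Toy term-document counts: 4 documents, 4 terms (made-up data).
    toy_counts = [[4, 3, 0, 0], [5, 2, 0, 1], [0, 0, 3, 4], [0, 1, 4, 5]]
    toy_weights = [1.0, 0.5, 1.0, 2.0]  # hypothetical term weights
    _, doc_topics, _ = plsa(toy_counts, n_topics=2, weights=toy_weights)
    for d, row in enumerate(doc_topics):
        print(d, [round(p, 3) for p in row])
```

Each row of `p_z_d` is a low-dimensional topic vector for one document, which is one way to read the abstract's remark about placing semantic relations in a vector-space model. Because the weights only rescale the counts, the usual EM guarantee that the (weighted) log-likelihood never decreases still applies.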
Keywords
PLSA; Reputation analysis; EM algorithm