[KSCI] Korea Science Citation Index Service

Reputation Analysis of Document Using Probabilistic Latent Semantic Analysis Based on Weighting Distinctions

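Cho, Shi-Won (Department of Electrical Engineering, College of Engineering, Dongguk University)
Lee, Dong-Wook (Department of Electrical Engineering, College of Engineering, Dongguk University)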

Publication Information

The Transactions of The Korean Institute of Electrical Engineers, v.58, no.3, 2009, pp. 632-638

Abstract

Probabilistic Latent Semantic Analysis (PLSA) has many applications in information retrieval and filtering, natural language processing, machine learning from text, and related areas. In this paper, we propose an algorithm that uses a weighted PLSA model to find contextual phrases and opinions in documents. Traditional keyword search cannot find the semantic relations between phrases; overcoming this limitation requires techniques for automatically classifying those relations. Through experiments, we show that the proposed algorithm discovers semantic relations between phrases well and represents them in the vector-space model. The proposed algorithm supports a variety of analyses, including document classification, online reputation analysis, and collaborative recommendation.
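
For orientation, the standard PLSA model fitted by EM (as described in the Hofmann reference in the list further down) can be sketched as follows. This is a minimal illustration, not the authors' algorithm: the abstract does not specify the weighting scheme, so the optional per-word `weights` argument, which simply rescales the counts (equivalent to weighting each word's term in the log-likelihood), is an assumption made only for this example. The function name `plsa` and its parameters are likewise hypothetical.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, weights=None, seed=0):
    """Fit PLSA by EM on a (documents x words) count matrix.

    weights: optional per-word weights. ASSUMPTION for illustration only;
    this is not the weighting scheme of the paper.
    Returns P(z|d), P(w|z), and the log-likelihood at each iteration.
    """
    rng = np.random.default_rng(seed)
    n = np.asarray(counts, dtype=float)
    if weights is not None:
        n = n * np.asarray(weights, dtype=float)  # broadcast over documents
    n_docs, n_words = n.shape
    eps = 1e-12  # guards against division by zero and log(0)

    # Random initialisation; each row is normalised to a distribution.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)

    log_liks = []
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        joint = p_z_d[:, :, None] * p_w_z[None, :, :]  # (docs, topics, words)
        p_w_d = joint.sum(axis=1)                      # P(w|d), (docs, words)
        log_liks.append(float((n * np.log(p_w_d + eps)).sum()))
        post = joint / (p_w_d[:, None, :] + eps)

        # M-step: re-estimate both distributions from expected counts.
        expected = n[:, None, :] * post                # (docs, topics, words)
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z, log_liks

# Toy usage: four documents over a six-word vocabulary.
toy_counts = [[4, 3, 5, 0, 0, 0],
              [5, 4, 3, 0, 0, 1],
              [0, 0, 1, 4, 5, 3],
              [0, 0, 0, 3, 4, 5]]
doc_topics, topic_words, history = plsa(toy_counts, n_topics=2, n_iter=30)
```

The dense (documents x topics x words) array keeps the sketch short but is only practical for small corpora; a realistic implementation would iterate over the non-zero counts instead.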

Keywords

PLSA; Reputation analysis; EM algorithm

Citations & Related Records

Times Cited By KSCI: 1
Times Cited By SCOPUS: 0

References

1	Bo Pang and Lillian Lee, Opinion mining and sentiment analysis, Foundations and Trends in Information Retrieval, vol. 2, no. 1-2, pp. 1-135, 2008
2	T. Hofmann, Unsupervised learning by probabilistic latent semantic analysis, Machine Learning, vol. 42, no. 1-2, pp. 177-196, 2001
3	R. A. Boyles, On the convergence of the EM algorithm, J. Roy. Statist. Soc. B, vol. 45, no. 1, pp. 47-50, 1983
4	C. F. J. Wu, On the convergence properties of the EM algorithm, Ann. Statist., vol. 11, no. 1, pp. 95-103, 1983
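5	Kim Sung-Soo and Kang Ji-Hye, A new fast EM algorithm, Korean Institute of Information Scientists and Engineers (KIISE), Journal of KIISE: Computer Systems and Theory, vol. 31, no. 9-10, pp. 575-587, 2004
6	Lee Kyung-Chan and Kang Seung-Shik, An automatic document classification system based on a weight calculation method for category-representative terms, Proceedings of the KIISE Spring Conference 2002, vol. 29, no. 1(B), pp. 475-477, 2002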
7	H. Shimodaira, Improving predictive inference under covariate shift by weighting the log-likelihood function, Journal of Statistical Planning and Inference, vol. 90, pp. 227-244, 2000
8	H. Chen, R. Perry, and K. Buckley, Direct and EM-based MAP sequence estimation with unknown time-varying channels, Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, vol. 4, pp. 2129-2132, 2001
9	Daniel D. Lee and H. Sebastian Seung, Learning the parts of objects by non-negative matrix factorization, Nature, vol. 401, pp. 788-791, 1999
10	R. J. Kozick and B. M. Sadler, Maximum-likelihood array processing in non-Gaussian noise with Gaussian mixtures, IEEE Trans. on Signal Processing, vol. 48, no. 12, pp. 3520-3535, 2000
11	A. P. Dempster, N. M. Laird, and D. B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, Journal of the Royal Statistical Society, Series B, vol. 39, no. 1, pp. 1-38, 1977
12	Thomas Landauer, P. W. Foltz, and D. Laham, Introduction to latent semantic analysis, Discourse Processes, vol. 25, pp. 259-284, 1998
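13	Korea Institute of Science and Technology Information (KISTI), http://www.kristalinfo.com/K-Lab/Text-CatiKRTC.2003.tar.gz
14	Hong Young-Gook, Lee Jong-Hyeok, and Lee Geun-Bae, A Korean syntactic parser based on dependency grammar, Proceedings of the KIISE Spring Conference 1993, vol. 20, no. 8, pp. 33-46, 1994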
15	G. Salton and C. Buckley, Term weighting approaches in automatic text retrieval, Information Processing and Management, vol. 24, no. 5, pp. 513-523, 1988
