http://dx.doi.org/10.9717/kmms.2019.22.2.250

A Method on Associated Document Recommendation with Word Correlation Weights  

Kim, Seonmi (Dept. of Software Convergence Engineering, Chosun University)
Na, InSeop (SW Convergence Education Institute, Chosun University)
Shin, Juhyun (Dept. of ICT Convergence, Chosun University)
Abstract
Big data processing technology and artificial intelligence (AI) are attracting increasing attention, and natural language processing is an important research area within AI. In this paper, we apply LDA-based topic modeling to Korean news articles to extract the topic distribution of each document and the word distribution vector of each topic. We then use Word2vec to represent words as vectors and generate a weight matrix, from which we derive a relevance score that reflects the semantic relationships between words. We propose a method that recommends documents in descending order of this score.
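The scoring step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so everything below is an assumption made for illustration: the weight matrix is taken to be the cosine similarity between word vectors (negative values clipped to zero), the relevance score is taken to be the weight-scaled sum of products of word probabilities in two documents, and the toy vectors and distributions stand in for real Word2vec and LDA output. All function and variable names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def weight_matrix(vectors):
    """Pairwise word weights from embedding similarity (assumed form).

    Negative similarities are clipped to 0 so unrelated words add nothing.
    """
    return {
        (a, b): max(0.0, cosine(vectors[a], vectors[b]))
        for a in vectors
        for b in vectors
    }

def relevance(query_dist, doc_dist, weights):
    """Assumed score: sum over word pairs of p(a) * p(b) * weight(a, b)."""
    return sum(
        p * q * weights[(a, b)]
        for a, p in query_dist.items()
        for b, q in doc_dist.items()
    )

def recommend(query_dist, candidates, weights):
    """Return (document name, score) pairs in descending order of score."""
    scored = [
        (name, relevance(query_dist, dist, weights))
        for name, dist in candidates.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy stand-ins for Word2vec vectors (hypothetical values).
vectors = {
    "economy": [1.0, 0.0],
    "market": [0.9, 0.1],
    "soccer": [0.0, 1.0],
}

# Toy stand-ins for per-document word distributions derived from LDA output.
query_dist = {"economy": 1.0}
candidates = {
    "doc_sports": {"soccer": 1.0},
    "doc_finance": {"market": 1.0},
}

weights = weight_matrix(vectors)
ranking = recommend(query_dist, candidates, weights)
print(ranking)
```

In this toy setup the finance document shares no word with the query, yet it ranks above the sports document because "market" lies close to "economy" in the embedding space. That is the effect the weight matrix is meant to capture, although the paper's actual weighting may differ.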
Keywords
Big Data; Data Mining; LDA; Word2vec; Topic Modeling; Information Retrieval; Document Recommendation
Citations & Related Records
Times Cited By KSCI: 1