http://dx.doi.org/10.17662/ksdim.2020.16.3.025

A Method of Calculating Topic Keywords for Topic Labeling  

Kim, Eunhoe (Department of Software Engineering, Seoil University)
Suh, Yuhwa (Baird College of General Education, Soongsil University)
Publication Information
Journal of Korea Society of Digital Industry and Information Management / v.16, no.3, 2020, pp. 25-36
Abstract
Topics produced by LDA topic modeling do not come with labels and must be labeled separately. A topic is usually labeled by inspecting the words that represent it, so the first requirement is a good set of representative words. This paper proposes a method for computing such a set using TextRank, an algorithm that extracts keywords from a document. The proposed method first uses Relevance to select words that are both related to the topic and discriminative for it. It then extracts topic keywords with the TextRank algorithm and links keywords that frequently co-occur, so that the resulting set expresses the topic with higher coverage.
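The three steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The relevance formula is the one defined by Sievert and Shirley for LDAvis, and the ranking step is a weighted PageRank in the style of TextRank. Everything else is an assumption made for illustration: the function names, the toy probabilities and documents, the value 0.6 for the relevance weight, the window size, and the `min_count` rule for linking co-occurring keywords.

```python
import math
from collections import defaultdict


def relevance(phi_kw, p_w, lam=0.6):
    """Relevance of a word to a topic (Sievert and Shirley, 2014).

    phi_kw: probability of the word within the topic.
    p_w:    marginal probability of the word in the corpus.
    lam:    weight in [0, 1]; 0.6 is an assumed default, not the paper's setting.
    """
    return lam * math.log(phi_kw) + (1 - lam) * math.log(phi_kw / p_w)


def cooccurrence_graph(docs, candidates, window=1):
    """Count candidate pairs that appear at most `window` tokens apart."""
    graph = {w: defaultdict(float) for w in candidates}
    for tokens in docs:
        for i, w in enumerate(tokens):
            if w not in candidates:
                continue
            for v in tokens[i + 1 : i + 1 + window]:
                if v in candidates and v != w:
                    graph[w][v] += 1.0
                    graph[v][w] += 1.0
    return graph


def textrank(graph, damping=0.85, iterations=50):
    """Weighted PageRank over the undirected co-occurrence graph."""
    scores = {w: 1.0 for w in graph}
    for _ in range(iterations):
        updated = {}
        for w in graph:
            rank = sum(
                weight / sum(graph[v].values()) * scores[v]
                for v, weight in graph[w].items()
            )
            updated[w] = (1 - damping) + damping * rank
        scores = updated
    return scores


def link_keywords(graph, min_count=2):
    """Pair keywords whose co-occurrence count reaches `min_count` (hypothetical rule)."""
    return sorted(
        (w, v) for w in graph for v, c in graph[w].items() if w < v and c >= min_count
    )


# Toy inputs, invented for illustration only.
phi = {"model": 0.30, "topic": 0.25, "data": 0.25, "word": 0.20}  # P(word | topic)
p = {"model": 0.10, "topic": 0.05, "data": 0.40, "word": 0.05}    # P(word) in corpus
docs = [
    ["topic", "model", "word"],
    ["data", "topic", "model"],
    ["model", "word", "data"],
]

# Step 1: keep the three most relevant words as candidates.
ranked = sorted(phi, key=lambda w: relevance(phi[w], p[w]), reverse=True)
candidates = set(ranked[:3])

# Step 2: score the candidates with TextRank.
graph = cooccurrence_graph(docs, candidates)
scores = textrank(graph)

# Step 3: link keywords that co-occur often.
pairs = link_keywords(graph)

print(sorted(scores, key=scores.get, reverse=True), pairs)
```

In this toy example "data" has a high within-topic probability but is common across the whole corpus, so relevance ranks it last and it is not selected as a candidate. That is the discriminative effect the abstract attributes to Relevance.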
Keywords
LDA; Topic Model; Topic Label; Relevance; TextRank
Citations & Related Records
Times Cited By KSCI: 2
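1 Park Jong-soon and Kim Chang-sik, "Analysis of Big Data Research Trends: Focusing on Topic Modeling," Journal of Korea Society of Digital Industry and Information Management, Vol. 15, No. 1, 2019, pp.1-7.
2 Kim Chang-sik, Kim Nam-gyu, and Kwak Ki-young, "Analysis of Machine Learning and Deep Learning Research Trends: Focusing on Topic Modeling," Journal of Korea Society of Digital Industry and Information Management, Vol. 15, No. 2, 2019, pp.19-28.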
3 David M. Blei, Andrew Y. Ng, and Michael I. Jordan, "Latent Dirichlet Allocation," Journal of Machine Learning Research, Vol. 3, Mar. 2003, pp. 993-1022.
4 Q. Mei, X. Shen, and C.X. Zhai, "Automatic labeling of multinomial topic models," In Proceedings of the 13th ACM SIGKDD international conference on Knowledge discovery and data mining, ACM, 2007, pp.490-499.
5 Jey Han Lau, David Newman, Sarvnaz Karimi, and Timothy Baldwin, "Best topic word selection for topic labelling," In Proceedings of the 23rd International Conference on Computational Linguistics: Posters (COLING'10), Association for Computational Linguistics, 2010, pp.605-613.
6 Jey Han Lau, Karl Grieser, David Newman, and Timothy Baldwin, "Automatic labelling of topic models," In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1, Association for Computational Linguistics, 2011, pp.1536-1545.
7 Ioana Hulpus, Conor Hayes, Marcel Karnstedt, and Derek Greene, "Unsupervised graph-based topic labelling using DBpedia," In Proceedings of the sixth ACM international conference on Web search and data mining (WSDM '13), Association for Computing Machinery, 2013, pp.465-474.
8 S. Bhatia, J. H. Lau, and T. Baldwin, "Automatic labelling of topics with neural embeddings," In 26th COLING International Conference on Computational Linguistics, 2016, pp.953-963.
9 Rada Mihalcea and Paul Tarau, "TextRank: Bringing Order into Text," In Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Jul. 2004, pp.404-411.
10 Carson Sievert and Kenneth E. Shirley, "LDAvis: A method for visualizing and interpreting topics," Proceedings of the Workshop on Interactive Language Learning, Visualization, and Interfaces, 2014, pp.63-70.
11 Mallet, http://mallet.cs.umass.edu/
12 Gensim, https://radimrehurek.com/gensim/
13 Komoran, https://www.shineware.co.kr/products/komoran/