http://dx.doi.org/10.16981/kliss.48.201712.235

Comparison of Topic Modeling Methods for Analyzing Research Trends of Archives Management in Korea: focused on LDA and HDP  

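Park, JunHyeong (Department of Records Management, Graduate School, Chonbuk National University)
Oh, Hyo-Jung (Department of Records Management and Institute of Culture Convergence Archiving, Chonbuk National University)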
Publication Information
Journal of Korean Library and Information Science Society, v.48, no.4, 2017, pp. 235-258
Abstract
The purpose of this study is to analyze research trends in archives management in Korea by comparing LDA (Latent Dirichlet Allocation) topic modeling, the best-known method in text mining, with HDP (Hierarchical Dirichlet Process) topic modeling, which was developed from LDA. First, we collected 1,027 articles on archives management published from 1997 to 2016 in two archives management journals and four library and information science journals in Korea, and performed several preprocessing steps. We then ran LDA and HDP topic modeling. For a more in-depth comparative analysis, we used LDAvis as a topic modeling visualization tool. The results show that LDA topic modeling was influenced by frequent keywords that appeared across all topics, whereas HDP topic modeling produced specific keywords that made the characteristics of each topic easy to identify.
Keywords
Archives Management; Research Trends; Topic Modeling; LDA; HDP
Citations & Related Records
Times Cited By KSCI: 13