An Investigation on the Periodical Transition of News related to North Korea using Text Mining

  • Received: 2019.06.16
  • Reviewed: 2019.09.19
  • Published: 2019.09.30

Abstract

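Research on changes and trends in North Korea is very important because it helps set the direction of North Korea policy and makes it possible to anticipate North Korea's actions and respond in advance. Until now, such research has been led by experts who analyze past cases descriptively and use them to assess and respond to future developments. Adding text mining analysis of unstructured data to this expert-driven, descriptive research would allow a more scientific analysis of North Korean trends. In particular, research on North Korean trends and on North Korea's actions toward the South is both useful and necessary in the fields of unification and national defense. In this study, we built a set of core words related to North Korea by applying text mining to the content of North Korea-related newspaper articles. We then divided the time since Kim Jong-un took power into three periods, based on the recent dramatic changes in inter-Korean relations, and analyzed as a time series the words most closely related to North Korea that appeared in South Korean media during those periods. We classified the main North Korea-related words by period and identified how the relevance of those words and topics changed with North Korea's attitude and behavior at the time. The aim of the study was to assess how useful text mining is as a methodology for understanding and analyzing inter-Korean relations and North Korean trends. Future work will extend this research on North Korean trends toward building a North Korea risk measurement model that can inform the direction of North Korea relations and policy and anticipate North Korea's actions so that they can be responded to in advance.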

This paper investigates changes in North Korea's domestic and foreign policies through automated text analysis of how North Korea is represented in South Korean mass media. Based on that data, we then analyze the status of text mining research, using a text mining technique to identify its topics, methods, and trends. We also examine the characteristics and analysis methods of the text mining techniques, as confirmed by analysis of the data. The text mining was carried out in R, a free software environment for statistical computing and graphics. Text mining methods make it possible to highlight the most frequently used keywords in a body of text, which can then be drawn as a word cloud (also called a text cloud or tag cloud). This study proposes a procedure for finding meaningful tendencies by combining word clouds with co-occurrence networks. It aims to explore more objectively the images of North Korea presented in South Korean newspapers by quantitatively reviewing patterns of language use related to North Korea in newspaper big data from November 1, 2016 to May 23, 2019. We divided this span into three periods that reflect recent inter-Korean relations. The period before January 1, 2018 is the Before Phase of Peace Building. The period from January 1, 2018 to February 24, 2019 is the Peace Building Phase, when Kim Jong-un's New Year's message and the PyeongChang Olympics created an atmosphere of peace on the Korean peninsula. The third period, after the Hanoi peace summit, was marked by silence in relations between North Korea and the United States and is therefore called the Depression Phase of Peace Building. The news articles related to North Korea were drawn from the Korea Press Foundation database (www.bigkinds.or.kr) and analyzed through text mining to investigate the characteristics of the Kim Jong-un regime's South Korea policy and unification discourse.
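The period split and keyword counting described above can be sketched as follows. This is a minimal illustration in Python rather than the R workflow the study used. The phase boundaries come from the abstract, while the function names, the short phase labels, and the sample articles are hypothetical and only serve the example.

```python
from collections import Counter
from datetime import date

# Phase boundaries as stated in the abstract.
PEACE_START = date(2018, 1, 1)
PEACE_END = date(2019, 2, 24)


def phase_of(day: date) -> str:
    """Map an article date to one of the three phases (labels are shorthand)."""
    if day < PEACE_START:
        return "before"
    if day <= PEACE_END:
        return "peace_building"
    return "depression"


def keyword_counts_by_phase(articles):
    """Count keyword frequencies per phase.

    `articles` is an iterable of (date, list_of_keywords) pairs.
    """
    counts = {
        "before": Counter(),
        "peace_building": Counter(),
        "depression": Counter(),
    }
    for day, keywords in articles:
        counts[phase_of(day)].update(keywords)
    return counts


# Made-up toy data, not taken from the study's corpus.
sample_articles = [
    (date(2017, 9, 3), ["missile", "nuclear", "sanctions"]),
    (date(2018, 4, 27), ["Panmunjom", "peace", "unification"]),
    (date(2018, 6, 12), ["peace", "denuclearization"]),
    (date(2019, 3, 15), ["nuclear", "missile"]),
]

phase_counts = keyword_counts_by_phase(sample_articles)
print(phase_counts["peace_building"].most_common(1))  # [('peace', 2)]
```

Per-phase counts of this kind are the usual input to a word cloud, where each word is sized by its frequency.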
The main results show that trends in the North Korean national policy agenda can be discovered with clustering and visualization algorithms. In particular, co-occurrence word analysis is used to examine changes in international circumstances, domestic conflicts, living conditions in North Korea, the South's aid projects for the North, conflicts between the two Koreas, the North Korean nuclear issue, and the North Korean refugee problem. The study also analyzes South Korean attitudes toward North Korea in terms of semantic prosody. In the Before Phase of Peace Building, the analysis ranked the words in the order 'Missiles', 'North Korea Nuclear', 'Diplomacy', 'Unification', and 'South-North Korean'. In the Peace Building Phase, the order was 'Panmunjom', 'Unification', 'North Korea Nuclear', 'Diplomacy', and 'Military'. In the Depression Phase of Peace Building, the order was 'North Korea Nuclear', 'North and South Korea', 'Missile', 'State Department', and 'International'. Sixteen words appear in all three periods; in order, they include 'missile', 'North Korea Nuclear', 'Diplomacy', 'Unification', 'North and South Korea', 'Military', 'Kaesong Industrial Complex', 'Defense', 'Sanctions', 'Denuclearization', 'Peace', 'Exchange and Cooperation', and 'South Korea'. We expect these results to contribute to analyzing trends in news content about North Korea associated with North Korea's provocations. Future research on North Korean trends will build on these results. We will continue to develop a North Korea risk measurement model that can anticipate North Korea's behavior and support a response in advance. We also expect text mining and scientific data analysis techniques to be applied in the field of North Korea and unification research.
We hope that such academic work will lead to many studies that make important contributions to the nation.
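The co-occurrence analysis and the comparison of keywords across periods mentioned in the abstract can be illustrated with a small Python sketch. It is not the authors' R code: the function names and the toy documents are hypothetical, and the per-phase lists hold only the five words the abstract reports for each phase, so the overlap found here is far narrower than the sixteen shared words the study reports from its full keyword lists.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count how often each unordered keyword pair shares a document."""
    pairs = Counter()
    for keywords in documents:
        # Deduplicate and sort so each pair has one canonical ordering.
        unique = sorted(set(keywords))
        pairs.update(combinations(unique, 2))
    return pairs


def common_keywords(words_by_phase):
    """Return the keywords that occur in every phase's list."""
    keyword_sets = [set(words) for words in words_by_phase.values()]
    return set.intersection(*keyword_sets)


# Made-up toy documents, each reduced to its extracted keywords.
toy_documents = [
    ["missile", "nuclear", "sanctions"],
    ["nuclear", "diplomacy"],
    ["missile", "nuclear"],
]
pairs = cooccurrence_counts(toy_documents)
print(pairs[("missile", "nuclear")])  # 2

# The five words reported per phase in the abstract, spelled as printed there.
top_by_phase = {
    "before": ["Missiles", "North Korea Nuclear", "Diplomacy",
               "Unification", "South-North Korean"],
    "peace_building": ["Panmunjom", "Unification", "North Korea Nuclear",
                       "Diplomacy", "Military"],
    "depression": ["North Korea Nuclear", "North and South Korea", "Missile",
                   "State Department", "International"],
}
print(common_keywords(top_by_phase))  # {'North Korea Nuclear'}
```

The pair counts can serve as edge weights in a co-occurrence network. Exact string matching treats 'Missiles' and 'Missile' as different words, so a real pipeline would normalize terms before comparing periods.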
