Social Media Bigdata Analysis Based on Information Security Keyword Using Text Mining

  • Received : 2022.06.27
  • Accepted : 2022.10.12
  • Published : 2022.10.30

Abstract

As digital technology has developed, social issues are increasingly discussed on digital platforms such as social networking services (SNS), where public opinion also takes shape. To examine public opinion on information security issues shared through social media, this study analyzed big data from Twitter, a world-renowned short-form social network service. Tweets were collected for 14 information security keywords over the full year of 2021. Using data mining techniques, the study then performed term frequency (TF) analysis and a correlation analysis based on the Pearson correlation coefficient to identify the relationships among the keyword documents. In addition, topic modeling with latent Dirichlet allocation (LDA), a probabilistic technique, yielded the 6 main topics that drew the most attention in the information security field in 2021. These results are expected to serve as basic data for identifying key agenda items when related industries plan their next strategies or when the government establishes policy.
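The term frequency and Pearson correlation steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the tweets are made up, the whitespace tokenizer is a simplifying assumption, and the function names (`term_frequencies`, `pearson`, `keyword_correlation`) are hypothetical. Each "keyword document" is assumed to be the set of tweets collected for one keyword.

```python
from collections import Counter
from math import sqrt


def term_frequencies(tweets):
    """Count lowercase, whitespace-separated tokens across a list of tweets."""
    counts = Counter()
    for tweet in tweets:
        counts.update(tweet.lower().split())
    return counts


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / sqrt(var_x * var_y)


def keyword_correlation(tf_a, tf_b):
    """Correlate two keyword documents over their shared vocabulary."""
    vocab = sorted(set(tf_a) | set(tf_b))
    # Counter returns 0 for terms that a document does not contain.
    return pearson([tf_a[t] for t in vocab], [tf_b[t] for t in vocab])


# Toy, made-up tweets grouped by the keyword used to collect them (assumption).
docs = {
    "privacy": ["privacy data leak", "data privacy law"],
    "hacking": ["hacking data leak", "ransomware hacking attack"],
}
tf = {keyword: term_frequencies(tweets) for keyword, tweets in docs.items()}
r = keyword_correlation(tf["privacy"], tf["hacking"])
print(tf["privacy"].most_common(2), round(r, 3))
```

A real analysis would replace the whitespace tokenizer with a morphological analyzer suited to the tweet language and would remove stop words before counting.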
