http://dx.doi.org/10.15207/JKCS.2021.12.4.031

A Convergence Study on the Topic and Sentiment of COVID19 Research in Korea Using Text Analysis  

Heo, Seong-Min (Dept. of Applied Mathematics, Kumoh National Institute of Technology)
Yang, Ji-Yeon (Dept. of Applied Mathematics, Kumoh National Institute of Technology)
Publication Information
Journal of the Korea Convergence Society / v.12, no.4, 2021, pp. 31-42
Abstract
The purpose of this study was to explore research topics and examine trends in COVID-19-related research papers. We identified eight topics using latent Dirichlet allocation and found acceptable validity in comparison with the structural topic model. Subtopics were extracted using k-means clustering and plotted in PCA space. Additionally, sentiment analysis revealed the topics bearing negative tones and warning signs. The results flagged issues in three topics: Biomedical Related, International Dynamics, and Psychological Impact. The findings could serve as a guideline for researchers exploring new research directions and for policymakers deciding which research projects to support.
Keywords
COVID-19; Convergence study; Text mining; Topic modeling; K-means clustering algorithm; Sentiment analysis;
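
As a rough illustration of the sentiment step mentioned in the abstract, the sketch below scores documents against a small polarity lexicon and flags topics whose mean score is negative. It is not the authors' procedure: the lexicon entries, the example texts, the "Education" topic label, the zero threshold, and the function names are all assumptions made for illustration. Only the "Psychological Impact" label comes from the abstract.

```python
from collections import defaultdict

# Toy polarity lexicon (assumption): +1 for positive words, -1 for negative words.
LEXICON = {
    "effective": 1, "recovery": 1, "support": 1,
    "risk": -1, "anxiety": -1, "death": -1, "conflict": -1,
}


def sentiment_score(text):
    """Mean polarity of the lexicon words found in text; 0.0 if none match."""
    words = text.lower().split()
    hits = [LEXICON[w] for w in words if w in LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


def mean_score_by_topic(docs):
    """docs is an iterable of (topic_label, text); returns {topic: mean score}."""
    scores = defaultdict(list)
    for topic, text in docs:
        scores[topic].append(sentiment_score(text))
    return {topic: sum(vals) / len(vals) for topic, vals in scores.items()}


def negative_topics(docs, threshold=0.0):
    """Topics whose mean sentiment score falls below the threshold (assumed 0.0)."""
    means = mean_score_by_topic(docs)
    return sorted(topic for topic, score in means.items() if score < threshold)


# Hypothetical mini-corpus; in practice each text would already carry
# the topic assigned to it by the topic model.
docs = [
    ("Psychological Impact", "anxiety and death risk during quarantine"),
    ("Psychological Impact", "support programs reduce anxiety"),
    ("Education", "effective online courses support recovery"),
]

print(negative_topics(docs))
```

In a real analysis the toy lexicon would be replaced by an established word list, and the topic labels would come from the fitted topic model rather than being assigned by hand.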