http://dx.doi.org/10.14699/kbiblia.2021.32.3.247

Examining Suicide Tendency Social Media Texts by Deep Learning and Topic Modeling Techniques  

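Ko, Young Soo (Department of Library and Information Science, Yonsei University)
Lee, Ju Hee (Department of Library and Information Science, Yonsei University)
Song, Min (Department of Library and Information Science, Yonsei University)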
Publication Information
Journal of the Korean BIBLIA Society for Library and Information Science / v.32, no.3, 2021, pp. 247-264
Abstract
This study builds a deep learning-based classification model that identifies suicide tendency, using a suicide corpus constructed for the study. To analyze suicide factors, it also divides the suicide-tendency corpus into detailed topics with topic modeling, an analysis technique that automatically extracts topics. For this purpose, 2,011 suicide-related documents collected from the social media service Naver Knowledge iN were manually annotated as suicide-tendency or non-suicide-tendency documents, based on the suicide prevention education manual issued by the Central Suicide Prevention Center. The annotated corpus was then used to evaluate the classification performance of deep learning models (LSTM, BERT, ELECTRA). In addition, LDA, a topic modeling technique, was used to identify suicide factors by grouping the documents by theme, and co-word analysis and visualization were conducted to analyze the factors in depth.
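The co-word analysis mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `co_word_counts`, the toy token lists, and the choice to count each word pair once per document are all assumptions made for illustration, and the study's actual tokenization and weighting are not described here.

```python
from collections import Counter
from itertools import combinations


def co_word_counts(docs):
    """Count, for each unordered word pair, the number of documents containing both words."""
    pair_counts = Counter()
    for tokens in docs:
        # Sort the distinct tokens so every pair has one canonical order
        # and is counted at most once per document.
        vocab = sorted(set(tokens))
        pair_counts.update(combinations(vocab, 2))
    return pair_counts


# Hypothetical, already-tokenized toy documents (not from the study's corpus).
toy_docs = [
    ["school", "stress", "family"],
    ["family", "stress", "debt"],
    ["school", "exam", "stress"],
]

counts = co_word_counts(toy_docs)
top_pairs = counts.most_common(3)  # the most frequently co-occurring pairs
```

Pairs with high counts would become the heavier edges of a co-word network, which is the kind of structure a visualization step could then draw.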
Keywords
Suicide; Social Media; Word Co-Occurrence; Deep Learning; Topic Modeling