http://dx.doi.org/10.3743/KOSIM.2021.38.2.153

Building and Analyzing Panic Disorder Social Media Corpus for Automatic Deep Learning Classification Model  

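Lee, Soobin (Department of Library and Information Science, Yonsei University)
Kim, Seongdeok (Department of Library and Information Science, Yonsei University)
Lee, Juhee (Department of Library and Information Science, Yonsei University)
Ko, Youngsoo (Department of Library and Information Science, Yonsei University)
Song, Min (Department of Library and Information Science, Yonsei University)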
Publication Information
Journal of the Korean Society for Information Management / v.38, no.2, 2021, pp. 153-172
Abstract
This study builds a deep learning based classification model that examines the characteristics of panic disorder and classifies documents showing a tendency toward panic disorder, using a panic disorder corpus constructed for the study. To this end, 5,884 documents collected from social media were manually annotated according to a diagnostic manual for mental disorders and labeled as either panic-disorder-prone or non-panic-disorder. TF-IDF scores were then calculated and a word co-occurrence analysis was performed to analyze the lexical characteristics of the corpus. In addition, symptom frequencies were measured and the co-occurrence of the annotated symptoms was calculated to analyze the characteristics of panic disorder symptoms and the relationships among them. The performance of the deep learning based classification model was also evaluated. Three pre-trained models, BERT multilingual, KoBERT, and KcBERT, were adopted for the classification model, and KcBERT showed the best performance among them. This study shows that examining the characteristics of panic disorder in social media text can help the early diagnosis and treatment of people suffering from related symptoms, and it extends the field of mental illness research to social media.
Keywords
panic disorder; social media; TF-IDF; word co-occurrence; deep learning
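
The abstract names TF-IDF scoring and word co-occurrence analysis but does not give the formulas or tooling used. The following is a minimal sketch of one common variant of each, assuming already tokenized documents, relative term frequency, an unsmoothed logarithmic IDF, and document-level co-occurrence counts. The function names `tf_idf` and `cooccurrence` and the toy English tokens are illustrative assumptions, not the authors' implementation or data.

```python
import math
from collections import Counter
from itertools import combinations


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Assumed weighting: (count / document length) * ln(N / document frequency).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        })
    return scores


def cooccurrence(docs):
    """Count the documents in which each unordered term pair appears together."""
    pairs = Counter()
    for doc in docs:
        # Sorting gives every pair a single canonical (a, b) ordering.
        pairs.update(combinations(sorted(set(doc)), 2))
    return pairs


# Toy, made-up tokenized posts (hypothetical, not from the study's corpus).
docs = [
    ["heart", "racing", "breath", "fear"],
    ["breath", "short", "fear", "subway"],
    ["coffee", "morning", "subway"],
]
scores = tf_idf(docs)
pairs = cooccurrence(docs)
```

Under these assumptions, a term that occurs in only one document (such as "heart") scores higher than one shared across documents (such as "fear"), and `pairs[("breath", "fear")]` counts the two posts that mention both words.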