http://dx.doi.org/10.3745/KTSDE.2020.9.2.45

Topic Analysis of the National Petition Site and Prediction of Answerable Petitions Based on Deep Learning  

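Woo, Yun Hui (Department of Information Statistics, Dongduk Women's University)
Kim, Hyon Hee (Department of Information Statistics, Dongduk Women's University)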
Publication Information
KIPS Transactions on Software and Data Engineering / v.9, no.2, 2020, pp. 45-52
Abstract
The national petition site has attracted much attention since it opened. In this paper, we perform a topic analysis of the site and propose a deep learning model for predicting answerable petitions. First, we collect 1,500 petitions and extract topics from their contents. Main subjects are defined using the K-means clustering algorithm, and detailed subjects are defined by applying topic modeling to the petitions belonging to each main subject. Long short-term memory (LSTM) is then used to predict answerable petitions. In addition to the title and contents, the proposed model uses the category, the text length, and the ratios of parts of speech such as nouns, adjectives, adverbs, and verbs. Our experimental results show that the type 2 model, which uses these additional features, outperforms the type 1 model, which does not.
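The additional features named in the abstract (text length and part-of-speech ratios) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the petition text has already been tagged by a morphological analyzer into (token, tag) pairs. The tag names `Noun`, `Adjective`, `Adverb`, and `Verb`, the function name `auxiliary_features`, the choice of character count as the length, and the use of the total token count as the ratio denominator are all assumptions made for illustration.

```python
from collections import Counter

# Assumed tag names; a real tagger may use a different tag set.
POS_TAGS = ("Noun", "Adjective", "Adverb", "Verb")


def auxiliary_features(text, tagged_tokens):
    """Return the text length and the share of each POS tag among all tokens.

    `tagged_tokens` is a list of (token, tag) pairs produced elsewhere.
    """
    counts = Counter(tag for _, tag in tagged_tokens)
    total = len(tagged_tokens)
    features = {"length": len(text)}  # length in characters (an assumption)
    for tag in POS_TAGS:
        # Guard against empty input to avoid division by zero.
        features[f"{tag.lower()}_ratio"] = counts[tag] / total if total else 0.0
    return features


# Hand-tagged toy example (hypothetical data, not taken from the petition site).
text = "the new school opens"
tagged = [
    ("the", "Determiner"),
    ("new", "Adjective"),
    ("school", "Noun"),
    ("opens", "Verb"),
]
features = auxiliary_features(text, tagged)
```

In a type 2 style model, a feature vector like this would be combined with the text representation before the final prediction layer; how the paper combines them is not stated in the abstract.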
Keywords
National Petition; Topic Analysis; Topic Modeling; K-means Clustering; LSTM; Deep Learning;