A Study on Analysis of National Petition Data for Deriving Current Issues in Education

교육관련 이슈 도출을 위한 국민청원 데이터 분석 연구

  • Min, Jeongwon (Department of Chinese and Japanese Language and Literature, Graduate School, Korea University) ;
  • Shim, Jaekwoun (Korea University Center for Gifted Education)
  • 민정원 (고려대학교 대학원 중일어문학과) ;
  • 심재권 (고려대학교 영재교육원)
  • Received : 2020.08.22
  • Accepted : 2020.08.30
  • Published : 2020.08.31


As the information society advances, opinions multiply and grow more complex, which makes it harder to identify important issues and respond to them properly. The education field therefore needs to track newly emerging problems in addition to existing discourses and issues. This study examines current issues in education by analyzing the petitions posted under the 'parenting and education' category of the National Petition board. To obtain objective and detailed results, we applied topic modeling based on the LDA algorithm, an effective method for extracting topics from a collection of documents. The analysis yielded nine topics, and the relationships among them were visualized. The value of this study lies in the fact that the derived topics represent important issues that reflect public opinion.

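As the information society becomes more advanced, the diversity and complexity of opinions increase, and it becomes harder to derive important issues from them, to grasp problem situations accurately, and to respond. The education field therefore needs to identify and respond to newly emerging issues in step with a changing society, beyond existing discourses and points of contention. This study sought to derive the main issues in education by analyzing posts in the parenting and education category of the National Petition board. Analysis using topic modeling, one of the text mining methods, showed that the main current issues in education can be divided into nine topics: education-related law, university admissions, education-related crime, educational environment, early childhood and elementary education, treatment of teachers, education policy, higher education, and secondary education. The relationships among these topics were visualized and presented. The significance of this study is that it collected public opinion, classified it by topic, and derived the important issues.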
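
The abstract names LDA-based topic modeling as the analysis method but gives no implementation details. The following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, written with the Python standard library only. It is not the authors' pipeline: the function names (`lda_gibbs`, `top_words`), the hyperparameters `alpha` and `beta`, the iteration count, and the toy English token lists are all assumptions made for illustration, and the petitions are assumed to have been tokenized already.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative sketch).

    docs: list of token lists (tokenization is assumed to be done already).
    Returns (doc_topic, topic_word): per-document topic counts and
    per-topic word counts after the final sweep.
    """
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # n(d, k)
    topic_word = [Counter() for _ in range(num_topics)]   # n(k, w)
    topic_total = [0] * num_topics                        # n(k)
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Resample the topic and restore the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    return doc_topic, topic_word


def top_words(topic_word, topic, n=3):
    """Return up to n most frequent words currently assigned to a topic."""
    return [w for w, c in topic_word[topic].most_common(n) if c > 0]


# Hypothetical toy corpus standing in for tokenized petitions.
docs = [
    ["university", "admission", "exam", "admission"],
    ["exam", "university", "admission", "fairness"],
    ["kindergarten", "childcare", "teacher", "safety"],
    ["childcare", "kindergarten", "safety", "teacher"],
]
doc_topic, topic_word = lda_gibbs(docs, num_topics=2)
for k in range(2):
    print(k, top_words(topic_word, k))
```

The study reports nine topics; the sketch uses two only because the toy corpus is tiny. On real petition data, a maintained topic-modeling library and a systematic way of choosing the number of topics would be the more practical choice.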


