http://dx.doi.org/10.15207/JKCS.2022.13.01.081

Verification of educational goal of reading area in Korean SAT through natural language processing techniques  

Lee, Soomin (Department of Computer Science and Engineering, Korea University)
Kim, Gyeongmin (Department of Computer Science and Engineering, Korea University)
Lim, Heuiseok (Department of Computer Science and Engineering, Korea University)
Publication Information
Journal of the Korea Convergence Society / v.13, no.1, 2022, pp. 81-88
Abstract
The major educational goal of the reading section, which occupies an important portion of the Korean language part of the Korean SAT, is to evaluate whether a given text has been fully understood. The questions in the exam must therefore be solvable from the given text alone. In this paper, we developed a dataset based on the reading section of the Korean SAT to evaluate whether a deep learning language model can classify a statement about a passage as true or false, a binary classification task in NLP. As a result, applying language models solely to the passages in the dataset, most models surpassed the human-performance F1 score of 59.2%, with KoELECTRA reaching 62.49% in our experiments. We also showed that a structural limitation of the language models can be eased by adjusting the data preprocessing.
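For illustration only, the passage-statement classification the abstract describes can be approximated as a sentence-pair binary classifier built on a pretrained Korean encoder. The sketch below assumes the public monologg/koelectra-base-v3-discriminator checkpoint (reference 8) and the Hugging Face transformers API; the checkpoint name, the label mapping, and truncation to a 512-token limit (one structural limitation that passage preprocessing must work around) are illustrative assumptions, not the authors' exact configuration.

# Minimal sketch: score whether a statement is true of a passage with KoELECTRA.
# Assumptions (not from the paper): checkpoint name, 0=false/1=true label mapping,
# and truncation of long passages to the encoder's 512-token limit.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "monologg/koelectra-base-v3-discriminator"  # public KoELECTRA checkpoint (assumed)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)
model.eval()

passage = "..."    # a reading passage from the Korean SAT dataset
statement = "..."  # an answer-option statement to judge against the passage

# Encode passage and statement as one sentence pair; passages longer than the
# encoder's maximum length are truncated, which is why preprocessing matters.
inputs = tokenizer(passage, statement, truncation=True, max_length=512, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape (1, 2): scores for [false, true]

pred = logits.argmax(dim=-1).item()
print("true" if pred == 1 else "false")

In practice the classification head would first be fine-tuned on the labeled passage-statement pairs, since a head created by from_pretrained with num_labels=2 is randomly initialized.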
Keywords
Korean SAT; Deep learning; Binary classification task; Language models
References
1 M. H. Roh. (2011). Reading: the concept and important issues of education. KEDI Research Paper, 43(3), 1-43.
2 Ministry of Education. (2015, January). Korean Language Curriculum. 2015 Revised Curriculum, 5, 1-178.
3 A. Vaswani et al. (2017). Attention is all you need. Advances in Neural Information Processing Systems, 30, 6000-6010.
4 D. Powers, D. Escoffery & M. Duchnowski. (2015). Validating Automated Essay Scoring: A (Modest) Refinement of the "Gold Standard". Applied Measurement in Education, 28(2), 130-142.
5 SKT Brain. (2019). KoBERT. GitHub repository. https://github.com/SKTBrain/KoBERT
6 J. B. Lee. (2021). KcELECTRA: Korean comments ELECTRA. GitHub repository. https://github.com/Beomi/KcELECTRA
7 Y. Y. Yang, S. W. Kang & J. Y. Seo. (2019). Improved Machine Reading Comprehension Using Data Validation for Weakly Labeled Data. IEEE Access, 8, 5667-5677. DOI: 10.1109/ACCESS.2019.2963569
8 J. W. Park. (2020). KoELECTRA: Pretrained ELECTRA Model for Korean. GitHub repository. https://github.com/monologg/KoELECTRA
9 S. H. Kim. (2014). Analysis of Students' Recognition of National Scholastic Aptitude Test for University Admission -With Focus on the 'Korean Language Section'-, Journal of CheongRam Korean Language Education, 49, 135-164.
10 S. Y. Ryu. (2019). Critical Examination About CSAT Korean Language and Its Developmental Directions-Toward the Recovery of the Nature of the CSAT Evaluation-. New Language Education, 121, 353-380.
11 C. Fellbaum. (2005). WordNet and wordnets. Oxford: Elsevier.
12 J. Wei & K. Zou. (2019). EDA: Easy Data Augmentation Techniques for Boosting Performance on Text Classification Tasks. Association for Computational Linguistics, 1, 6382-6388. DOI: 10.18653/v1/D19-1670
13 Y. R. Lee et al. (2009, July). Analysis of SAT and ACT. Seoul: KICE.
14 J. H. Moon, H. C. Cho & E. J. Park. (2020). Revisiting Round-Trip Translation for Quality Estimation. http://arxiv.org/abs/2004.13937
15 R. Sennrich, B. Haddow & A. Birch. (2016). Improving Neural Machine Translation Models with Monolingual Data. Association for Computational Linguistics, 1, 86-96.
16 P. Rajpurkar, J. Zhang, K. Lopyrev & P. Liang. (2016). SQuAD: 100,000+ Questions for Machine Comprehension of Text. EMNLP, 1, 2383-2392. DOI: 10.18653/v1/d16-1264
17 K. Clark, M. Luong, Q. Le & C. Manning. (2020). ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators. arXiv preprint arXiv:2003.10555.