http://dx.doi.org/10.15207/JKCS.2021.12.12.065

Classification and analysis of error types for deep learning-based Korean spelling correction  

Koo, Seonmin (Konkuk University, Computer Science)
Park, Chanjun (Department of Computer Science and Engineering, Korea University)
So, Aram (Human-inspired Computing Research Center, Korea University)
Lim, Heuiseok (Department of Computer Science and Engineering, Korea University)
Publication Information
Journal of the Korea Convergence Society / v.12, no.12, 2021, pp. 65-74
Abstract
Recently, Korean spelling correction has been actively studied using approaches based on machine translation and automatic noise generation. These methods generate noise and use the noised data as both training and test sets. This has a limitation: because the test set is unlikely to contain noise other than the kinds used for training, it is difficult to measure performance accurately. In addition, there is no practical standard for error types, so each study uses different error types, which makes qualitative analysis difficult. This paper proposes a new error type classification for deep learning-based Korean spelling correction research and performs an error analysis of existing commercial Korean spelling correctors (Systems A, B, and C). The analysis found that the three correction systems did not perform well on the error types presented in this paper other than spacing, and hardly recognized errors in word order or tense.
Keywords
Korean spelling correction; Machine translation; Artificial neural network machine translation; Error analysis; Natural language processing;