The Detection and Correction of Context Dependent Errors of The Predicate using Noun Classes of Selectional Restrictions

So, Gil-Ja; Kwon, Hyuk-Chul

doi:10.6109/jkiice.2014.18.1.25

Journal of the Korea Institute of Information and Communication Engineering (한국정보통신학회논문지)

Volume 18, Issue 1 / Pages 25-31 / 2014 / 2234-4772 (pISSN) / 2288-4165 (eISSN)

The Korea Institute of Information and Communication Engineering (한국정보통신학회)

선택 제약 명사의 의미 범주 정보를 이용한 용언의 문맥 의존 오류 검사 및 교정

So, Gil-Ja (Department of Cyber Police and Science, Youngsan University) ;
Kwon, Hyuk-Chul (School of Computer Science & Engineering, Pusan National University)

소길자 ;
권혁철

Received : 2013.09.24
Accepted : 2013.10.30
Published : 2014.01.31

https://doi.org/10.6109/jkiice.2014.18.1.25

Abstract

Korean grammar checkers typically detect context-dependent errors by applying heuristic rules that are formulated by language experts and consist of lexical items. Because the rules are tied to individual words, such grammar checkers show low recall, that is, they detect only a small proportion of the errors present in a document. To address this shortcoming, a new method for generalizing error-decision rules is proposed that uses the existing KorLex thesaurus, the Korean version of Princeton WordNet. The method extracts noun classes from KorLex using the Tree Cut Model and the information-theoretic MDL (minimum description length) principle, and generalizes the error-decision rules with those classes.

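Korean grammar checkers in practical use today handle context-dependent errors, which are grammatical errors that can only be judged by looking at the surrounding context, by using empirically constructed error-decision rules. However, because the error-decision rules of existing grammar checkers are built at the lexical level, the recall of the checkers is low. The rules therefore need to be generalized by using lexical class information in place of individual words. This paper proposes a method that, when the word being checked is a predicate, extracts the semantic classes of its selectional-restriction nouns from KorLex, a lexical-semantic network developed in Korea, using TCM and MDL, and uses the extracted classes to generalize the error-decision rules of the predicate.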
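
The generalization step described in the abstract can be illustrated with a minimal sketch of an MDL-based tree cut, in the spirit of the tree cut model of Li and Abe. Everything concrete here is an assumption made for illustration: the toy hierarchy `TREE` stands in for KorLex, the counts in `COUNTS` are invented co-occurrence frequencies for a single hypothetical predicate, and the description length is a simplified form that charges one parameter per class in the cut. It is not the authors' implementation.

```python
import math

# Hypothetical toy hierarchy (parent -> children); leaves are nouns.
TREE = {
    "ENTITY": ["ANIMAL", "ARTIFACT"],
    "ANIMAL": ["dog", "cat", "horse"],
    "ARTIFACT": ["desk", "chair", "lamp"],
}

# Invented counts of nouns seen as the argument of one predicate.
COUNTS = {"dog": 4, "cat": 3, "horse": 3, "desk": 0, "chair": 0, "lamp": 0}


def leaves(tree, node):
    """Return all leaf nouns below a node (the node itself if it is a leaf)."""
    children = tree.get(node)
    if not children:
        return [node]
    return [leaf for child in children for leaf in leaves(tree, child)]


def description_length(tree, cut, counts, total):
    """Simplified MDL cost of a cut: parameter cost plus data cost in bits."""
    # Assumption: one parameter per class in the cut, each costing log2(total)/2.
    parameter_cost = len(cut) / 2 * math.log2(total)
    data_cost = 0.0
    for cls in cut:
        members = leaves(tree, cls)
        freq = sum(counts.get(m, 0) for m in members)
        if freq == 0:
            continue  # unseen classes contribute no data cost
        # Class probability is shared uniformly among the nouns in the class.
        p_noun = freq / total / len(members)
        data_cost -= freq * math.log2(p_noun)
    return parameter_cost + data_cost


def find_mdl_cut(tree, node, counts, total):
    """Bottom-up search: keep a node if it is cheaper than the best cut below it."""
    children = tree.get(node)
    if not children:
        return [node]
    below = [c for child in children
             for c in find_mdl_cut(tree, child, counts, total)]
    if (description_length(tree, [node], counts, total)
            < description_length(tree, below, counts, total)):
        return [node]
    return below


total = sum(COUNTS.values())
cut = find_mdl_cut(TREE, "ENTITY", COUNTS, total)
print(cut)  # prints ['ANIMAL', 'ARTIFACT']
```

In this toy run the three observed nouns are replaced by the single class ANIMAL, so a rule written over that class would also cover animal nouns that never appeared in the counts, which is how class-level rules could raise recall. The unseen class ARTIFACT stays in the cut with zero frequency and would not license the predicate.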
