http://dx.doi.org/10.3745/KIPSTB.2011.18B.6.405

Generalization of error decision rules in a grammar checker using Korean WordNet, KorLex  

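So, Gil-Ja (Youngsan University, Department of Game and Contents)
Lee, Seung-Hee (Nara Infotech Co., Ltd., Intelligent Systems Research Institute)
Kwon, Hyuk-Chul (Pusan National University, School of Information and Computer Engineering, Interdisciplinary Program in Cognitive Science)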
Abstract
Korean grammar checkers typically detect context-dependent errors using heuristic rules that a language expert formulates by hand. A new rule is appended each time a new error pattern is found, so the resulting rule sets are not consistent. To resolve this shortcoming, we propose a new method for generalizing the error decision rules that detect such errors. For this purpose, we use the existing thesaurus KorLex, the Korean version of Princeton WordNet. KorLex provides hierarchical word senses for nouns, but it does not contain any information about the relationships between cases in a sentence. Using the Tree Cut Model and the MDL (minimum description length) principle from information theory, we extract noun classes from KorLex and generalize error decision rules from these noun classes. To verify the accuracy of the new method experimentally, we extracted from a large corpus the nouns used as objects of four commonly confused predicates, and then extracted noun classes from these nouns. We found that the number of error decision rules generalized from these noun classes decreased to about 64.8%. In conclusion, the precision of our grammar checker exceeds that of conventional ones by 6.2%.
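The noun-class extraction step can be illustrated with a minimal sketch of MDL-based tree cut selection in the style of Li and Abe. It is not the authors' implementation: the toy hierarchy `TREE`, the counts in `COUNTS`, and the function names are all hypothetical, KorLex itself is not accessed, and the parameter cost uses the number of classes in the local cut as a simplification.

```python
import math

# Hypothetical toy hierarchy (parent -> children); leaves are nouns.
TREE = {
    "ENTITY": ["FOOD", "ARTIFACT"],
    "FOOD": ["rice", "bread", "soup"],
    "ARTIFACT": ["car", "desk"],
}

# Hypothetical counts of each noun as the object of one predicate.
COUNTS = {"rice": 4, "bread": 3, "soup": 3, "car": 0, "desk": 0}


def leaves(node):
    """Return all leaf nouns below a node (or the node itself if it is a leaf)."""
    kids = TREE.get(node)
    if not kids:
        return [node]
    return [leaf for kid in kids for leaf in leaves(kid)]


def description_length(cut, total):
    """Parameter cost plus data cost (in bits) of modelling the counts with a cut."""
    # Simplified parameter cost: one parameter per class in the cut.
    l_par = len(cut) / 2 * math.log2(total)
    l_dat = 0.0
    for cls in cut:
        members = leaves(cls)
        freq = sum(COUNTS.get(m, 0) for m in members)
        if freq == 0:
            continue  # unseen classes contribute no data cost
        # Class probability is shared uniformly among its member nouns.
        p_noun = freq / total / len(members)
        l_dat -= freq * math.log2(p_noun)
    return l_par + l_dat


def find_mdl(node, total):
    """Bottom-up search: keep a class if it is cheaper than its children's best cut."""
    kids = TREE.get(node)
    if not kids:
        return [node]
    child_cut = [c for kid in kids for c in find_mdl(kid, total)]
    if description_length([node], total) < description_length(child_cut, total):
        return [node]
    return child_cut


if __name__ == "__main__":
    print(find_mdl("ENTITY", sum(COUNTS.values())))
```

On this toy data the search returns the cut `['FOOD', 'ARTIFACT']`: the three food nouns collapse into one class, so a single rule stated over FOOD could stand in for three noun-specific rules.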
Keywords
Grammar Checker; Context Dependent Error Detection; Selectional Constraint Noun Classes; Generalization of an Error Decision Rule; MDL (Minimum Description Length); Tree Cut Model