http://dx.doi.org/10.3745/KTSDE.2013.2.12.865

Automatic Construction of Korean Two-level Lexicon using Lexical and Morphological Information  

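Kim, Bogyum (Department of Digital Informatics and Convergence, Chungbuk National University)
Lee, Jae Sung (Department of Digital Informatics and Convergence, Chungbuk National University)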
Publication Information
KIPS Transactions on Software and Data Engineering / v.2, no.12, 2013, pp. 865-872
Abstract
The two-level morphological analysis method is a rule-based approach to morphological analysis. It handles morphological transformations with rules and analyzes words using the morpheme connection information stored in a lexicon. The method is language-independent, and a Korean two-level system has also been developed. However, that system was of limited practical use because it relied on a very small, manually built lexicon, and it also suffered from an over-generation problem. In this paper, we propose a method for automatically constructing a Korean two-level lexicon for PC-KIMMO from a morpheme-tagged corpus. We also propose a method that addresses the over-generation problem using lexical information and sub-tags. In our experiments, the proposed method reduced over-generation by 68% compared with the previous method, and the F-measure increased from 39% to 65%.
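As a rough illustration of what "morpheme connection information" derived from a tagged corpus can look like, the sketch below collects, for each part-of-speech tag, the morphemes seen with that tag and the tags observed immediately after it within a word. It is not the authors' procedure: the function name `build_lexicon`, the end-of-word marker `#`, and the two-word toy corpus are assumptions made for this example, and the sketch neither writes PC-KIMMO lexicon files nor models the sub-tags mentioned in the abstract.

```python
from collections import defaultdict

END = "#"  # hypothetical marker meaning "the word may end after this tag"


def build_lexicon(tagged_words):
    """Collect morphemes per tag and the tag-to-tag continuations seen in the corpus."""
    entries = defaultdict(set)        # tag -> morphemes observed with that tag
    continuations = defaultdict(set)  # tag -> tags (or END) observed right after it
    for word in tagged_words:
        for i, (morph, tag) in enumerate(word):
            entries[tag].add(morph)
            # The continuation is the next morpheme's tag, or END for the last one.
            next_tag = word[i + 1][1] if i + 1 < len(word) else END
            continuations[tag].add(next_tag)
    return entries, continuations


# Toy corpus (assumed for illustration): each word is a list of (morpheme, tag) pairs.
corpus = [
    [("먹", "VV"), ("었", "EP"), ("다", "EF")],
    [("가", "VV"), ("다", "EF")],
]
entries, continuations = build_lexicon(corpus)
```

In this toy run, the verb tag `VV` is recorded as continuing to either `EP` or `EF`. A lexicon that licenses every observed tag sequence this coarsely will also accept combinations that never occur for a particular morpheme, which is the kind of over-generation the abstract says is reduced with lexical information and sub-tags.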
Keywords
Korean Two-level Morphology; Two-level Lexicon; Korean Morphotactic; Automatic Construction; Tagged Corpus