Korean Noun Extractor using Occurrence Patterns of Nouns and Post-noun Morpheme Sequences  

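Park, Yong-Hyun (Department of Computer Engineering, Dong-A University)
Hwang, Jae-Won (Department of Computer Engineering, Dong-A University)
Ko, Young-Joong (Department of Computer Engineering, Dong-A University)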
Abstract
As the performance of mobile devices has improved, demand for information retrieval has grown on mobile devices as well as on PCs. A traditional language analysis tool places a heavy load on a mobile device with limited memory when it is used to extract nouns from Korean text. As a result, there is a growing need for language analysis tools suited to mobile devices. This paper therefore proposes a new noun extraction method that uses post-noun morpheme sequences and noun patterns obtained from a large corpus. The proposed noun extractor needs a dictionary of only 146 KB and achieves an $F_1$-measure of 0.86; its noun dictionary is only 4% of the size of that of an existing noun extractor that uses a POS tagger. In addition, because it depends little on noun dictionaries, it easily extracts nouns even when they are unknown words.
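The general idea of using post-noun morpheme sequences can be illustrated with a minimal sketch. It is not the authors' implementation: the particle set `POST_NOUN_SEQUENCES`, the function name `extract_noun_candidates`, and the `min_len` threshold are all hypothetical, chosen by hand for illustration, whereas the paper obtains its sequences and noun patterns from a large corpus. The sketch strips the longest matching post-noun sequence from each space-delimited word (eojeol) and keeps the remaining stem as a noun candidate, with no noun dictionary involved.

```python
# Hypothetical, hand-picked post-noun morpheme sequences (particles).
# The paper derives its sequences from a large corpus; these are illustrative only.
POST_NOUN_SEQUENCES = {
    "은", "는", "이", "가", "을", "를", "의", "에",
    "에서", "에서는", "으로", "로", "와", "과", "도",
}


def extract_noun_candidates(text, sequences=POST_NOUN_SEQUENCES, min_len=2):
    """Return stems left after stripping a trailing post-noun sequence.

    Assumes the input is NFC-normalized, so each Hangul syllable is one character.
    """
    max_suffix = max(len(s) for s in sequences)
    candidates = []
    for eojeol in text.split():
        # Try the longest suffix first so that "에서는" wins over "는".
        # At least one character is always left for the stem.
        for k in range(min(max_suffix, len(eojeol) - 1), 0, -1):
            if eojeol[-k:] in sequences:
                stem = eojeol[:-k]
                if len(stem) >= min_len:
                    candidates.append(stem)
                break
    return candidates


if __name__ == "__main__":
    print(extract_noun_candidates("학교에서는 학생이 공부를 한다"))
    # prints ['학교', '학생', '공부']
```

A purely suffix-based rule like this over-generates: a noun that itself ends in a particle-like syllable, such as "고양이", would be cut incorrectly. Some additional filtering of the candidates is therefore needed in practice.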
Keywords
Mobile device; Noun extraction; Noun pattern; Unknown word extraction;