Error-driven Noun-Connection Rule Extraction for Morphological Analysis

(Original Korean title: Building a Dictionary of Left/Right Connection Rules for Compound Nouns Based on Errors)

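  • Kong Joo Lee (Department of Information and Communication Engineering, Chungnam National University) ;
  • Songwook Lee (Department of Computer and Information Engineering, Korea National University of Transportation)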
  • Received : 2012.10.22
  • Accepted : 2012.11.12
  • Published : 2012.11.30

Abstract

The goal of this research is to build, from observed analysis errors, a set of left/right noun-connection rules that a Korean morphological analyzer can use to segment compound nouns. We collected compound nouns from Web sites and analyzed them with the CnuMa morphological analyzer. Whenever we found an error in the analyzer's output, we wrote a noun-connection rule to correct it. The rules are written by considering the left and right contexts of nouns within a compound noun. The error-driven noun-connection rules improved the precision and recall of CnuMa by 2.8% and 10.8%, respectively.
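The abstract does not specify the rule format, so the following is only a minimal sketch of how a left/right connection rule could prune compound-noun segmentations. It assumes a rule is simply a forbidden (left noun, right noun) pair; the lexicon, the rule set, and all function names are hypothetical illustrations and are not taken from CnuMa.

```python
from typing import List, Set, Tuple

# Hypothetical noun lexicon and connection rules (illustration only, not CnuMa data).
LEXICON: Set[str] = {"대학", "대학생", "생선", "선교회", "교회"}
# Assumed rule format: a (left, right) pair that must not be adjacent in a compound.
BLOCKED_PAIRS: Set[Tuple[str, str]] = {("대학", "생선")}


def segmentations(word: str, lexicon: Set[str]) -> List[List[str]]:
    """Enumerate every way to split `word` into nouns found in `lexicon`."""
    if not word:
        return [[]]
    results: List[List[str]] = []
    for i in range(1, len(word) + 1):
        head = word[:i]
        if head in lexicon:
            for tail in segmentations(word[i:], lexicon):
                results.append([head] + tail)
    return results


def allowed(segmentation: List[str], blocked: Set[Tuple[str, str]]) -> bool:
    """Check each adjacent (left, right) noun pair against the blocked rules."""
    return all(pair not in blocked for pair in zip(segmentation, segmentation[1:]))


def segment(word: str, lexicon: Set[str], blocked: Set[Tuple[str, str]]) -> List[List[str]]:
    """Keep only the segmentations that violate no connection rule."""
    return [s for s in segmentations(word, lexicon) if allowed(s, blocked)]


if __name__ == "__main__":
    compound = "대학생선교회"
    print(segmentations(compound, LEXICON))      # all dictionary-based candidates
    print(segment(compound, LEXICON, BLOCKED_PAIRS))  # candidates left after the rules
```

In this sketch, an error-driven workflow would correspond to adding a pair to `BLOCKED_PAIRS` each time an incorrect segmentation is observed, then re-running the analysis to confirm the error disappears.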
