Syllable-based POS Tagging without Korean Morphological Analysis

Shim, Kwang-Seob

Korean Journal of Cognitive Science (인지과학)

Volume 22, Issue 3 / Pages 327-345 / 2011 / 1226-4067 (pISSN)

The Korean Society for Cognitive Science (한국인지과학회)

형태소 분석기 사용을 배제한 음절 단위의 한국어 품사 태깅

Shim, Kwang-Seob (School of Information Technology, Sungshin Women's University)

심광섭 (성신여자대학교 IT학부)

Received : 2011.07.20
Accepted : 2011.09.09
Published : 2011.09.30

Abstract

This paper proposes a new approach to Korean part-of-speech (POS) tagging. In previous work, a Korean POS tagger was treated as a post-processor for a morphological analyzer: the tagger selected the most likely morpheme/POS sequence from the candidates produced by morphological analysis. In the proposed approach, by contrast, the POS tagger generates the most likely sequence of morpheme/POS pairs directly from the input sentence, working at the syllable level without a morphological analyzer. A POS-tagged corpus of 398,632 eojeols was used for training and a test set of 33,467 eojeols for evaluation. The proposed approach achieves an eojeol-level tagging accuracy of 96.31%.

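This paper proposes a syllable-based Korean POS tagging method that does not use a morphological analyzer. In previous work, a Korean POS tagger chooses, from the results generated by a morphological analyzer, the morpheme/POS sequence that best fits the context; the method proposed here not only determines the POS sequence but also generates the morphemes. Training on 398,632 eojeols and evaluating on 33,467 eojeols yielded an eojeol-level accuracy of 96.31%.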
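The abstract reports accuracy at the eojeol level (an eojeol being a space-delimited unit of Korean text) but does not spell out how the score is computed. The sketch below shows one plausible reading, under the assumption that an eojeol counts as correct only when its entire predicted morpheme/POS sequence matches the gold analysis. The function name `eojeol_accuracy`, the `Analysis` type, the toy sentences, and the tag labels are all hypothetical illustrations, not taken from the paper.

```python
from typing import List, Tuple

# One analysis of an eojeol: a sequence of (morpheme, POS tag) pairs.
# This representation is an assumption made for illustration.
Analysis = List[Tuple[str, str]]


def eojeol_accuracy(gold: List[Analysis], predicted: List[Analysis]) -> float:
    """Return the fraction of eojeols whose full morpheme/POS sequence matches gold."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same eojeols")
    if not gold:
        raise ValueError("evaluation set is empty")
    # An eojeol is correct only if every morpheme and every tag agrees.
    correct = sum(1 for g, p in zip(gold, predicted) if g == p)
    return correct / len(gold)


# Hypothetical toy data; the tag labels are illustrative only.
gold = [
    [("나", "NP"), ("는", "JX")],
    [("학교", "NNG"), ("에", "JKB")],
    [("가", "VV"), ("ㄴ다", "EF")],
    [(".", "SF")],
]
predicted = [
    [("나", "NP"), ("는", "JX")],
    [("학교", "NNG"), ("에", "JKB")],
    [("가", "VV"), ("ㄴ다", "EC")],  # one wrong tag makes the whole eojeol wrong
    [(".", "SF")],
]

score = eojeol_accuracy(gold, predicted)
print(f"eojeol-level accuracy: {score:.2%}")
```

Under this all-or-nothing definition, a single wrong tag or a wrong morpheme boundary inside an eojeol counts the whole eojeol as an error, so the score is stricter than a per-morpheme accuracy would be.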
