An Automatic Korean Lexical Acquisition System

Lim, Heui-Seok

Journal of the Korea Academia-Industrial cooperation Society (한국산학기술학회논문지)

Volume 8, Issue 5 / Pages 1087-1091 / 2007 / 1975-4701 (pISSN) / 2288-4688 (eISSN)

The Korea Academia-Industrial cooperation Society (한국산학기술학회)

Lim, Heui-Seok (Division of Computer Engineering, Hanshin University)

Published: 2007.10.31

Abstract

This paper proposes a computational system for automatic Korean lexical acquisition that reflects the principles of human language acquisition. Taking a Korean corpus as input, the system automatically builds two lexicons for use in language recognition: a full-form lexicon of Eojeols and a decomposition lexicon of morphemes. In an experiment on the Korean Sejong corpus of 10 million Eojeols, the system acquired 2,097 full-form Eojeols and 3,488 morphemes. The accumulated frequency of the acquired full-form Eojeols covers 38.63% of the input corpus, and the accuracy of morpheme acquisition is 99.87%.
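
To make the coverage figure concrete, the following is a minimal sketch of how a full-form lexicon and its corpus coverage could be computed. The abstract does not state the acquisition criterion, so the frequency threshold `min_count`, the function names, and the toy corpus are all assumptions for illustration, not the paper's method. For scale, 38.63% of 10 million Eojeols corresponds to roughly 3.86 million corpus tokens.

```python
from collections import Counter


def acquire_full_forms(eojeols, min_count):
    """Collect Eojeols seen at least min_count times.

    The threshold rule is a hypothetical stand-in; the abstract does not
    describe the actual acquisition criterion.
    """
    counts = Counter(eojeols)
    lexicon = {word for word, c in counts.items() if c >= min_count}
    return lexicon, counts


def coverage(lexicon, counts):
    """Fraction of corpus tokens whose Eojeol is in the lexicon."""
    total = sum(counts.values())
    covered = sum(c for word, c in counts.items() if word in lexicon)
    return covered / total if total else 0.0


# Toy corpus of 9 whitespace-separated Eojeols (illustrative only).
corpus = "나는 학교에 간다 나는 집에 간다 나는 밥을 먹는다".split()
lexicon, counts = acquire_full_forms(corpus, min_count=2)

print(sorted(lexicon))
print(f"{coverage(lexicon, counts):.2%}")
```

In this toy corpus only two distinct Eojeols pass the threshold, yet they account for 5 of the 9 tokens. The same effect, a small set of frequent full forms covering a large share of running text, is what the reported 2,097 Eojeols and 38.63% coverage describe.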
