Determination of an Optimal Sentence Segmentation Position using Statistical Information and Genetic Learning


Journal of the Korean Institute of Telematics and Electronics C (전자공학회논문지C)

Volume 35C, Issue 10 / Pages 38-47 / 1998 / 1226-5853 (pISSN)

The Institute of Electronics and Information Engineers (대한전자공학회)

통계 정보와 유전자 학습에 의한 최적의 문장 분할 위치 결정

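Kim, Sung-Dong (김성동; Department of Computer Engineering, Seoul National University) ;
Kim, Yung-Taek (김영택; Department of Computer Engineering, Seoul National University)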

Published : 1998.10.01

Abstract

Syntactic analysis for practical machine translation must be able to handle long sentences, but analyzing a long sentence is difficult because of its high analysis complexity. This paper proposes a sentence segmentation method for the efficient analysis of long sentences and introduces a method for determining optimal segmentation positions using statistical information and genetic learning. The method consists of two modules: (1) decomposable position determination, which identifies candidate segmentation positions in the input sentence using lexical contextual constraints acquired from training data tagged with segmentation positions; and (2) segmentation position selection, which uses a selection function whose parameter weights are determined through genetic learning to choose, among the candidates, safe segmentation positions that improve analysis efficiency as much as possible. Experiments show that the proposed method segments sentences safely and improves the efficiency of analysis.

실용적인 기계번역 시스템을 위한 구문 분석은 긴 문장의 분석을 허용하여야 하는데 긴 문장의 분석은 높은 분석의 복잡도 때문에 매우 어려운 문제이다. 본 논문에서는 긴 문장의 효율적인 분석을 위해 문장을 분할하는 방법을 제안하며 통계 정보와 유전자 학습에 의한 최적의 문장 분할 위치 결정 방법을 소개한다. 문장 분할 위치의 결정은 분할 위치가 태그된 훈련 데이타에서 얻어진 어휘 문맥 제한 조건을 이용하여 입력문장의 분할 가능 위치를 결정하는 부분과 여러 개의 분할 가능 위치 중에서 안전한 분할을 보장하고 보다 많은 분석의 효율 향상을 얻을 수 있는 최적의 분할 위치를 학습을 통해 선택하는 부분으로 구성된다. 실험을 통해 제안된 문장 분할 위치 결정 방법이 안전한 분할을 수행하며 문장 분석의 효율을 향상시킴을 보인다.
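
The two modules described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the form of the lexical contextual constraints, the parameters of the selection function, or the genetic operators. The sketch assumes word-pair contexts around a tagged position, two made-up parameters (relative position and segment balance), and a simple genetic algorithm with truncation selection, uniform crossover, and Gaussian mutation. All names and data (`TRAIN`, `EXAMPLES`, `learn_contexts`, `candidate_positions`, `select`, `evolve`) are hypothetical.

```python
import random

# Hypothetical training data: tokenized sentences and the token indices
# before which a segmentation position was tagged.
TRAIN = [
    ("i stayed home because it rained and she went out".split(), {3, 6}),
    ("he left when the bell rang but we stayed".split(), {2, 6}),
]

# Hypothetical selection data: (sentence length, candidate positions, tagged best position).
EXAMPLES = [(12, [3, 6, 9], 6), (10, [3, 6], 6), (9, [2, 6], 6)]


def learn_contexts(train):
    """Collect (previous word, next word) pairs seen at tagged positions (assumed constraint form)."""
    return {(tokens[p - 1], tokens[p]) for tokens, positions in train for p in positions}


def candidate_positions(tokens, contexts):
    """Module 1: positions whose lexical context matches a learned constraint."""
    return [i for i in range(1, len(tokens)) if (tokens[i - 1], tokens[i]) in contexts]


def features(n, pos):
    """Assumed parameters: relative position and balance of the two segments."""
    return (pos / n, min(pos, n - pos) / n)


def select(n, candidates, weights):
    """Module 2: pick the candidate with the highest weighted score."""
    return max(candidates, key=lambda p: sum(w * f for w, f in zip(weights, features(n, p))))


def fitness(weights, examples):
    """Fraction of examples where the selected position equals the tagged one."""
    return sum(select(n, c, weights) == gold for n, c, gold in examples) / len(examples)


def evolve(examples, pop_size=20, generations=30, seed=0):
    """Fit the selection weights with a small genetic algorithm (assumed operators)."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda w: fitness(w, examples), reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; parents survive unchanged
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [rng.choice(pair) for pair in zip(a, b)]  # uniform crossover
            children.append([w + rng.gauss(0, 0.1) for w in child])  # Gaussian mutation
        pop = parents + children
    return max(pop, key=lambda w: fitness(w, examples))


if __name__ == "__main__":
    contexts = learn_contexts(TRAIN)
    sentence = "she stayed home because he left when it rained and we slept".split()
    cands = candidate_positions(sentence, contexts)
    weights = evolve(EXAMPLES)
    print(cands, select(len(sentence), cands, weights))
```

In this sketch the fitness only rewards agreement with tagged positions. A fitness closer to the abstract's goal would also reward the reduction in parsing cost obtained from each segmentation, which requires a parser and is left out here.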

