• Title/Summary/Keyword: chart parsing (차트 파싱)

Search Result 12

Development of Broad-Coverage Korean Dependency Parser BCD-KL-Parser (한국어 구문분석 시스템 BCD-KL-Parser의 개발)

  • Kim, Minho; Kim, Seongtae; Kwon, Hyuk-Chul
    • Annual Conference on Human and Language Technology / 2018.10a / pp.3-7 / 2018
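  • We are developing BCD-KL-Parser, a Korean syntactic parsing system that assigns appropriate dependency relations to every morphological analysis candidate and presents a ranked list of parse tree candidates. The ultimate goal of the system is to raise both parsing accuracy and execution speed by reducing the number of morphological analysis candidates and parse tree candidates. In the BCD-KL-Parser introduced in this paper, we define morphological disambiguation rules to reduce the number of morphological analysis candidates, and we define subcategorization and selectional restriction information for predicates, together with dependency constraint rules, to minimize the number of parse tree candidates. As a result, the system achieved a UAS of 92.27% on 2,167 sentences randomly sampled from the '21st Century Sejong Project syntactically parsed corpus'.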

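The UAS (unlabeled attachment score) cited in the abstract above is the share of tokens whose predicted head matches the gold-standard head, ignoring relation labels. The following is a minimal sketch of that computation, not code from the paper: the function name, the head-index encoding (0 for the root), and the sample data are all illustrative assumptions.

```python
def unlabeled_attachment_score(gold_heads, predicted_heads):
    """Return the fraction of tokens whose predicted head equals the gold head.

    Both arguments are flat lists of head indices, one per token.
    Relation labels are ignored, which is what makes the score "unlabeled".
    """
    if len(gold_heads) != len(predicted_heads):
        raise ValueError("gold and predicted head lists must have the same length")
    if not gold_heads:
        raise ValueError("cannot score an empty token list")
    correct = sum(1 for g, p in zip(gold_heads, predicted_heads) if g == p)
    return correct / len(gold_heads)


# Hypothetical 4-token sentence; head index 0 denotes the root.
gold = [2, 0, 4, 2]
pred = [2, 0, 2, 2]  # the third token is attached to the wrong head
print(f"UAS = {unlabeled_attachment_score(gold, pred):.2%}")  # prints "UAS = 75.00%"
```

A corpus-level score is usually obtained by concatenating the head lists of all sentences before calling the function, so that longer sentences weigh more than shorter ones.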

Three-Phase English Syntactic Analysis for Improving the Parsing Efficiency (영어 구문 분석의 효율 개선을 위한 3단계 구문 분석)

  • Kim, Sung-Dong
    • KIPS Transactions on Software and Data Engineering / v.5 no.1 / pp.21-28 / 2016
  • The performance of an English-Korean machine translation system depends heavily on its English parser. The parser described in this paper is part of a rule-based English-Korean MT system; it has many syntactic rules and performs chart-based parsing. Because of the large number of rules, the parser generates too many structures and therefore requires much time and memory. A rule-based parser also has difficulty analyzing and translating long sentences that contain commas, because they cause high parsing complexity. In this paper, we propose a 3-phase parsing method with sentence segmentation to efficiently translate the long sentences that commonly occur. Each phase of the syntactic analysis applies its own independent set of syntactic rules in order to reduce parsing complexity. For this purpose, we classify the syntactic rules into three classes and design a 3-phase parsing algorithm. In particular, the rules in the third class handle sentence structures composed with commas. We present a method for automatically acquiring third-class rules from the syntactic analysis of a corpus, with which we aim to continuously improve parsing coverage. The experimental results show that the proposed 3-phase parsing method outperforms the prior parsing method, which uses only intra-sentence segmentation, in parsing speed and memory efficiency while maintaining translation quality.
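To make the cost of chart-based parsing mentioned in the abstract above concrete, here is a minimal CYK-style chart recognizer. It is a generic textbook technique, not the parser from the paper; the toy grammar, the lexicon, and the function names are hypothetical. The three nested loops over span width, start position, and split point show why long sentences and large rule sets inflate the chart.

```python
from itertools import product

# Toy grammar in Chomsky normal form (hypothetical, for illustration only).
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("V", "NP"): {"VP"},
}
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "chased": {"V"},
}


def cyk_chart(tokens):
    """Fill a chart where chart[i][j] holds the categories spanning tokens[i:j]."""
    n = len(tokens)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    # Width-1 spans come straight from the lexicon.
    for i, token in enumerate(tokens):
        chart[i][i + 1] = set(LEXICON.get(token, ()))
    # Wider spans are built by combining two adjacent, already-filled spans.
    for width in range(2, n + 1):
        for start in range(n - width + 1):
            end = start + width
            for mid in range(start + 1, end):
                for left, right in product(chart[start][mid], chart[mid][end]):
                    chart[start][end] |= BINARY_RULES.get((left, right), set())
    return chart


def recognizes(tokens, goal="S"):
    """Return True if the whole token sequence can be derived from `goal`."""
    if not tokens:
        return False
    return goal in cyk_chart(tokens)[0][len(tokens)]


print(recognizes("the dog chased the cat".split()))  # prints True
print(recognizes("dog the chased".split()))          # prints False
```

Since the work grows with the cube of the sentence length, splitting a long sentence into shorter segments and parsing each with a smaller rule set reduces the number of cells and rule lookups, which is the general motivation the abstract gives for segmentation and phased rule application.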