Improving Parsing Efficiency Using Chunking in Chinese-Korean Machine Translation  

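양재형 (School of Computer and Media Engineering, Kangnam University)
심광섭 (School of Computer and Information Science, Sungshin Women's University)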
Abstract
This paper presents a chunking system that serves as a preprocessing module for the parser in a Chinese-to-Korean machine translation system, so that the parser can benefit from the dependency information the chunking module provides. The chunking system was implemented using transformation-based learning, and an effective interface that conveys the dependency information to the parser was also devised. The module was integrated into the machine translation system, and experiments were performed on corpora collected from Chinese websites. The experimental results show that introducing the chunking module noticeably improves the parser's performance.
Keywords
chunk; chunking; transformation-based learning; natural language parsing; machine translation