
Korean Syntactic Rules using Composite Labels  

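김성용 (Agency for Defense Development)
이공주 (Ewha Womans University, Computer Science)
최기선 (Korea Advanced Institute of Science and Technology, Department of Computer Science)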
Abstract
We propose a format for a binary phrase structure grammar with composite labels. The grammar adopts binary rules so that the dependency between two sub-trees can be represented in the label of the tree. The label of a tree is composed of two attributes, one extracted from each sub-tree, so that it represents the compositional information of the tree. The composite labels are generated from part-of-speech tags by an automatic labeling algorithm. Because the proposed rule description scheme is binary and uses only part-of-speech information, it can readily be used with dependency grammar and applied to other languages as well. In best-1 context-free cross-validation on a tree-tagged corpus of size 31,080, the labeled precision is 79.30%, which exceeds that of phrase structure grammar and dependency grammar by 5% and 4%, respectively. This shows that the proposed rule description scheme is effective for parsing Korean.
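As a rough illustration of the two ideas in the abstract, the sketch below labels each binary node with a pair of attributes, one taken from each sub-tree, and then scores a candidate parse by labeled precision (matched labeled constituents divided by constituents in the candidate). The abstract does not describe the automatic labeling algorithm, so the attribute rule used here (the part-of-speech tag of the rightmost leaf) is purely an assumption, as are the tag names `N`, `J`, and `V` and all function names.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Leaf:
    pos: str  # part-of-speech tag of one word (hypothetical tag set)


@dataclass(frozen=True)
class Node:
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]


def attribute(tree: Tree) -> str:
    """Assumed attribute rule: the POS tag of the rightmost leaf."""
    while isinstance(tree, Node):
        tree = tree.right
    return tree.pos


def composite_label(node: Node) -> str:
    """Combine one attribute from each sub-tree into a single label."""
    return f"{attribute(node.left)}+{attribute(node.right)}"


def labeled_brackets(tree: Tree, start: int = 0):
    """Return (end position, multiset of (label, start, end) constituents)."""
    if isinstance(tree, Leaf):
        return start + 1, Counter()
    mid, left = labeled_brackets(tree.left, start)
    end, right = labeled_brackets(tree.right, mid)
    brackets = left + right
    brackets[(composite_label(tree), start, end)] += 1
    return end, brackets


def labeled_precision(candidate: Tree, gold: Tree) -> float:
    """Share of candidate constituents whose label and span match the gold tree."""
    _, cand = labeled_brackets(candidate)
    _, ref = labeled_brackets(gold)
    matched = sum((cand & ref).values())
    total = sum(cand.values())
    return matched / total if total else 0.0


# Two binary analyses of the same four-tag sequence N J N V.
gold = Node(Node(Leaf("N"), Leaf("J")), Node(Leaf("N"), Leaf("V")))
candidate = Node(Node(Node(Leaf("N"), Leaf("J")), Leaf("N")), Leaf("V"))

if __name__ == "__main__":
    print(composite_label(gold))
    print(f"{labeled_precision(candidate, gold):.2%}")
```

Because every rule is binary, the label of a node records which kind of left sub-tree combined with which kind of right sub-tree, which is the sense in which a dependency between the two can be read off the label. A different attribute rule can be swapped into `attribute` without changing the scoring code.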
Keywords
syntactic analysis; Korean; agglutination; binary rules; composite label; automatic labeling algorithm