Shallow Parsing on Grammatical Relations in Korean Sentences

Lee, Song-Wook; Seo, Jung-Yun

Journal of KIISE:Software and Applications (한국정보과학회논문지:소프트웨어및응용)

Volume 32, Issue 10 / Pages 984-989 / 2005 / 1229-6848 (pISSN)

Korean Institute of Information Scientists and Engineers (한국정보과학회)


한국어 문법관계에 대한 부분구문 분석

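Lee, Song-Wook (Department of Computer Engineering, Dongseo University);
Seo, Jung-Yun (Department of Computer Science, Sogang University)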

Published: 2005.10.01


Abstract

This study aims to identify grammatical relations (GRs) in Korean sentences. The key task is to find the GRs in a sentence in terms of GR categories such as subject, object, and adverbial. In solving this problem, we are faced with many ambiguities. We propose a statistical model that first resolves the ambiguity of the grammatical relation itself and then finds the correct noun phrase (NP) arguments of a given verb phrase (VP) by using the probabilities of the GRs given the NPs and VPs in the sentence. The proposed model exploits characteristics of the Korean language such as distance, no-crossing, and the case property. We estimate the probability of a GR given an NP and a VP with Support Vector Machine (SVM) classifiers. In an experiment that used a tree- and GR-tagged corpus to train the model, we achieved an overall accuracy of 84.8%, 94.1%, and 84.8% in identifying subject, object, and adverbial relations in sentences, respectively.

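The purpose of this study is to analyze grammatical relations in Korean sentences. The main problem is to find the subject, object, and adverbial of a sentence. To solve it, the various ambiguities that arise in Korean syntactic analysis must be taken into account. We propose a statistical method that first resolves the ambiguity of the grammatical relation and then resolves the predicate-argument ambiguity of a verb phrase by using the grammatical-relation probabilities of the given noun phrases and verb phrases. The proposed method reflects characteristics of Korean such as the distance between eojeols (word phrases), the prohibition of crossing structures, and the principle of one case per sentence. The probability of a grammatical relation between a verb phrase and a noun phrase was estimated with support vector classifiers. The method learned grammatical relations automatically from a corpus annotated with grammatical relations and syntactic structure, and achieved performance of 84.8%, 94.1%, and 84.8% on subject, object, and adverbial relations, respectively.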
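
The second stage described in the abstract, choosing NP arguments for each VP from GR probabilities under structural constraints, can be illustrated with a small sketch. This is not the authors' implementation. It assumes that the probabilities are already available in a plain dictionary (standing in for the SVM-based estimates), that an NP attaches only to a VP on its right, that the case property means at most one subject and one object per VP, and that a brute-force search over all assignments is acceptable for a short sentence. The names `best_assignment`, `crosses`, `GRS`, and `UNIQUE_GRS` are hypothetical.

```python
from itertools import product
from math import prod

GRS = ("subj", "obj", "adv")
# Assumption: the case property allows at most one subject and one object per VP.
UNIQUE_GRS = {"subj", "obj"}


def crosses(arc_a, arc_b):
    """Return True if two (np_pos, vp_pos) arcs cross rather than nest."""
    (l1, r1), (l2, r2) = sorted([arc_a, arc_b])
    return l1 < l2 < r1 < r2


def best_assignment(np_positions, vp_positions, prob):
    """Pick one (vp_pos, gr) per NP, maximizing the product of probabilities.

    prob maps (np_pos, vp_pos, gr) to a probability; missing keys count as 0.
    Returns (assignment dict or None, score).
    """
    # Candidate (VP, GR) pairs per NP; assumption: the VP follows the NP.
    options = [
        [(v, g) for v in vp_positions if v > n
         for g in GRS if prob.get((n, v, g), 0.0) > 0.0]
        for n in np_positions
    ]
    best, best_score = None, 0.0
    for combo in product(*options):
        arcs = [(n, v) for n, (v, _) in zip(np_positions, combo)]
        # No-crossing constraint: reject any pair of crossing arcs.
        if any(crosses(a, b) for i, a in enumerate(arcs) for b in arcs[i + 1:]):
            continue
        # Case constraint: no VP receives two subjects or two objects.
        unique = [(v, g) for v, g in combo if g in UNIQUE_GRS]
        if len(unique) != len(set(unique)):
            continue
        score = prod(prob[(n, v, g)] for n, (v, g) in zip(np_positions, combo))
        if score > best_score:
            best, best_score = dict(zip(np_positions, combo)), score
    return best, best_score


# Hypothetical toy input: two NPs (positions 0, 1) and two VPs (positions 2, 3).
toy_prob = {
    (0, 2, "subj"): 0.9, (0, 3, "subj"): 0.6,
    (1, 2, "obj"): 0.5, (1, 3, "obj"): 0.8,
}
assignment, score = best_assignment([0, 1], [2, 3], toy_prob)
```

In the toy input, the highest-scoring unconstrained choice (NP 0 to VP 2, NP 1 to VP 3) is rejected because its arcs cross, so the search falls back to the best assignment that satisfies both constraints. Exhaustive enumeration grows exponentially with the number of NPs; it is used here only to keep the sketch short.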


