Cascaded Parsing Korean Sentences Using Grammatical Relations

문법관계 정보를 이용한 단계적 한국어 구문 분석 (Cascaded Korean Syntactic Parsing Using Grammatical Relation Information)

  • 이성욱 (Department of Computer Science, Chungju National University)
  • Published : 2008.02.29

Abstract

This study identifies dependency structures in Korean sentences using cascaded chunking. In the first stage of the cascade, we find NP chunks and, for every possible modifier-head pair of chunks, estimate its grammatical relation (GR) with Support Vector Machine (SVM) classifiers; the GR categories include subject, object, complement, and adverbial. In each subsequent stage, which handles one GR, we filter out incorrect modifier-head relations using the SVM classifiers and characteristics of the Korean language such as the distance between the related words, the no-crossing constraint, and the case property. In an experiment using a parsed, GR-tagged corpus to train the proposed parser, we achieved an overall accuracy of 85.7%.

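This study proposes a cascaded dependency parser that determines the dependency structure of Korean sentences. Each stage selects the correct grammatical relations from a given sequence of candidate grammatical relations, and the stages run independently of one another, one per type of target grammatical relation. The candidates are estimated by pre-trained support vector machines, which assign one of seven grammatical relations such as subject, object, complement, and adverbial. Each stage determines its target grammatical relation using a support vector machine classifier together with characteristics of Korean such as the distance between eojeols (word phrases), the prohibition of crossing structures, and the case restriction principle. Once all stages have run, the complete dependency structure and its grammatical relations are determined. The proposed system was implemented and evaluated on a corpus annotated with trees and grammatical relations, and achieved an accuracy of about 85.7%.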
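
The cascade described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the names `Candidate`, `crosses`, and `cascade` are hypothetical, the scores stand in for SVM decision values, and the stage order and greedy selection are assumptions. The sketch also assumes that the case property means a head takes at most one subject, one object, and one complement. Ties in score are broken in favor of the shorter modifier-head distance.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    modifier: int   # index of the modifier chunk
    head: int       # index of the head chunk
    relation: str   # GR label, e.g. "subject"
    score: float    # stand-in for an SVM decision value (assumption)

# Assumed reading of the case property: these GRs occur at most once per head.
UNIQUE_PER_HEAD = {"subject", "object", "complement"}

def crosses(a, b):
    """Return True if the two dependency arcs cross each other."""
    lo1, hi1 = sorted((a.modifier, a.head))
    lo2, hi2 = sorted((b.modifier, b.head))
    return lo1 < lo2 < hi1 < hi2 or lo2 < lo1 < hi2 < hi1

def cascade(candidates, stage_order):
    """Run one filtering stage per GR and collect the surviving arcs."""
    accepted = []
    for relation in stage_order:
        stage = [c for c in candidates if c.relation == relation]
        # Highest score first; shorter distance wins ties.
        stage.sort(key=lambda c: (-c.score, abs(c.head - c.modifier)))
        for cand in stage:
            if cand.score <= 0:
                continue  # the classifier rejects this pair
            if any(a.modifier == cand.modifier for a in accepted):
                continue  # each modifier attaches to exactly one head
            if relation in UNIQUE_PER_HEAD and any(
                a.head == cand.head and a.relation == relation for a in accepted
            ):
                continue  # case property (assumed uniqueness per head)
            if any(crosses(cand, a) for a in accepted):
                continue  # no-crossing constraint
            accepted.append(cand)
    return sorted(accepted, key=lambda c: c.modifier)

# Toy sentence with four chunks: 0 "나는", 1 "어제", 2 "책을", 3 "읽었다".
candidates = [
    Candidate(0, 3, "subject", 1.2),
    Candidate(0, 2, "subject", -0.3),
    Candidate(2, 3, "object", 0.9),
    Candidate(1, 3, "adverbial", 0.5),
    Candidate(1, 2, "adverbial", 0.1),
]
parsed = cascade(candidates, ["subject", "object", "adverbial"])
for arc in parsed:
    print(arc.modifier, "->", arc.head, arc.relation)
```

In this toy example every chunk ends up attached to the final predicate. A real system would replace the hand-set scores with the outputs of trained classifiers.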

Cited by

  1. An Analysis of Korean Dependency Relation by Homograph Disambiguation, vol.3, no.6, 2014, https://doi.org/10.3745/KTSDE.2014.3.6.219