Korean Semantic Role Labeling Using Case Frame Dictionary and Subcategorization

  • Received: 2016.08.30
  • Reviewed: 2016.10.10
  • Published: 2016.12.15

Abstract

For a machine to process sentences as people do, it must learn the full range of expression found in human-written sentences and be able to analyze and process them accordingly. The foundation for this is linguistic information processing. Syntactic analysis requires dividing a sentence into its constituents, finding the obligatory arguments of the predicate, which is the core of the sentence, and identifying the semantic role that each argument bears to that predicate. In this study, we determine semantic roles with a CRF model whose features draw on a case frame dictionary built from the Standard Korean Language Dictionary of the National Institute of Korean Language and on the subcategorization of predicates in the Korean Lexical Semantic Network (UWordMap). In an automatic semantic role tagging experiment, the CRF model, which used the eojeols (word phrases) of the sentence, the predicates, the case frame dictionary, and the hypernyms of words as features, achieved an accuracy of 83.13%, higher than the 81.2% obtained by an existing rule-based semantic role tagging method.
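As a rough illustration of the feature design the abstract describes, the following minimal Python sketch builds one feature dictionary per eojeol from the word itself, the predicate, a case frame lookup, and a hypernym lookup. Everything concrete here is an assumption made for the example and is not taken from the paper: the names `CASE_FRAMES`, `HYPERNYMS`, `SUBCATEGORY`, `eojeol_features`, and `sentence_features`, the toy dictionary entries, the role labels `AGT` and `THM`, and the premise that each eojeol has already been split into a stem and a case particle by a morphological analyzer.

```python
from typing import Dict, List, Tuple

# (stem, case particle); assumes prior morphological analysis of each eojeol.
Eojeol = Tuple[str, str]

# Toy stand-ins invented for illustration only (not the paper's resources).
# Case frame dictionary: predicate -> {case particle: candidate semantic role}.
CASE_FRAMES: Dict[str, Dict[str, str]] = {
    "먹다": {"가": "AGT", "를": "THM"},
}
# Hypernym map standing in for a lexical semantic network lookup.
HYPERNYMS: Dict[str, str] = {"철수": "사람", "사과": "과일"}
# Subcategory of each predicate, also a hypothetical entry.
SUBCATEGORY: Dict[str, str] = {"먹다": "섭취"}


def eojeol_features(sentence: List[Eojeol], i: int, predicate: str) -> Dict[str, str]:
    """Build the feature dictionary for the i-th eojeol of the sentence."""
    stem, particle = sentence[i]
    frame = CASE_FRAMES.get(predicate, {})
    return {
        "stem": stem,
        "particle": particle or "NONE",
        "predicate": predicate,
        "pred_subcat": SUBCATEGORY.get(predicate, "UNK"),
        "hypernym": HYPERNYMS.get(stem, "UNK"),
        # Role suggested by the case frame for this particle, if any.
        "frame_role": frame.get(particle, "NONE"),
    }


def sentence_features(sentence: List[Eojeol], predicate: str) -> List[Dict[str, str]]:
    """Feature dictionaries for every eojeol, in sentence order."""
    return [eojeol_features(sentence, i, predicate) for i in range(len(sentence))]


# "철수가 사과를 먹었다" (Cheolsu ate an apple), pre-analyzed by hand.
sentence = [("철수", "가"), ("사과", "를"), ("먹다", "")]
features = sentence_features(sentence, "먹다")
for f in features:
    print(f)
```

A sequence of feature dictionaries like this, paired with one gold role label per eojeol, is the kind of input a linear-chain CRF toolkit is typically trained on. The case frame lookup only proposes a candidate role as a feature; in this sketch the final label would still be decided by the trained model.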

Keywords

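Project Information

Research project number: Development of Core Technology for Human-like Self-learning Intelligence Based on Symbolic Approach

Supervising institution: Institute for Information & Communications Technology Promotion (IITP)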
