Antecedent Decision Rules of Personal Pronouns for Coreference Resolution

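  • Seung-Shik Kang (School of Computer Science, Kookmin University) ;
  • Bo-Hyun Yun (Department of Computer Education, Mokwon University) ;
  • Chong-Woo Woo (School of Computer Science, Kookmin University)
  • Published: 2004.04.01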

Abstract

Extracting representative terms from documents for an information retrieval system, or extracting specific information in information extraction and text mining, requires resolving anaphora that refer to proper nouns. The most representative anaphora resolution problem is deciding the antecedent of a pronoun that refers to a person noun. In this paper, we propose a method for deciding the antecedents of the Korean third-person pronouns “그/그녀/그들/그녀들” (he/she/they/they, feminine) so that document content can be analyzed more accurately. In general, the antecedent of a third-person pronoun is often the subject of the current or previous sentence, and third-person pronouns are frequently repeated two or more times. Based on these characteristics, we examined which of the person nouns appearing in the current and previous sentences are used as antecedents, and derived antecedent decision rules from these observations. Because the heuristic rules differ slightly according to the case of the pronoun, we describe them separately for the subjective, objective, and possessive cases. To validate the proposed method, we selected 300 test instances, 100 for each pronoun case, from newspaper articles on politics. In the experiment, antecedent decision for third-person pronouns achieved a recall of 79.0% and a precision of 86.8%.
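The abstract does not list the actual decision rules, so the following is only a minimal sketch of the general idea it describes: prefer a person noun that is the subject of the current sentence, then the subject of the previous sentence, and give no answer otherwise. The `Mention` structure, the function names, and the fixed search order are all hypothetical; the paper's rules vary with the case of the pronoun, which this sketch does not model. The metric function assumes the usual definitions, where recall is measured against all test instances and precision only against the instances the rules actually answered.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Mention:
    """A person noun found in a sentence (hypothetical structure)."""
    text: str
    sentence_index: int  # 0 = first sentence of the document
    is_subject: bool     # True if the noun is the subject of its sentence


def pick_antecedent(pronoun_sentence: int,
                    mentions: List[Mention]) -> Optional[Mention]:
    """Illustrative subject-preference heuristic, not the paper's rule set.

    Looks for a subject person noun in the pronoun's own sentence first,
    then in the previous sentence. Returns None when neither is found.
    """
    for sentence in (pronoun_sentence, pronoun_sentence - 1):
        for mention in mentions:
            if mention.sentence_index == sentence and mention.is_subject:
                return mention
    return None


def recall_precision(correct: int, attempted: int,
                     total: int) -> Tuple[float, float]:
    """Recall = correct / total instances; precision = correct / attempted."""
    return correct / total, correct / attempted


# Toy data (invented for illustration only).
mentions = [
    Mention("Kim", 0, True),
    Mention("Lee", 0, False),
    Mention("Park", 1, False),
]
chosen = pick_antecedent(1, mentions)  # pronoun occurs in sentence 1
print(chosen.text if chosen else "no decision")
```

Because the heuristic may decline to answer, the number of attempted decisions can be smaller than the number of test instances, which is why precision and recall can differ.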
