
Homonym Disambiguation based on Mutual Information and Sense-Tagged Compound Noun Dictionary  

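Heo, Jeong (Knowledge Mining Research Team, Electronics and Telecommunications Research Institute)
Seo, Hee-Cheol (Knowledge Mining Research Team, Electronics and Telecommunications Research Institute)
Jang, Myung-Gil (Knowledge Mining Research Team, Electronics and Telecommunications Research Institute)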
Abstract
The goal of Natural Language Processing (NLP) is to make a computer understand natural language and to deliver the meaning of natural language to humans. Word Sense Disambiguation (WSD) is an important technology for achieving this goal. In this paper, we describe a technique for automatic homonym disambiguation that uses both Mutual Information (MI) and a Sense-Tagged Compound Noun Dictionary. Previous work that used word definitions in a dictionary suffered from data sparseness because it relied on exact word matching. Our work overcomes this problem by using MI, a measure of association between words. To reflect language features, the rate of word pairs with MI values, the sense frequency, and the size of the word definitions are used as weights in our system. We constructed a Sense-Tagged Compound Noun Dictionary for high-frequency compound nouns and used it to disambiguate homonyms. The experimental data for testing and evaluating our system were constructed from Question Answering (QA) test data consisting of about 200 query sentences and answer paragraphs. We performed four types of experiments. Using MI alone, the system achieved a precision of 65.06%. With the weighted values, precision was 85.35%, and with the Sense-Tagged Compound Noun Dictionary it was 88.82%.
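The idea of scoring senses by MI instead of exact word matching can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give their formulas, so the sketch assumes pointwise MI estimated from sentence-level co-occurrence, treats pairs that never co-occur as contributing 0, and leaves out the three weights and the compound noun dictionary. All function names, the toy corpus, and the English example senses are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def train_mi(sentences):
    """Count word and word-pair occurrences, one count per sentence."""
    word_counts = Counter()
    pair_counts = Counter()
    for sent in sentences:
        words = set(sent)
        word_counts.update(words)
        pair_counts.update(frozenset(p) for p in combinations(sorted(words), 2))
    return word_counts, pair_counts, len(sentences)


def mutual_information(w1, w2, word_counts, pair_counts, n):
    """Pointwise MI: log2(P(w1, w2) / (P(w1) * P(w2))).

    Assumption: pairs that never co-occur contribute 0.0 rather than -inf.
    """
    joint = pair_counts[frozenset((w1, w2))]
    if joint == 0:
        return 0.0
    return math.log2(joint * n / (word_counts[w1] * word_counts[w2]))


def score_sense(context, definition, stats):
    """Sum MI over every (context word, definition word) pair."""
    return sum(mutual_information(c, d, *stats)
               for c in context for d in definition)


def disambiguate(context, sense_definitions, stats):
    """Pick the sense whose definition words associate most with the context."""
    return max(sense_definitions,
               key=lambda s: score_sense(context, sense_definitions[s], stats))


# Toy corpus and sense definitions (hypothetical data for illustration only).
corpus = [
    ["money", "deposit", "loan"],
    ["money", "loan", "interest"],
    ["river", "water", "shore"],
    ["river", "shore", "fishing"],
]
stats = train_mi(corpus)
senses = {
    "bank_finance": ["money", "loan"],
    "bank_river": ["river", "shore"],
}

print(disambiguate(["deposit", "interest"], senses, stats))
print(disambiguate(["water", "fishing"], senses, stats))
```

Note that neither context shares a word with the sense definitions, so exact matching would score every sense as zero; the MI statistics still separate the two senses because the context words co-occur with the definition words in the corpus.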
Keywords
MI; Sense-Tagged Dictionary; Homonym; WSD;