http://dx.doi.org/10.5762/KAIS.2015.16.11.7718

Semantic Similarity Measures Between Words within a Document using WordNet  

Kang, SeokHoon (Dept. of Embedded Systems Engineering, Incheon National Univ.)
Park, JongMin (Dept. of Embedded Systems Engineering, Incheon National Univ.)
Publication Information
Journal of the Korea Academia-Industrial cooperation Society / v.16, no.11, 2015, pp. 7718-7728
Abstract
Semantic similarity between words is used in many fields, including computational linguistics, artificial intelligence, and information retrieval. In this paper, we present a weighted method for measuring the semantic similarity between words in a document. The method uses edge distance and depth in WordNet, and calculates the semantic similarity between words on the basis of document information, namely word term frequencies (TF) and word concept frequencies (CF). Each word's weight is calculated from its TF and CF in the document. The similarity measure thus combines the edge distance between words, the depth of their subsumer, and the word weights in the document. We compared our scheme with other methods experimentally, and the proposed method outperformed the other similarity measures. Other methods, which are based simply on shortest distance or depth, have difficulty representing or merging such information. By considering shortest distance, depth, and information about the words in the document, the proposed method improves performance.
Keywords
Corpus statistics; Information content; Lexical database; Semantic similarity; WordNet;
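The abstract names three ingredients: edge distance, subsumer depth, and a TF/CF-based word weight. It does not give the formula that combines them, so the following is only a minimal sketch under stated assumptions. The toy taxonomy stands in for WordNet, the base similarity uses the exp/tanh form of Li et al. (2003) with their reported parameters, and the weighting scheme (`word_weights`, the definition of CF, and the final scaling) is hypothetical and should not be read as the authors' method.

```python
import math
from collections import Counter

# Toy is-a taxonomy (child -> parent); an assumed stand-in for WordNet hypernyms.
PARENT = {
    "entity": None,
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "car": "vehicle",
}

def ancestors(word):
    """Return the chain [word, parent, ..., root]."""
    chain = []
    while word is not None:
        chain.append(word)
        word = PARENT[word]
    return chain

def depth(word):
    """Number of edges from the root down to the word."""
    return len(ancestors(word)) - 1

def subsumer(w1, w2):
    """Lowest common ancestor of the two words in the taxonomy."""
    seen = set(ancestors(w1))
    return next(a for a in ancestors(w2) if a in seen)

def edge_distance(w1, w2):
    """Shortest path length between the words, via their subsumer."""
    return depth(w1) + depth(w2) - 2 * depth(subsumer(w1, w2))

def word_weights(tokens):
    """Hypothetical weight: mean of relative TF and relative CF.

    CF is assumed here to be the number of tokens that share the
    word's parent concept; the paper may define it differently.
    """
    n = len(tokens)
    tf = Counter(tokens)
    cf = Counter(PARENT[t] for t in tokens)
    return {w: 0.5 * tf[w] / n + 0.5 * cf[PARENT[w]] / n for w in tf}

def similarity(w1, w2, tokens, alpha=0.2, beta=0.6):
    """Base similarity exp(-alpha*l) * tanh(beta*h), scaled by document weights."""
    l = edge_distance(w1, w2)
    h = depth(subsumer(w1, w2))
    base = math.exp(-alpha * l) * math.tanh(beta * h)
    weights = word_weights(tokens)
    mean_weight = (weights.get(w1, 0.0) + weights.get(w2, 0.0)) / 2
    # Assumed scaling: keeps the result between base/2 and base.
    return base * (1 + mean_weight) / 2

doc = ["dog", "dog", "cat", "car"]
print(similarity("dog", "cat", doc))
print(similarity("dog", "car", doc))
```

In this sketch, "dog" and "cat" share the subsumer "animal" at depth 1, while "dog" and "car" meet only at the root, so the depth term drives the second score to zero. A real implementation would read path lengths and subsumer depths from WordNet itself rather than from a hand-built dictionary.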