An Efficient Index Term Extraction Method in IR using Lexical Chains  

Kang, Bo-Yeong (Dept. of Computer Engineering, Kyungpook National University)
Lee, Sang-Jo (Dept. of Computer Engineering, Kyungpook National University)
Abstract
In information retrieval and digital libraries, one of the most important tasks is finding exactly the information that users need. In this paper, we present an efficient index term extraction method that makes it possible to infer the content of a document and retrieve information more precisely. To identify index terms in a document, we use lexical chains. Before generating the lexical chains, we roughly disambiguate the senses of the nouns in the document using a specific concept called the semantic window: we look ahead at the semantic relations of neighboring nouns and use them to disambiguate each noun's sense. After generating lexical chains from the sense-disambiguated nouns, we identify strong chains using several metrics and extract index terms from a few of the strongest chains. We evaluated our system against the results of a keyphrase extraction system, KEA. The system works on documents from general domains, including information retrieval and digital libraries.
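The pipeline described in the abstract (window-based sense disambiguation, chain building, strong-chain selection, term extraction) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy sense inventory `SENSES`, the rule that two nouns are related when they share a concept label, the window size, and the use of chain length as the strength metric (the abstract only says "some metrics").

```python
from collections import Counter, defaultdict

# Hypothetical toy sense inventory: noun -> candidate concept labels.
# A real system would draw these from a thesaurus or lexical database.
SENSES = {
    "bank": ["finance", "river"],
    "loan": ["finance"],
    "interest": ["finance", "curiosity"],
    "money": ["finance"],
    "river": ["river"],
}


def disambiguate(nouns, window=2):
    """Pick, for each noun, the sense shared by the most nouns in its window."""
    chosen = []
    for i, noun in enumerate(nouns):
        lo, hi = max(0, i - window), min(len(nouns), i + window + 1)
        neighbours = [nouns[j] for j in range(lo, hi) if j != i]

        def support(sense):
            # Number of neighbouring nouns that could carry the same sense.
            return sum(sense in SENSES[n] for n in neighbours)

        # max() keeps the first-listed sense when support is tied.
        chosen.append(max(SENSES[noun], key=support))
    return chosen


def build_chains(nouns, senses):
    """Group noun occurrences into chains keyed by their chosen sense."""
    chains = defaultdict(list)
    for noun, sense in zip(nouns, senses):
        chains[sense].append(noun)
    return dict(chains)


def extract_index_terms(nouns, window=2, n_chains=1, n_terms=3):
    """Return the most frequent nouns from the longest (assumed strongest) chains."""
    chains = build_chains(nouns, disambiguate(nouns, window))
    # Assumed strength metric: chain length (number of noun occurrences).
    strong = sorted(chains.values(), key=len, reverse=True)[:n_chains]
    counts = Counter(n for chain in strong for n in chain)
    # Rank by frequency, breaking ties alphabetically for determinism.
    ranked = sorted(counts, key=lambda n: (-counts[n], n))
    return ranked[:n_terms]


if __name__ == "__main__":
    sample = ["bank", "loan", "interest", "money", "river", "bank", "loan"]
    print(extract_index_terms(sample))
```

On the sample noun sequence, both occurrences of "bank" resolve to the finance sense because more of their neighbours support it, so the finance chain is the strongest and supplies the index terms. Swapping in a different relation test or strength metric only requires changing `support` and the sort key in `extract_index_terms`.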
Keywords
Index Extraction; Lexical Chain; Information Retrieval; Keyword Extraction