[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.3745/KIPSTB.2012.19B.1.037

Query Expansion based on Word Graph using Term Proximity

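Jang, Kye-Hun (Department of Computer Engineering, Chonbuk National University)
Lee, Kyung-Soon (Division of Computer Engineering / Center for Advanced Image and Information Technology, Chonbuk National University)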

Publication Information

The KIPS Transactions: Part B, v.19B, no.1, 2012, pp. 37-42

Abstract

Pseudo-relevance feedback assumes that words occurring frequently in the top-ranked documents are related to the initial query. The main drawback of this term-frequency method, however, is that it relies on feature independence and disregards any dependencies that may exist between words in the text. In this paper, we propose query expansion based on a word graph that uses term proximity, which supplements the term-frequency method. On the TREC WT10g test collection, experimental results in MAP (Mean Average Precision) show that the proposed method achieves a 6.4% improvement over the language model.
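
The abstract does not spell out how the word graph is built or scored, so the following Python sketch is only an illustration of the general idea, not the authors' method. It assumes a sliding co-occurrence window, an inverse-distance edge weight, a fixed boost for edges that touch a query term, and a TextRank-style weighted random walk; the function names, `window`, `query_boost`, and `damping` are all hypothetical choices made for this example.

```python
from collections import defaultdict


def build_word_graph(docs, query_terms, window=5, query_boost=2.0):
    """Build an undirected, weighted co-occurrence graph from tokenized documents.

    Assumed weighting (not taken from the paper): 1 / distance for each pair of
    distinct words inside the window, multiplied by query_boost when either
    word is a query term.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for tokens in docs:
        for i, u in enumerate(tokens):
            for j in range(i + 1, min(i + window + 1, len(tokens))):
                v = tokens[j]
                if u == v:
                    continue
                weight = 1.0 / (j - i)  # closer pairs get a larger weight
                if u in query_terms or v in query_terms:
                    weight *= query_boost
                graph[u][v] += weight
                graph[v][u] += weight
    return graph


def rank_words(graph, damping=0.85, iterations=50):
    """Score nodes with a weighted, PageRank-style random walk."""
    nodes = list(graph)
    if not nodes:
        return {}
    score = {n: 1.0 / len(nodes) for n in nodes}
    out_weight = {n: sum(graph[n].values()) for n in nodes}
    for _ in range(iterations):
        updated = {}
        for n in nodes:
            incoming = sum(
                score[m] * graph[m][n] / out_weight[m] for m in graph[n]
            )
            updated[n] = (1.0 - damping) / len(nodes) + damping * incoming
        score = updated
    return score


def expansion_terms(docs, query_terms, k=3, window=5):
    """Return the k highest-scoring words that are not already in the query."""
    scores = rank_words(build_word_graph(docs, query_terms, window=window))
    candidates = [w for w in scores if w not in query_terms]
    return sorted(candidates, key=lambda w: scores[w], reverse=True)[:k]


if __name__ == "__main__":
    # Toy stand-in for the top-ranked feedback documents.
    top_docs = [
        ["query", "expansion", "uses", "word", "graph"],
        ["word", "graph", "ranks", "expansion", "terms"],
    ]
    print(expansion_terms(top_docs, {"query"}, k=2))
```

In this sketch a word that repeatedly appears close to a query term accumulates heavier edges and therefore a higher walk score, which is one way to capture the word dependencies that a plain frequency count ignores.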

Keywords

Term Proximity; Word Graph; Context Term; Query Expansion; TextRank

References

1	Sakai, T., Manabe, T., Koyama, M. 2005. Flexible Pseudo-Relevance Feedback via Selective Sampling. ACM Transactions on Asian Language Information Processing (TALIP), 4(2), pp.111-135.
2	Lv, Y., Zhai, C.X. 2009. Positional Language Models for Information Retrieval. In Proc. of 32nd ACM SIGIR on Research and Development in Information Retrieval. pp.299-306.
3	Lv, Y., Zhai, C.X. 2010. Positional Relevance Model for Pseudo-Relevance Feedback. In Proc. of 33rd ACM SIGIR on Research and Development in Information Retrieval.
4	Blanco, R., Lioma, C. 2007. Random Walk Term Weighting for Information Retrieval. In Proc. of 30th ACM SIGIR on Research and Development in Information Retrieval.
5	Huang, Y., Sun, L., Nie, J.Y. 2009. Smoothing Document Language Model with Local Word Graph. In Proc. of 18th ACM Conference on Information and Knowledge Management.
6	Mei, Q., Zhang, D., Zhai, C.X. 2008. A General Optimization Framework for Smoothing Language Models on Graph Structures. In Proc. of 31st ACM SIGIR on Research and Development in Information Retrieval.
7	Mihalcea, R., Tarau, P. 2004. TextRank: Bringing Order into Texts. In Proc. of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2004).
8	Zhao, J., Yun, Y. 2009. A Proximity Language Model for Information Retrieval. In Proc. of 32nd ACM SIGIR on Research and Development in Information Retrieval. pp.291-298.
9	Hassan, S., Banea, C. 2006. Random-Walk Term Weighting for Improved Text Classification. In Proc. of TextGraphs: 2nd Workshop on Graph Based Methods for Natural Language Processing. ACL. pp.53-60.
10	Page, L., Brin, S., Motwani, R., Winograd, T. 1998. The PageRank Citation Ranking: Bringing Order to the Web. Unpublished manuscript, Stanford University.
11	Strohman, T., Metzler, D., Turtle, H., Croft, W.B. 2005. Indri: A Language Model-Based Search Engine for Complex Queries. In Proc. of International Conference on Intelligence Analysis. http://www.lemurproject.org
12	Lavrenko, V., Croft, W.B. 2001. Relevance-based Language Models. In Proc. of 24th ACM SIGIR on Research and Development in Information Retrieval. pp.120-127.
13	Collins-Thompson, K., Callan, J. 2007. Estimation and Use of Uncertainty in Pseudo-Relevance Feedback. In Proc. of 30th ACM SIGIR on Research and Development in Information Retrieval. pp.303-310.
14	Ponte, J.M., Croft, W.B. 1998. A Language Modeling Approach to Information Retrieval. In Proc. of 21st ACM SIGIR on Research and Development in Information Retrieval. pp.275-281.
15	Abdul-Jaleel, N., Allan, J., Croft, W.B., Diaz, F., Larkey, L., Li, X., Smucker, M.D., Wade, C. 2004. UMass at TREC 2004: Novelty and HARD. In Proc. of the Thirteenth Text Retrieval Conference (TREC-13). pp.715-725.

Original Korean title: 질의 어휘와의 근접도를 반영한 단어 그래프 기반 질의 확장 (Query Expansion based on Word Graph using Term Proximity)