http://dx.doi.org/10.3745/KIPSTB.2012.19B.1.037

Query Expansion based on Word Graph using Term Proximity  

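Jang, Kye-Hun (Department of Computer Engineering, Chonbuk National University)
Lee, Kyung-Soon (Division of Computer Engineering / Research Center for Advanced Image and Information Technology, Chonbuk National University)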
Abstract
Pseudo-relevance feedback assumes that words occurring frequently in the top-ranked documents are related to the initial query. The main drawback of this term-frequency approach is that it assumes feature independence and disregards any dependencies between words in the text. In this paper, we propose a query expansion method based on a word graph built from term proximity, which supplements the term-frequency method. On the TREC WT10g test collection, the proposed method achieved a 6.4% improvement in mean average precision (MAP) over the language model.
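The general idea of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the window size, the 1/distance edge weight, the damping factor, and the function names `build_word_graph`, `rank_terms`, and `expand_query` are all assumptions made for illustration. The sketch links terms that occur near each other in the feedback documents, scores them with a weighted PageRank in the style of TextRank, and appends the highest-scoring non-query terms to the query.

```python
from collections import defaultdict


def build_word_graph(docs, window=3):
    """Build an undirected weighted graph over terms.

    docs is a list of token lists (the top-ranked feedback documents).
    Terms within `window` positions are linked; closer pairs get heavier
    edges. The 1/distance weight is an assumption, not the paper's formula.
    """
    graph = defaultdict(lambda: defaultdict(float))
    for tokens in docs:
        for i, left in enumerate(tokens):
            for j in range(i + 1, min(i + window + 1, len(tokens))):
                right = tokens[j]
                if left == right:
                    continue
                weight = 1.0 / (j - i)
                graph[left][right] += weight
                graph[right][left] += weight
    return graph


def rank_terms(graph, damping=0.85, iterations=50):
    """Score terms with a weighted PageRank (TextRank-style) iteration."""
    nodes = list(graph)
    scores = {node: 1.0 / len(nodes) for node in nodes}
    out_weight = {node: sum(graph[node].values()) for node in nodes}
    for _ in range(iterations):
        new_scores = {}
        for node in nodes:
            # Each neighbour passes on its score in proportion to edge weight.
            incoming = sum(
                scores[nbr] * graph[nbr][node] / out_weight[nbr]
                for nbr in graph[node]
            )
            new_scores[node] = (1.0 - damping) / len(nodes) + damping * incoming
        scores = new_scores
    return scores


def expand_query(query_terms, docs, k=2, window=3):
    """Append the k best-scoring non-query terms to the original query."""
    scores = rank_terms(build_word_graph(docs, window))
    query_set = set(query_terms)
    candidates = [term for term in scores if term not in query_set]
    candidates.sort(key=lambda term: (-scores[term], term))
    return list(query_terms) + candidates[:k]
```

Calling `expand_query(["graph"], feedback_docs, k=5)` with tokenized feedback documents returns the original query followed by up to five expansion terms. Unlike a pure term-frequency ranking, a term scores highly here only if it is well connected to other terms through close co-occurrence.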
Keywords
Term Proximity; Word Graph; Context Term; Query Expansion; TextRank;