MetaSearch for Entry Page Finding Task

Metasearch for Entry Page Search

  • In-Ho Kang (Samsung Advanced Institute of Technology, Computing LAB)
  • Published : 2005.04.01


In this paper, a MetaSearch algorithm for navigational queries is presented. Previous MetaSearch algorithms focused on informational queries and gave a high score to overlapped documents, that is, documents returned by several search engines. However, overemphasizing overlapped documents may degrade the performance of a MetaSearch algorithm on navigational queries. If many result documents come from a certain domain or directory, we can instead assume that the domain or directory is important. Various experiments are conducted to show the effectiveness of the overlap of domain and directory names, using system results from TREC and from commercial search engines. In the experiments, the overlap of documents performed better for informational queries, whereas the overlap of domain names and directory names yielded $10\%$ higher performance for navigational queries.

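This study proposes a metasearch method for entry page finding, in which a user looks for a particular place on the Web that they want to visit. Whereas metasearch in previous work emphasizes duplicate documents that appear in the results of many search engines, this study extends the notion of document overlap: documents that come from the same domain or directory are also assumed to overlap, and this extended overlap is used for metasearch. The usefulness of metasearch with extended overlap is tested using the results of systems submitted to TREC and the results of commercial search engines. The experiments showed that the conventional approach based on simple document overlap is useful for content-based search, whereas for entry page finding the extended overlap approach proposed here improves performance by more than $10\%$ over the conventional approach.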
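The idea of extending overlap from documents to domains and directories can be illustrated with a small sketch. The abstract does not give the scoring formula, so everything below is an assumption made for illustration: the function names (`domain_of`, `directory_of`, `fuse`), the weights, and the rule of counting how many engines returned a given URL, domain, or directory are hypothetical and are not the method evaluated in the paper.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def domain_of(url):
    """Host part of a URL, lower-cased."""
    return urlsplit(url).netloc.lower()


def directory_of(url):
    """Host plus the path up to (and including) the last slash."""
    parts = urlsplit(url)
    path = parts.path.rsplit("/", 1)[0]
    return parts.netloc.lower() + path + "/"


def fuse(result_lists, w_doc=1.0, w_domain=0.0, w_dir=0.0):
    """Merge result lists from several engines into one ranking.

    Hypothetical scoring: a URL's score is a weighted sum of the number
    of engines that returned the URL itself, any URL in the same domain,
    and any URL in the same directory.
    """
    doc_votes = defaultdict(set)
    domain_votes = defaultdict(set)
    dir_votes = defaultdict(set)
    for engine_id, urls in enumerate(result_lists):
        for url in urls:
            doc_votes[url].add(engine_id)
            domain_votes[domain_of(url)].add(engine_id)
            dir_votes[directory_of(url)].add(engine_id)

    def score(url):
        return (w_doc * len(doc_votes[url])
                + w_domain * len(domain_votes[domain_of(url)])
                + w_dir * len(dir_votes[directory_of(url)]))

    # Sort by descending score; ties are broken by URL for a stable order.
    return sorted(doc_votes, key=lambda u: (-score(u), u))


# Toy result lists from three hypothetical engines.
engine_results = [
    ["http://a.example/index.html", "http://b.example/page.html"],
    ["http://a.example/about.html", "http://b.example/page.html"],
    ["http://a.example/", "http://c.example/x.html"],
]

by_document = fuse(engine_results)
by_extended = fuse(engine_results, w_doc=1.0, w_domain=1.0, w_dir=1.0)
```

In this toy input, document overlap alone favors the page that two engines returned verbatim, while adding domain and directory overlap moves up pages from the site that all three engines point to, even though no single URL from that site was returned twice.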


