http://dx.doi.org/10.3745/KIPSTA.2010.17A.5.229

Dynamic Management of Equi-Join Results for Multi-Keyword Searches  

Lim, Sung-Chae (Department of Computer Science, Dongduk Women's University)
Abstract
As the number of documents on the Internet and within enterprises grows, efficiently supporting user queries over those documents becomes crucial. Full-text search is the generally accepted technique in this setting, because it can answer unrestricted ad hoc queries by automatically indexing every keyword found in the documents. However, the index files built for full-text search grow with the number of indexed documents, so the disk cost of processing multi-keyword queries against such large index files may become prohibitive. To solve this problem, we propose both an index file structure and a management scheme suited to processing multi-keyword queries against a large volume of index files. We adopt inverted files, which are widely used in multi-keyword searches, as the basic index structure and modify them into a hierarchical structure that supports the join and ranking operations performed during query processing. To reduce disk costs with this index structure, we dynamically store in main memory the results of join operations between two keywords when that pair is highly likely to be entered in users' queries. We also compare performance using a disk cost model to show the advantage of the proposed scheme.
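The core idea of the abstract, joining the inverted-file entries of two keywords and keeping the join result in main memory for pairs that are queried often, can be illustrated with a minimal sketch. This is not the paper's scheme: the hierarchical index structure, the ranking step, and the actual rule for choosing which pairs to keep are not given in the abstract. The class name `PairJoinCache`, the `threshold` parameter, and the count-based admission rule are assumptions made only for illustration.

```python
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Map each keyword to a sorted list of IDs of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return {word: sorted(ids) for word, ids in index.items()}


def equi_join(a, b):
    """Merge-join two sorted posting lists on document ID."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


class PairJoinCache:
    """Keeps join results in memory for keyword pairs that are queried often.

    Assumption: a pair is admitted once it has been seen `threshold` times.
    The paper's real selection policy is not described in the abstract.
    """

    def __init__(self, index, threshold=2):
        self.index = index
        self.threshold = threshold      # hypothetical admission rule
        self.pair_counts = Counter()    # how often each pair was queried
        self.cached = {}                # pair -> stored join result
        self.hits = 0                   # queries answered from memory

    def search(self, kw1, kw2):
        # Normalize the pair so (a, b) and (b, a) share one cache entry.
        pair = tuple(sorted((kw1.lower(), kw2.lower())))
        self.pair_counts[pair] += 1
        if pair in self.cached:
            self.hits += 1
            return self.cached[pair]
        # Cache miss: a real system would read both posting lists from disk.
        result = equi_join(self.index.get(pair[0], []),
                           self.index.get(pair[1], []))
        if self.pair_counts[pair] >= self.threshold:
            self.cached[pair] = result
        return result


docs = {
    1: "web search engine",
    2: "inverted index search",
    3: "web index cache",
    4: "web search cache",
}
cache = PairJoinCache(build_inverted_index(docs), threshold=2)
for _ in range(3):
    matches = cache.search("web", "search")
```

In this sketch the first two queries for the same pair run the join, and the third is answered from memory. In a disk-based system, each answer served from memory would avoid reading two posting lists from the index files, which is the saving the abstract targets.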
Keywords
Search Engine; Web Searches; Inverted Files; Index Files