A Popularity-driven Cache Management and its Performance Evaluation in Meta-search Engines

메타 검색 엔진을 위한 인기도 기반 캐쉬 관리 및 성능 평가

  • Published : 2002.04.01

Abstract

Caching in meta-search engines can reduce the response time for user requests. We describe the cache scheme of our meta-search engine in terms of its architecture and operational flow. In particular, we propose a popularity-driven cache replacement algorithm that uses query popularities to decide which cached data to purge. The popularity of a query is its normalized occurrence frequency among user queries. This paper presents how popular queries are collected and how query popularities are calculated. We carried out an empirical performance evaluation on a collection of real data, comparing popularity-driven caching against the traditional schemes, least recently used (LRU) and least frequently used (LFU). In almost all cases, the proposed replacement policy outperforms LRU and LFU.

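Using a cache in a meta-search engine can improve the response time experienced by users. This paper presents the architecture and operation of a meta-search engine and proposes a new popularity-based cache replacement method for meta-search engines. Popularity is the normalized occurrence frequency of the terms that users submit to search engines, and it serves as the criterion for cache replacement. The paper describes how popular queries are collected and how popularity is computed, and proposes a new algorithm based on popularity. Using data that real users entered into search engines, the proposed algorithm was evaluated against the traditional cache replacement schemes LRU and LFU. In this evaluation, the proposed algorithm performed better in the majority of cases.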
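
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the normalization formula, so dividing each query's count by the highest count is an assumption, and the names query_popularities and PopularityCache are hypothetical. The sketch computes popularities from a query log and, when the cache is full, purges the entry whose query has the lowest popularity.

```python
from collections import Counter


def query_popularities(query_log):
    """Map each query to its occurrence count divided by the highest count.

    Assumption: the abstract only says the frequency is "normalized";
    scaling by the maximum count is one plausible choice.
    """
    counts = Counter(query_log)
    if not counts:
        return {}
    top = max(counts.values())
    return {query: count / top for query, count in counts.items()}


class PopularityCache:
    """Result cache that purges the least popular query when full (hypothetical)."""

    def __init__(self, capacity, popularities):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.popularities = popularities  # query -> popularity in [0, 1]
        self.entries = {}                 # query -> cached search results

    def get(self, query):
        """Return the cached results, or None on a cache miss."""
        return self.entries.get(query)

    def put(self, query, results):
        """Store results, evicting the lowest-popularity entry if needed."""
        if query not in self.entries and len(self.entries) >= self.capacity:
            # Queries missing from the popularity table count as 0.0.
            victim = min(
                self.entries,
                key=lambda q: self.popularities.get(q, 0.0),
            )
            del self.entries[victim]
        self.entries[query] = results


# Usage: a two-slot cache driven by a small, made-up query log.
log = ["mp3", "mp3", "mp3", "news", "news", "weather"]
popularities = query_popularities(log)
cache = PopularityCache(capacity=2, popularities=popularities)
cache.put("mp3", ["result A"])
cache.put("weather", ["result B"])
cache.put("news", ["result C"])  # cache is full, so "weather" is purged
```

Unlike LRU or LFU, the eviction decision here depends on a popularity table built outside the cache, not on the cache's own access history.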
