A Cache Manager for Enhancing the Performance of Query Evaluation in Data Warehousing Environment

데이타웨어하우스 환경에서의 질의 처리 성능 향상을 위한 캐시 관리자

  • Junho Shim (심준호), Division of Information Science, Sookmyung Women's University
  • Published: 2003.08.01

Abstract

Data warehouses are usually dedicated to processing queries issued by decision support systems (DSS). The response time of DSS queries is typically several orders of magnitude higher than that of OLTP queries. Since DSS queries are often submitted interactively, techniques for reducing their response time are important. Caching query results is one such technique, and it is particularly well suited to the DSS environment. In this paper, we present a cache manager for such an environment. Specifically, we define a canonical form of query. The cache manager looks up a query by exact query match if the query is in non-canonical form, or by a proposed query split process if it is in canonical form. It dynamically maintains the cache content by employing a profit function that reflects, in an integrated manner, the query execution cost, the size of the query result, the reference rate, the maintenance cost of each result due to updates of its base tables, and the frequency of such updates. Our experimental evaluation shows the performance benefit of the cache manager.

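Data warehouses are used to process the queries of decision support systems, and the response time of decision support queries is typically tens of times longer, or more, than that of OLTP queries. Because decisions usually have to be made quickly, techniques that shorten the response time of decision support queries are important. This paper presents a technique that caches the results of earlier queries and uses them to process a given query. To this end, we first examine whether decision support systems provide an environment suited to this technique. Next, since it is impossible to handle every query of arbitrary form, we define the canonical form, which is the form of query we address. If a query does not follow the canonical form, simple string matching is used; if it is in canonical form, the cached query result is located through a query transformation process called query split and through a query dependency graph, and the query is then executed over that result. The cache manager must maintain the cache so as to minimize query response time. For this purpose, we propose a dynamic cache replacement technique that takes into account the query execution cost, the size of the query result, the reference rate, the update rate of the base tables, and the resulting maintenance cost of the query results. The proposed technique was validated experimentally.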
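
The profit-driven cache maintenance described in the abstract can be illustrated with a minimal sketch. The abstract names the inputs to the profit function but does not give its formula, so the form used here (net cost saved per unit time, divided by result size) is an assumption made for illustration only, and the names `CachedResult`, `profit`, and `choose_victims` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CachedResult:
    query: str          # query text used as the cache key
    exec_cost: float    # cost of recomputing the result from the base tables
    size: float         # cache space occupied by the result
    ref_rate: float     # estimated references per unit time
    maint_cost: float   # cost of refreshing the result after one base-table update
    update_rate: float  # estimated base-table updates per unit time


def profit(r: CachedResult) -> float:
    """Assumed profit: (cost saved by hits - maintenance cost) per unit of space."""
    return (r.ref_rate * r.exec_cost - r.update_rate * r.maint_cost) / r.size


def choose_victims(cache, needed):
    """Pick the lowest-profit results until at least `needed` space is freed."""
    victims, freed = [], 0.0
    for r in sorted(cache, key=profit):
        if freed >= needed:
            break
        victims.append(r)
        freed += r.size
    return victims


# Example values are invented for illustration.
cache = [
    CachedResult("Q1", exec_cost=100, size=10, ref_rate=0.5, maint_cost=20, update_rate=0.1),
    CachedResult("Q2", exec_cost=40, size=20, ref_rate=0.2, maint_cost=30, update_rate=0.2),
    CachedResult("Q3", exec_cost=300, size=50, ref_rate=0.1, maint_cost=10, update_rate=0.5),
]
print([r.query for r in choose_victims(cache, needed=60)])
```

Under this assumed formula, a result that is expensive to recompute and frequently referenced stays in the cache, while a large result whose base tables change often is evicted first.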
