Materialized View Selection Algorithm using Clustering Technique in Data Warehouse

  • 양진혁 (Department of Computer Science, Graduate School, Korea University) ;
  • 정인정 (Department of Computer Science, Korea University)
  • Published: 2000.08.01

Abstract

Selecting the right views to materialize in a data warehouse is crucial for obtaining accurate and fast responses to analytical queries. Traditional view selection algorithms consider whole relations as candidates for materialized views. However, materializing whole relations rather than parts of them leads to poor performance in terms of both time and space cost. To overcome this problem, we propose an improved algorithm for selecting views to materialize. The proposed algorithm, ASVMRT (Algorithm for Selection of Views to Materialize using Reduced Table), first generates reduced tables in the data warehouse using automatic clustering based on the density of attribute values, and then considers combinations of reduced tables, instead of combinations of the original base relations, as materialized views. To validate the proposed algorithm, we present experimental results showing that it is approximately 1.8 times better than conventional algorithms in both time and space cost.
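The abstract does not spell out how the reduced tables are built, so the following is only a minimal sketch of the general idea of keeping the dense part of a relation instead of the whole relation. The function name `reduce_table`, the definition of density as the fraction of rows sharing an attribute value, and the `min_density` threshold are all assumptions made for illustration; they are not taken from the paper and should not be read as ASVMRT itself.

```python
from collections import Counter


def reduce_table(rows, attribute, min_density):
    """Return the rows whose value for `attribute` is dense enough.

    Assumption: "density" is the fraction of rows that share a value.
    The paper's actual clustering criterion may differ.
    """
    if not rows:
        return []
    counts = Counter(row[attribute] for row in rows)
    total = len(rows)
    # Keep only attribute values that cover at least `min_density` of the rows.
    dense_values = {v for v, c in counts.items() if c / total >= min_density}
    return [row for row in rows if row[attribute] in dense_values]


# Hypothetical base relation used only for this illustration.
sales = [
    {"region": "east", "amount": 10},
    {"region": "east", "amount": 20},
    {"region": "east", "amount": 30},
    {"region": "west", "amount": 40},
]

# The reduced table keeps the dense "east" rows and drops the sparse "west" row.
reduced = reduce_table(sales, "region", min_density=0.5)
```

Under these assumptions, a view selection step would then evaluate combinations of such reduced tables as candidate materialized views, which is where the time and space savings over whole relations would come from.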
