
Dense Sub-Cube Extraction Algorithm for a Multidimensional Large Sparse Data Cube  

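Lee Seok-Lyong (Hankuk University of Foreign Studies, School of Industrial and Information Systems Engineering)
Chun Seok-Ju (Seoul National University, Department of Computer Education)
Chung Chin-Wan (Korea Advanced Institute of Science and Technology, Department of Electrical Engineering and Computer Science, and Image Information Research Center)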
Abstract
A data warehouse is a repository that lets users store large volumes of data and analyze them effectively. This research investigates how to build a multidimensional data cube, a powerful tool for analyzing the contents of data warehouses and databases. Because such cubes are typically sparse, retrieval from them incurs unavoidable overhead. In this paper, we propose a dense sub-cube extraction algorithm that identifies dense regions in a large sparse data cube and constructs sub-cubes from the regions found. Retrieving these small dense sub-cubes instead of scanning the large sparse cube reduces the retrieval overhead remarkably. The algorithm uses bitmap- and histogram-based techniques to extract the dense sub-cubes, and its effectiveness is demonstrated experimentally.
Keywords
Data Cube; Data Warehouse;
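
The abstract names the ingredients (an occupancy bitmap and per-dimension histograms) but not the procedure, so the following is only a minimal sketch of one way dense regions could be found with them, not the authors' algorithm. All names (`dense_intervals`, `extract_dense_subcubes`, `min_count`, `min_density`) are hypothetical, and a Python set of occupied coordinates stands in for the bitmap.

```python
from itertools import product


def dense_intervals(hist, min_count):
    """Return maximal runs [start, end) of consecutive bins with count >= min_count."""
    runs, start = [], None
    for i, count in enumerate(hist):
        if count >= min_count:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(hist)))
    return runs


def extract_dense_subcubes(cells, shape, min_count, min_density):
    """Find candidate dense sub-cubes in a sparse cube (illustrative assumption).

    cells: set of coordinate tuples of non-empty cells (plays the bitmap role).
    shape: extent of the cube along each dimension.
    Returns a list of boxes, each a tuple of (low, high) half-open ranges.
    """
    # One histogram per dimension: occupied cells in each slice of that dimension.
    hists = [[0] * extent for extent in shape]
    for cell in cells:
        for dim, coord in enumerate(cell):
            hists[dim][coord] += 1

    # Candidates are cross products of the dense intervals of every dimension.
    per_dim = [dense_intervals(hist, min_count) for hist in hists]

    result = []
    for box in product(*per_dim):
        volume = 1
        for low, high in box:
            volume *= high - low
        # Count occupied cells that fall inside the candidate box.
        occupied = sum(
            all(low <= coord < high for coord, (low, high) in zip(cell, box))
            for cell in cells
        )
        # Keep the candidate only if it is dense enough.
        if occupied / volume >= min_density:
            result.append(box)
    return result


# Two fully occupied 3x3 blocks plus two stray cells in an 8x8 cube.
cells = {(r, c) for r in range(3) for c in range(3)}
cells |= {(r, c) for r in range(5, 8) for c in range(5, 8)}
cells |= {(4, 4), (0, 6)}
print(extract_dense_subcubes(cells, (8, 8), min_count=2, min_density=0.5))
# -> [((0, 3), (0, 3)), ((5, 8), (5, 8))]
```

In this sketch the histograms prune the search to a few candidate boxes, and the density check against the occupied-cell set discards candidates such as the box containing only the stray cell (0, 6). A query would then read the two small boxes rather than scan all 64 cells.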