
SPEC: Space Efficient Cubes for Data Warehouses  

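Chun Seok-Ju (Department of Computer Education, Seoul National University of Education)
Lee Seok-Lyong (School of Industrial and Information Systems Engineering, Hankuk University of Foreign Studies)
Kang Heum-Geun (Division of Computer and Information Communication, Woosong Technical College)
Chung Chin-Wan (Department of Computer Science, Korea Advanced Institute of Science and Technology)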
Abstract
An aggregation query computes aggregate information over a data cube in the query range specified by a user. Existing methods based on the prefix-sum approach use an additional cube, called the prefix-sum cube (PC), to store the cumulative sums of data, which causes a high space overhead. This space overhead not only leads to extra costs for storage devices, but also causes additional propagation of updates and longer access times on physical devices. In this paper, we propose a new prefix-sum cube called 'SPEC', which drastically reduces the space of the PC in a large data warehouse. The SPEC also decreases the update propagation caused by the dependency between values in cells of the PC. We develop an effective algorithm which finds dense sub-cubes from a large data cube. We perform extensive experiments with respect to various dimensions of the data cube and query sizes, and examine the effectiveness and performance of our proposed method. Experimental results show that the SPEC significantly reduces the space of the PC while maintaining a reasonable query performance.
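As a minimal sketch of the general prefix-sum approach the abstract refers to (not of SPEC itself, whose construction is not described here), the following two-dimensional example builds a PC, answers a range-sum query by inclusion-exclusion, and counts how many PC cells depend on a single data cell. The function names and the sample cube are hypothetical and chosen only for illustration.

```python
def build_prefix_sum(cube):
    """Return pc where pc[i][j] is the sum of cube[0..i][0..j] (inclusive)."""
    rows, cols = len(cube), len(cube[0])
    pc = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        row_total = 0  # running sum of the current row up to column j
        for j in range(cols):
            row_total += cube[i][j]
            pc[i][j] = row_total + (pc[i - 1][j] if i > 0 else 0)
    return pc


def range_sum(pc, lo, hi):
    """Sum of the original cells in the inclusive box lo=(r1, c1) .. hi=(r2, c2)."""
    (r1, c1), (r2, c2) = lo, hi

    def p(i, j):
        # Prefix sums outside the cube (index -1) count as zero.
        return pc[i][j] if i >= 0 and j >= 0 else 0

    # Inclusion-exclusion over the four corners of the query box.
    return p(r2, c2) - p(r1 - 1, c2) - p(r2, c1 - 1) + p(r1 - 1, c1 - 1)


def cells_touched_by_update(shape, cell):
    """Number of PC cells whose value depends on the given data cell."""
    rows, cols = shape
    r, c = cell
    return (rows - r) * (cols - c)


# Hypothetical 3x3 data cube used only for illustration.
cube = [[3, 5, 1],
        [7, 3, 2],
        [2, 4, 2]]
pc = build_prefix_sum(cube)
total = range_sum(pc, (1, 1), (2, 2))           # 3 + 2 + 4 + 2
worst = cells_touched_by_update((3, 3), (0, 0))  # every PC cell
```

In this sketch the PC has exactly as many cells as the data cube, even when most data cells are empty, and changing the cell at the origin forces every PC cell to be rewritten. These are the space overhead and the update propagation that the abstract attributes to the PC.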
Keywords
Data warehouses; Data cube; Prefix-sum cube; OLAP; Range-sum query; Clustering;