http://dx.doi.org/10.3745/KIPSTD.2007.14-D.6.597

Efficient Computation of Stream Cubes Using AVL Trees  

Kim, Ji-Hyun (Department of Computer Science, Ewha Womans University)
Kim, Myung (Department of Computer Science, Ewha Womans University)
Abstract
Stream data is a continuous flow of information that typically arrives as a rapid, unbounded stream. Researchers have recently shown great interest in analyzing such data to obtain value-added information. Here, we propose an efficient cube computation algorithm for multidimensional analysis of stream data. Because stream data arrives unsorted and aggregation results can be obtained only after the last data item has been read, cube computation requires a tremendous amount of memory. To resolve these difficulties, we compute only user-selected aggregation tables and use a combination of an array and AVL trees as temporary storage for the aggregation tables. The proposed cube computation algorithm works even when main memory is not large enough to store all the aggregation tables during the computation. We show, through theoretical analysis and performance evaluation, that the proposed algorithm is fast enough for practical use.
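The storage idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not say how the array and the AVL trees are combined, so the layout here is an assumption: each user-selected aggregation table is an array indexed by its first grouping dimension, and each array slot holds an AVL tree keyed by the remaining grouping dimensions. The names `AggregationTable`, `compute_cube`, and `insert` are hypothetical, dimension values are assumed to be small dictionary-encoded integers, the aggregate is assumed to be SUM, and the handling of tables that do not fit in main memory is omitted.

```python
class Node:
    """AVL node holding a running SUM for one group-by key."""
    def __init__(self, key, value):
        self.key, self.total = key, value
        self.left = self.right = None
        self.height = 1

def height(n):
    return n.height if n else 0

def update(n):
    n.height = 1 + max(height(n.left), height(n.right))

def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    update(y); update(x)
    return x

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    update(x); update(y)
    return y

def insert(n, key, value):
    """Add value to the total stored under key; return the new subtree root."""
    if n is None:
        return Node(key, value)
    if key == n.key:
        n.total += value          # existing group: aggregate in place
        return n
    if key < n.key:
        n.left = insert(n.left, key, value)
    else:
        n.right = insert(n.right, key, value)
    update(n)
    balance = height(n.left) - height(n.right)
    if balance > 1:               # left-heavy
        if height(n.left.left) < height(n.left.right):
            n.left = rotate_left(n.left)
        return rotate_right(n)
    if balance < -1:              # right-heavy
        if height(n.right.right) < height(n.right.left):
            n.right = rotate_right(n.right)
        return rotate_left(n)
    return n

def in_order(n):
    if n:
        yield from in_order(n.left)
        yield n.key, n.total
        yield from in_order(n.right)

class AggregationTable:
    """Assumed layout: array over the first grouping dimension,
    one AVL tree per slot keyed by the remaining grouping dimensions."""
    def __init__(self, dims, first_dim_size):
        self.dims = dims                      # tuple of dimension indices
        self.slots = [None] * first_dim_size  # one AVL root per value

    def add(self, record, measure):
        first = record[self.dims[0]]
        rest = tuple(record[d] for d in self.dims[1:])
        self.slots[first] = insert(self.slots[first], rest, measure)

    def rows(self):
        for first, root in enumerate(self.slots):
            for rest, total in in_order(root):
                yield (first, *rest), total

def compute_cube(stream, tables):
    """Single pass over unsorted records; results are read only at the end."""
    for *record, measure in stream:
        for table in tables:
            table.add(record, measure)
    return {table.dims: list(table.rows()) for table in tables}

# Hypothetical stream: three dimensions followed by one measure.
stream = [(0, 2, 1, 10), (1, 0, 1, 5), (0, 2, 1, 7), (0, 1, 0, 3)]
tables = [AggregationTable((0, 1), 2), AggregationTable((2,), 2)]
result = compute_cube(stream, tables)
```

In this sketch the array gives constant-time access on one dimension, while the balanced tree keeps each update logarithmic in the number of distinct groups under that slot, regardless of arrival order. Whether the published algorithm divides the work this way is not stated in the abstract.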
Keywords
Stream Data; Cube Computation; Blocking Operator; Data Aggregation