
Iceberg Query Evaluation Technique Using a Cuboid Prefix Tree

Han, Sang-Gil (LG Electronics)
Yang, Woo-Sock (Department of Computer Science, Yonsei University)
Lee, Won-Suk (Department of Computer Science, Yonsei University)
Abstract
A data stream is a massive, unbounded sequence of data elements generated continuously at a rapid rate. Because of these characteristics, it is impossible to store every data element of a stream, so a synopsis structure is needed to hold summary information about it. For this purpose, this paper proposes a cuboid prefix tree that can be employed effectively in evaluating an iceberg query over data streams. A cuboid prefix tree stores only those itemsets that consist of the grouping attributes used in a GROUP BY query. In addition, it can evaluate multiple iceberg queries simultaneously by sharing their common sub-expressions. A series of experiments verifies that a cuboid prefix tree evaluates an iceberg query over an infinitely generated data stream while reducing memory usage and processing time.
Keywords
Iceberg query; Data stream; Data mining; Cuboid prefix tree; Aggregation
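
As a rough illustration of the idea in the abstract, the following Python sketch keeps counts only for the grouping attributes of the registered GROUP BY queries and lets queries with a common attribute prefix share tree nodes. It is not the authors' implementation: the class name `CuboidPrefixTreeSketch`, the `attr_order` parameter, the `iceberg` method, and the sample records are all assumptions made for this example. The paper's actual node layout, pruning, and memory management are not described in the abstract and are not modeled here, so this sketch counts exactly and never discards nodes.

```python
class Node:
    """One tree node: a count plus children keyed by (attribute, value)."""

    def __init__(self):
        self.count = 0
        self.children = {}


class CuboidPrefixTreeSketch:
    """Hypothetical sketch: exact counts for several GROUP BY queries at once."""

    def __init__(self, queries, attr_order):
        # A fixed global attribute order lets queries such as (src) and
        # (src, dst) walk the same path and share the node for src.
        self.rank = {attr: i for i, attr in enumerate(attr_order)}
        self.queries = sorted({self._normalize(q) for q in queries})
        self.root = Node()

    def _normalize(self, attrs):
        return tuple(sorted(attrs, key=self.rank.__getitem__))

    def insert(self, record):
        # Only grouping attributes are read; other fields are never stored.
        for attrs in self.queries:
            node = self.root
            for attr in attrs:
                node = node.children.setdefault((attr, record[attr]), Node())
            node.count += 1  # count only at the node that ends this query's path

    def iceberg(self, group_by, threshold):
        """Return {group values: count} for groups with count >= threshold.

        Group values are reported in the global attribute order.
        """
        attrs = self._normalize(group_by)
        results = {}

        def walk(node, depth, values):
            if depth == len(attrs):
                if node.count >= threshold:
                    results[values] = node.count
                return
            for (attr, value), child in node.children.items():
                if attr == attrs[depth]:
                    walk(child, depth + 1, values + (value,))

        walk(self.root, 0, ())
        return results


# Two registered queries: GROUP BY src and GROUP BY src, dst.
tree = CuboidPrefixTreeSketch(
    queries=[("src",), ("src", "dst")],
    attr_order=["src", "dst", "port"],
)
stream = [
    {"src": "a", "dst": "x", "port": 80},
    {"src": "a", "dst": "x", "port": 443},
    {"src": "a", "dst": "y", "port": 80},
    {"src": "b", "dst": "x", "port": 80},
]
for record in stream:
    tree.insert(record)

by_src = tree.iceberg(("src",), threshold=3)
by_src_dst = tree.iceberg(("src", "dst"), threshold=2)
```

In this sketch `by_src` holds the groups of `GROUP BY src` that appear at least three times and `by_src_dst` those of `GROUP BY src, dst` that appear at least twice. The `port` field is never stored because no registered query groups on it.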