Predictive Memory Allocation over Skewed Streams

  • Published : 2009.06.30

Abstract

Adaptive memory management is a serious issue in data stream management. Data streams differ from the traditional stored relational model in several respects: streams arrive online, are high in volume, and have skewed data distributions. Data skew is a common property of massive data streams. We propose a predictive allocation strategy, which uses predictive processing to cope with time-varying data skew. This processing includes memory usage estimation and timestamp-based indexing. Our experimental study shows that the predictive strategy reduces both the required memory space and the latency for data whose skew varies over time.
