Load Shedding for Temporal Queries over Data Streams

  • Received : 2011.05.08
  • Accepted : 2011.11.09
  • Published : 2011.12.30

Abstract

Enhancing continuous queries over data streams with temporal functions and predicates enriches the expressive power of those queries. While traditional continuous queries retrieve only the values of attributes, temporal continuous queries also retrieve the valid time intervals of those values. Correctly evaluating such queries requires coalescing the adjacent timestamps of value-equivalent tuples before evaluating temporal functions and predicates. For many stream applications, the available computing resources may be too limited to produce exact query results. This limitation is commonly addressed through load shedding, which yields approximate query results. Many load shedding mechanisms have been proposed, but the presence of coalescing makes these existing methods unsuitable for temporal continuous queries. In this paper, we propose a new accuracy metric and a load shedding algorithm that are suitable for temporal query processing when memory is insufficient. The accuracy metric combines the Jaccard coefficient, which measures the accuracy of attribute values, with $\mathcal{PQI}$ interval orders, which measure the accuracy of the valid time intervals in the approximate query result. The algorithm employs a greedy strategy that combines two objectives reflecting the two accuracy metrics (i.e., value and interval). In the performance study, the proposed greedy algorithm outperforms a conventional random load shedding algorithm by up to an order of magnitude in achieved accuracy.
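The two ingredients named in the abstract, coalescing of value-equivalent tuples and the Jaccard coefficient over attribute values, can be illustrated with a minimal Python sketch. This is not the paper's algorithm or its accuracy metric: the tuple layout `(value, start, end)`, the half-open interval convention, and the function names `coalesce` and `jaccard` are assumptions made here for illustration, and the $\mathcal{PQI}$ interval-order component is not modeled.

```python
from typing import Iterable, List, Set, Tuple

# Assumed layout (hypothetical): (value, start, end) with a half-open
# valid-time interval [start, end).
Row = Tuple[str, int, int]


def coalesce(rows: Iterable[Row]) -> List[Row]:
    """Merge value-equivalent rows whose intervals overlap or are adjacent."""
    merged: List[Row] = []
    # Sorting groups equal values together and orders them by start time.
    for value, start, end in sorted(rows):
        if merged and merged[-1][0] == value and start <= merged[-1][2]:
            prev_value, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_value, prev_start, max(prev_end, end))
        else:
            merged.append((value, start, end))
    return merged


def jaccard(exact: Set[str], approx: Set[str]) -> float:
    """Jaccard coefficient |A ∩ B| / |A ∪ B|; two empty sets count as identical."""
    if not exact and not approx:
        return 1.0
    return len(exact & approx) / len(exact | approx)


# Exact input, and the same input after shedding the tuple ("a", 3, 5).
exact_rows = [("a", 1, 3), ("a", 3, 5), ("b", 2, 4), ("a", 7, 9)]
shed_rows = [("a", 1, 3), ("b", 2, 4), ("a", 7, 9)]

exact_result = coalesce(exact_rows)
approx_result = coalesce(shed_rows)

# Value-only accuracy: compare the sets of attribute values in each result.
value_accuracy = jaccard({r[0] for r in exact_result},
                         {r[0] for r in approx_result})
```

In this example the exact result contains `("a", 1, 5)`, while the shed result contains only `("a", 1, 3)`, yet the value-only Jaccard score is still 1.0 because both results contain the values `a` and `b`. This is one way to see why a value measure alone cannot capture the damage that shedding does to coalesced intervals, and why the abstract pairs it with an interval measure.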

