Linear Resource Sharing Method for Query Optimization of Sliding Window Aggregates in Multiple Continuous Queries

다중 연속질의에서 슬라이딩 윈도우 집계질의 최적화를 위한 선형 자원공유 기법

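  • 백성하 (Department of Computer Science and Information Engineering, Inha University)
  • 유병섭 (Department of Computer Science and Information Engineering, Inha University)
  • 조숙경 (Department of Computer Science and Information Engineering, Inha University)
  • 배해영 (Department of Computer Science and Information Engineering, Inha University)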
  • Published : 2006.11.15

Abstract

A stream processor uses resource sharing to make efficient use of limited resources when evaluating multiple continuous queries. Existing methods process aggregate queries by maintaining a hierarchical (level) structure. As a result, each insert operation incurs the cost of reconstructing the level structure, and each search operation incurs the cost of locating the aggregate information that belongs to each distinct sliding window size. This paper therefore uses a linear data structure to speed up the processing of sliding window aggregates. The proposed method consists of three phases applied in sequence: pane size decision, pane generation, and pane deletion. The decision phase determines the optimal pane size for maintaining accurate aggregate information. The generation phase stores aggregate information for each pane-sized portion of data read from the stream buffer. The deletion phase deletes panes that no continuous query uses any longer. Because it uses a linear data structure, the proposed method consumes fewer resources than methods based on level structures. The insertion cost of aggregate information is reduced because, however much stream data arrives, only the aggregate for one pane-sized portion of data has to be computed. The search cost is reduced because a single linear search serves sliding windows of different sizes. In the experimental evaluation, the proposed method used less memory and processed queries faster.
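The three phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how the optimal pane size is chosen, so the sketch assumes the greatest common divisor of the window sizes, a rule commonly used in pane-based evaluation. It also assumes count-based windows, a SUM aggregate, and results read at pane boundaries. The names `decide_pane_size`, `PaneList`, `insert`, and `window_sum` are hypothetical.

```python
from collections import deque
from functools import reduce
from itertools import islice
from math import gcd


def decide_pane_size(window_sizes):
    """Decision phase (assumed rule): the largest size dividing every window."""
    return reduce(gcd, window_sizes)


class PaneList:
    """Linear list of per-pane partial sums shared by all registered queries."""

    def __init__(self, window_sizes):
        self.pane_size = decide_pane_size(window_sizes)
        # Panes older than the largest window are used by no query.
        self.max_panes = max(window_sizes) // self.pane_size
        self.panes = deque()      # closed panes, oldest first
        self.open_sum = 0         # partial sum of the pane being filled
        self.open_count = 0

    def insert(self, value):
        """Generation phase: fold one tuple into the open pane."""
        self.open_sum += value
        self.open_count += 1
        if self.open_count == self.pane_size:
            self.panes.append(self.open_sum)
            self.open_sum = 0
            self.open_count = 0
            # Deletion phase: drop panes that no window can reach.
            while len(self.panes) > self.max_panes:
                self.panes.popleft()

    def window_sum(self, window_size):
        """Linear scan over the newest panes that cover window_size tuples."""
        k = window_size // self.pane_size
        return sum(islice(reversed(self.panes), k))


# Two queries with windows of 4 and 6 tuples share panes of 2 tuples.
shared = PaneList([4, 6])
for v in range(1, 11):            # stream values 1..10
    shared.insert(v)
print(shared.pane_size)           # 2
print(shared.window_sum(4))       # 34  (7 + 8 + 9 + 10)
print(shared.window_sum(6))       # 45  (5 + 6 + ... + 10)
```

In this sketch each arriving tuple touches only the open pane, and both queries read their results from the same linear list, which mirrors the reduced insertion and search costs claimed in the abstract. Other aggregates would need a different per-pane summary, for example a (sum, count) pair for AVG.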
