Adaptive Buffer Control over Disordered Streams

Kim, Hyeon-Gyu; Kim, Cheol-Gi; Lee, Chung-Ho; Kim, Myoung-Ho

Journal of KIISE:Databases (한국정보과학회논문지:데이타베이스)

Volume 34, Issue 5 / Pages 379-388 / 2007 / pISSN 1229-7739

Korean Institute of Information Scientists and Engineers (한국정보과학회)

비순서화된 스트림 처리를 위한 적응적 버퍼 제어 기법

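Hyeon-Gyu Kim (Department of Computer Science, Korea Advanced Institute of Science and Technology);
Cheol-Gi Kim (Department of Computer Science, Information and Communications University);
Chung-Ho Lee (Telematics and USN Research Division, Electronics and Telecommunications Research Institute);
Myoung-Ho Kim (Department of Computer Science, Korea Advanced Institute of Science and Technology)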

Published: 2007.10.15

Abstract

Disordered streams may cause inaccurate or delayed results in window-based queries. Existing approaches usually use buffers to handle such streams. However, most of them estimate the buffer size simply from the maximum network delay in the streams, which tends to overestimate the buffer size and results in high latency. In this paper, we propose a probabilistic approach that estimates the buffer size adaptively as network delays fluctuate. We first assume that the intervals between tuple generations follow an exponential distribution and that network delays follow a normal distribution. We then derive an estimation function from these assumptions. The function takes a drop ratio as an input parameter, which denotes the percentage of tuple drops permissible during query execution. By specifying the drop ratio in a query, users can control the quality of query results, such as accuracy or latency, according to application requirements. Our experimental results show that the proposed function adapts better than the existing function based on the maximum network delay.

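Disordered streams can cause inaccurate or delayed results when window-based queries are processed. Existing approaches generally sort disordered streams with a buffer and estimate the buffer size from the maximum network delay. However, such approaches can estimate an unnecessarily large buffer size and produce delayed query results. In this paper, we propose a probabilistic approach that estimates the buffer size adaptively as network delays change. The proposed method assumes that tuple generation follows a Poisson distribution and that network delays follow a normal distribution. An estimation function is then derived from these assumptions. The function takes the tuple drop ratio as an input parameter, which denotes the permissible percentage of tuple drops at run time. By defining the drop ratio in the query, users can emphasize either the accuracy or the processing speed of query results, according to application requirements. Our experimental results show that the proposed estimation function is more adaptive than the existing function based on the maximum network delay.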
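
The abstract does not give the estimation function itself, so the following is only a minimal sketch of the general idea under simplifying assumptions, not the authors' derivation. It assumes that a tuple is dropped whenever its network delay exceeds the buffer's wait time, fits a normal distribution to observed delays, and converts the wait time into an expected number of buffered tuples using the arrival rate implied by exponentially distributed generation intervals. The function names `estimate_buffer` and `max_delay_buffer`, and all parameter names, are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def estimate_buffer(delays, intervals, drop_ratio):
    """Return (wait_time, expected_tuples) for a tolerated drop ratio.

    Illustrative simplification, not the paper's estimation function:
    a tuple counts as dropped when its delay exceeds wait_time.
    """
    if not 0.0 < drop_ratio < 1.0:
        raise ValueError("drop_ratio must be strictly between 0 and 1")
    # Fit the assumed normal delay distribution to the observed delays.
    delay_dist = NormalDist(mean(delays), stdev(delays))
    # Smallest wait such that P(delay > wait) equals drop_ratio.
    wait_time = max(0.0, delay_dist.inv_cdf(1.0 - drop_ratio))
    # Exponential intervals with mean m imply an arrival rate of 1/m.
    rate = 1.0 / mean(intervals)
    return wait_time, rate * wait_time


def max_delay_buffer(delays, intervals):
    """Baseline: size the buffer from the largest observed delay."""
    wait_time = max(delays)
    return wait_time, wait_time / mean(intervals)
```

In this sketch a larger drop ratio yields a shorter wait time and therefore lower latency, while the baseline depends only on the single largest observed delay, so one outlier can inflate it indefinitely. Re-running `estimate_buffer` over a sliding sample of recent delays would be one way to follow fluctuating network conditions.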
