An Optimal Way to Index Searching of Duality-Based Time-Series Subsequence Matching

Kim, Sang-Wook; Park, Dae-Hyun; Lee, Heon-Gil

doi:10.3745/KIPSTD.2004.11D.5.1003

The KIPS Transactions: Part D (정보처리학회논문지D)

Volume 11D, Issue 5 / Pages 1003-1010 / 2004 / pISSN 1598-2866

Korea Information Processing Society (한국정보처리학회)

이원성 기반 시계열 서브시퀀스 매칭의 인덱스 검색을 위한 최적의 기법

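Kim, Sang-Wook (Division of Information and Communications, Hanyang University) ;
Park, Dae-Hyun (Division of Computer, Information and Communications Engineering, Kangwon National University) ;
Lee, Heon-Gil (Division of Computer, Information and Communications Engineering, Kangwon National University)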

Published : 2004.10.01

https://doi.org/10.3745/KIPSTD.2004.11D.5.1003

Abstract

In this paper, we address the efficient processing of subsequence matching in time-series databases. We first point out the performance problems that arise in the index searching of a prior method for subsequence matching, and then propose a new method that resolves them. Our method views the index searching of subsequence matching from a new angle, regarding it as a kind of spatial join called a window-join. To speed up the window-join, our method builds an R*-tree in main memory for the query sequence at the start of subsequence matching. It also includes a novel algorithm for effectively joining the disk-resident R*-tree built on the data sequences with the main-memory R*-tree built on the query sequence. This algorithm accesses each page of the R*-tree built on the data sequences exactly once, without incurring any index-level false alarms. Therefore, in terms of the number of disk accesses, the proposed algorithm is shown to be optimal. Extensive experiments also demonstrate the superiority of our method quantitatively.

본 논문에서는 시계열 데이터베이스에서 서브시퀀스 매칭을 효과적으로 처리하는 방안에 관하여 논의한다. 먼저, 본 논문에서는 서브시퀀스 매칭을 위한 기존 기법의 인덱스 검색에서 발생하는 성능상의 문제점들을 지적하고, 이들을 해결할 수 있는 새로운 방법을 제시한다. 제안된 기법은 서브시퀀스 매칭의 인덱스 검색 문제를 윈도우-조인이라는 일종의 공간 조인 문제로 새롭게 해석하는 것에서 출발한다. 윈도우-조인의 빠른 처리를 위하여 제안된 기법에서는 서브시퀀스 매칭을 시작할 때 질의 시퀀스를 위한 R＊-트리를 주기억장치 내에 구성한다. 또한, 제안된 기법은 데이터 시퀀스들을 위한 디스크 상의 R＊-트리와 질의 시퀀스를 위한 주기억장치 상의 R＊-트리를 효과적으로 조인할 수 있는 새로운 알고리즘을 포함한다. 이 알고리즘은 데이터 시퀀스들을 위한 R＊-트리 페이지들을 인덱스 단계의 착오 채택 없이 단 한번만 디스크로부터 액세스하므로 디스크 액세스 측면에서 최적의 기법임이 증명된다. 또한, 다양한 실험을 통한 성능 평가를 통하여 제안된 기법의 우수성을 정량적으로 규명한다.
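The difference between issuing one range query per query window and running a single join pass can be illustrated with a small sketch. This is not the authors' algorithm: it assumes a toy two-level tree of simulated pages, keeps the query-window rectangles in a plain Python list rather than in a main-memory R*-tree, and uses hypothetical names throughout (Page, range_queries, window_join). It only shows why a join-style descent reads each data page at most once, while repeated range queries may read the same page several times.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

# A rectangle is (low corner, high corner); all names here are illustrative.
Rect = Tuple[Tuple[float, ...], Tuple[float, ...]]


def intersects(a: Rect, b: Rect) -> bool:
    """True if two axis-aligned rectangles overlap in every dimension."""
    return all(a[0][i] <= b[1][i] and b[0][i] <= a[1][i] for i in range(len(a[0])))


@dataclass
class Page:
    """One simulated index page: internal pages hold children, leaves hold entries."""
    page_id: int
    mbr: Rect
    children: List["Page"] = field(default_factory=list)
    entries: List[Tuple[Rect, str]] = field(default_factory=list)


def range_queries(root: Page, queries: List[Rect], reads: Dict[int, int]) -> Set[Tuple[str, int]]:
    """Baseline: one independent tree descent per query rectangle."""
    out: Set[Tuple[str, int]] = set()

    def descend(page: Page, qi: int) -> None:
        reads[page.page_id] = reads.get(page.page_id, 0) + 1  # simulated page read
        for rect, label in page.entries:
            if intersects(rect, queries[qi]):
                out.add((label, qi))
        for child in page.children:
            if intersects(child.mbr, queries[qi]):
                descend(child, qi)

    for qi in range(len(queries)):
        if intersects(root.mbr, queries[qi]):
            descend(root, qi)
    return out


def window_join(root: Page, queries: List[Rect], reads: Dict[int, int]) -> Set[Tuple[str, int]]:
    """Join-style pass: a single descent that carries every still-relevant query."""
    out: Set[Tuple[str, int]] = set()

    def descend(page: Page, active: List[int]) -> None:
        reads[page.page_id] = reads.get(page.page_id, 0) + 1  # simulated page read
        for rect, label in page.entries:
            for qi in active:
                if intersects(rect, queries[qi]):
                    out.add((label, qi))
        for child in page.children:
            hits = [qi for qi in active if intersects(child.mbr, queries[qi])]
            if hits:  # a page is read only if some query rectangle overlaps it
                descend(child, hits)

    active = [qi for qi in range(len(queries)) if intersects(root.mbr, queries[qi])]
    if active:
        descend(root, active)
    return out


# Toy data: two leaf pages under one root, and three query rectangles.
leaf1 = Page(1, ((0, 0), (4, 4)), entries=[(((0, 0), (1, 1)), "a"), (((3, 3), (4, 4)), "b")])
leaf2 = Page(2, ((6, 6), (9, 9)), entries=[(((6, 6), (7, 7)), "c"), (((8, 8), (9, 9)), "d")])
root = Page(0, ((0, 0), (9, 9)), children=[leaf1, leaf2])
queries = [((0.5, 0.5), (1.5, 1.5)), ((3.5, 3.5), (6.5, 6.5)), ((20, 20), (21, 21))]

baseline_reads: Dict[int, int] = {}
join_reads: Dict[int, int] = {}
baseline_result = range_queries(root, queries, baseline_reads)
join_result = window_join(root, queries, join_reads)
print(sum(baseline_reads.values()), sum(join_reads.values()))
```

Both functions return the same set of (entry, query index) pairs. In this toy setup the baseline reads the root and the first leaf twice, whereas the join pass reads every page once, which is the property the abstract claims for the proposed algorithm at the level of disk accesses.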
