[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.3745/KIPSTD.2002.9D.4.555

Optimization of Post-Processing for Subsequence Matching in Time-Series Databases

Kim, Sang-Uk

Publication Information

The KIPS Transactions: Part D / v.9D, no.4, 2002, pp. 555-560

Abstract

Subsequence matching, which consists of an index-searching step and a post-processing step, is an operation that finds, in a time-series database, those subsequences whose changing patterns are similar to that of a given query sequence. This paper discusses the optimization of post-processing for subsequence matching. The common problem in the post-processing of previous methods is that each candidate subsequence is compared with the query sequence, to discard false alarms, as soon as it appears during index searching. As a result, a sequence containing candidate subsequences is accessed from disk multiple times, and the same candidate subsequence is compared with the query sequence multiple times. These redundancies seriously degrade the performance of subsequence matching. In this paper, we propose a new optimal method that resolves this problem. The proposed method stores all the candidate subsequences returned by index searching in a binary search tree, and performs post-processing in a batch fashion after index searching has finished. This method completely eliminates the redundancies mentioned above. To verify the performance improvement of the proposed method, we perform extensive experiments using a real-life stock data set. The results reveal that the proposed method achieves a speedup of 55 to 156 times over the previous methods.
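
The batch idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the names CandidateTree, batch_post_process, and read_sequence, the (seq_id, offset) key ordering, the Euclidean distance, and the tolerance eps are all assumptions made for illustration. Candidates reported by index searching are only recorded in a binary search tree (duplicates are ignored); after index searching ends, an in-order traversal visits the candidates grouped by sequence, so each sequence is read once and each distinct candidate is compared once.

```python
import math


class CandidateTree:
    """Plain (unbalanced) binary search tree keyed on (seq_id, offset).

    Hypothetical helper: inserting a key that is already present does nothing,
    so a candidate reported several times is stored only once.
    """

    def __init__(self):
        self.root = None  # each node is a list: [key, left_child, right_child]

    def insert(self, key):
        if self.root is None:
            self.root = [key, None, None]
            return
        node = self.root
        while True:
            if key == node[0]:
                return  # candidate already recorded
            side = 1 if key < node[0] else 2
            if node[side] is None:
                node[side] = [key, None, None]
                return
            node = node[side]

    def in_order(self):
        """Yield keys in ascending order, i.e. grouped by seq_id, then by offset."""
        stack, node = [], self.root
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node[1]
            node = stack.pop()
            yield node[0]
            node = node[2]


def euclidean(a, b):
    # Assumed similarity measure; the abstract does not name one.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_post_process(candidates, read_sequence, query, eps):
    """Discard false alarms in one batch after index searching has finished.

    candidates    -- iterable of (seq_id, offset) pairs, possibly with repeats
    read_sequence -- callable standing in for a disk read of one data sequence
    query, eps    -- query sequence and assumed distance tolerance
    """
    tree = CandidateTree()
    for cand in candidates:  # index-searching phase: record only, no comparison
        tree.insert(cand)

    matches = []
    current_id, current_seq = None, None
    for seq_id, offset in tree.in_order():
        if seq_id != current_id:  # each data sequence is fetched once
            current_id, current_seq = seq_id, read_sequence(seq_id)
        window = current_seq[offset:offset + len(query)]
        if len(window) == len(query) and euclidean(window, query) <= eps:
            matches.append((seq_id, offset))
    return matches
```

An unbalanced tree keeps the sketch short; a balanced tree would bound the insertion cost when candidates arrive in nearly sorted order.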

Keywords

Time-Series Databases; Subsequence Matching; Post-Processing

