http://dx.doi.org/10.3745/KIPSTD.2002.9D.4.555

Optimization of Post-Processing for Subsequence Matching in Time-Series Databases  

Kim, Sang-Uk
Abstract
Subsequence matching, which consists of an index searching step and a post-processing step, is an operation that finds, in a time-series database, those subsequences whose changing patterns are similar to that of a given query sequence. This paper discusses the optimization of post-processing for subsequence matching. A common problem in the post-processing of previous methods is that each candidate subsequence is compared with the query sequence, in order to discard false alarms, as soon as it appears during index searching. As a result, a sequence containing candidate subsequences is accessed from disk multiple times, and the same candidate subsequence is compared with the query sequence multiple times. These redundancies seriously degrade the performance of subsequence matching. In this paper, we propose a new optimal method for resolving the problem. The proposed method stores all the candidate subsequences returned by index searching in a binary search tree, and performs post-processing in a batch fashion after index searching has finished. With this method, we are able to completely eliminate the redundancies mentioned above. To verify the performance improvement of the proposed method, we perform extensive experiments using a real-life stock data set. The results reveal that the proposed method achieves speedups of 55 to 156 times over the previous methods.
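The batch post-processing idea described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the candidate key layout `(sequence id, start offset)`, the names `CandidateTree`, `batch_post_process`, and `load_sequence`, the unbalanced tree, and the use of Euclidean distance with a tolerance `epsilon` are all assumptions made for illustration. The sketch shows how a binary search tree removes duplicate candidates and, through in-order traversal, groups candidates by sequence so that each sequence is read only once.

```python
import math

class _Node:
    """One tree node holding a candidate key (hypothetical layout)."""
    __slots__ = ("key", "left", "right")

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


class CandidateTree:
    """Unbalanced binary search tree that ignores duplicate keys."""

    def __init__(self):
        self.root = None

    def insert(self, key):
        if self.root is None:
            self.root = _Node(key)
            return
        node = self.root
        while True:
            if key == node.key:
                return  # duplicate candidate: already stored, nothing to do
            side = "left" if key < node.key else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, _Node(key))
                return
            node = child

    def in_order(self):
        """Yield keys in ascending order (iterative in-order traversal)."""
        stack, node = [], self.root
        while stack or node is not None:
            while node is not None:
                stack.append(node)
                node = node.left
            node = stack.pop()
            yield node.key
            node = node.right


def batch_post_process(candidates, load_sequence, query, epsilon):
    """Discard false alarms after index searching has finished.

    candidates    -- iterable of (seq_id, offset) pairs, possibly repeated
    load_sequence -- hypothetical callable that reads one sequence from disk
    query         -- list of numbers
    epsilon       -- assumed Euclidean distance tolerance
    """
    tree = CandidateTree()
    for cand in candidates:
        tree.insert(cand)

    matches = []
    current_id, data = None, None
    for seq_id, offset in tree.in_order():
        if seq_id != current_id:
            # Keys are sorted by seq_id first, so each sequence is loaded once.
            data = load_sequence(seq_id)
            current_id = seq_id
        window = data[offset:offset + len(query)]
        if len(window) != len(query):
            continue  # candidate runs past the end of the sequence
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, query)))
        if dist <= epsilon:
            matches.append((seq_id, offset))
    return matches
```

Because the tree key begins with the sequence identifier, all candidates from the same sequence are adjacent in the traversal, and each distinct candidate is compared with the query exactly once. A balanced tree would be a natural substitute when index searching returns candidates in nearly sorted order.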
Keywords
Time-Series Databases; Subsequence Matching; Post-Processing