Browse > Article
http://dx.doi.org/10.3745/KIPSTD.2006.13D.4.513

A Single Index Approach for Subsequence Matching that Supports Normalization Transform in Time-Series Databases  

Moon Yang-Sae (강원대학교 IT특성화대학 컴퓨터학부)
Kim Jin-Ho (강원대학교 IT특성화대학 컴퓨터학부)
Loh Woong-Kee (한국과학기술원 전자전산학과/첨단정보기술연구센터)
Abstract
Normalization transform is very useful for finding the overall trend of the time-series data since it enables finding sequences with similar fluctuation patterns. The previous subsequence matching method with normalization transform, however, would incur index overhead both in storage space and in update maintenance since it should build multiple indexes for supporting arbitrary length of query sequences. To solve this problem, we propose a single index approach for the normalization transformed subsequence matching that supports arbitrary length of query sequences. For the single index approach, we first provide the notion of inclusion-normalization transform by generalizing the original definition of normalization transform. The inclusion-normalization transform normalizes a window by using the mean and the standard deviation of a subsequence that includes the window. Next, we formally prove correctness of the proposed method that uses the inclusion-normalization transform for the normalization transformed subsequence matching. We then propose subsequence matching and index building algorithms to implement the proposed method. Experimental results for real stock data show that our method improves performance by up to $2.5{\sim}2.8$ times over the previous method. Our approach has an additional advantage of being generalized to support many sorts of other transforms as well as normalization transform. Therefore, we believe our work will be widely used in many sorts of transform-based subsequence matching methods.
Keywords
Data Mining; Time-Series Database; Subsequence Matching; Normalization Transform;
Citations & Related Records
연도 인용수 순위
  • Reference
1 Wu, H., Salzberg, B., and Zhang, D., 'Online Event-driven Subsequence Matching Over Financial Data Streams,' In Proc. of Int'l Conf. on Management of Data, ACM SIGMOD, Paris, France, pp.23-34, June, 2004   DOI
2 Rafiei, D., and Mendelzon, A. O., 'Querying Time Series Data Based on Similarity,' IEEE Trans. on Knowledge and Data Engineering, Vol.12, No.5, pp.675-693, Sept./Oct., 2000   DOI   ScienceOn
3 Park, S., Chu, W. W., Yoon, J., and Won, J., 'Similarity Search of Time-Warped Subsequences via a Suffix Tree,' Information Systems, Vol.28, No.7, pp.867-883, Oct., 2003   DOI   ScienceOn
4 Loh, W.-K., Kim, S.-W., and Whang, K.-Y., 'A Subsequence Matching Algorithm that Supports Normalization Transform in Time-Series Databases,' Data Mining and Knowledge Discovery, Vol.9, No.1, pp.5-28, July, 2004   DOI
5 Yi, B.-K., Jagadish, H. V., and Faloutsos, C., 'Efficient Retrieval of Similar Time Sequences Under Time Warping,' In Proc. the 14th Int'l Conf. on Data Engineering(ICDE), IEEE, Orlando, Florida, pp.201-208, Feb., 1998   DOI
6 Rafiei, D., 'On Similarity-Based Queries for Time Series Data,' In Proc. the 15th Int'l Conf. on Data Engineering(ICDE), IEEE, Sydney, Australia, pp.410-417, Feb., 1999
7 Oppenheim, A. V. and Schafer, R. W., Digital Signal Processing, Prentice-Hall, 1975
8 Moon, Y.-S., Whang, K. Y., and Loh, W.-K., 'Duality-Based Subsequence Matching in Time-Series Databases,' In Proc. the 17th Int'l Conf. on Data Engineering (ICDE), IEEE, Heidelberg, Germany, pp.263-272, April, 2001   DOI
9 Beckmann, N., Kriegel, H.-P., Schneider, R., and Seeger, B., 'The R*-tree: An Efficient and Robust Access Method for Points and Rectangles,' In Proc. Int'l Conf. on Management of Data, ACM SIGMOD, Atlantic City, New Jersey, pp.322-331, May, 1990   DOI
10 Agrawal, R., Lin, K.-I., Sawhney, H. S., and Shim, K., 'Fast Similarity Search in the Presence of Noise, Scaling, and Translation in Time-Series Databases,' In Proc. the 21st Int'l Conf. on Very Large Data Bases, Zurich, Switzerland, pp.490-501, Sept., 1995
11 Moon, Y.-S., Whang, K. Y., and Han, W.-S., 'General Match: A Subsequence Matching Method in Time-Series Databases Based on Generalized Windows,' In Proc. Int'l Conf. on Management of Data, ACM SIGMOD, Madison, Wisconsin, pp.382-393, June, 2002   DOI
12 Loh, W.-K., Kim, S-W., and Whang, K.-Y., 'Index Interpolation: A Subsequence Matching Algorithm Supporting Moving Average Transform of Arbitrary Order in Time-Series Databases,' IEICE Transactions on Information and Systems, Vol.E84-D, No.1, pp.76-86, 2000
13 Agrawal, R., Faloutsos, C., and Swami, A., 'Efficient Similarity Search in Sequence Databases,' In Proc. the 4th Int'l Conf. on Foundations of Data Organization and Algorithms, Chicago, Illinois, pp.69-84, Oct., 1993
14 Kim, S.-W., Park, S, and Chu, W. W., 'Efficient Processing of Similarity Search Under Time Warping in Sequence Databases: An Index-based Approach,' Information Systems, Vol.29, No.5, pp.405-420, July 2004   DOI   ScienceOn
15 Faloutsos, C., Ranganathan, M., and Manolopoulos, Y., 'Fast Subsequence Matching in Time-Series Databases,' In Proc. Int'l Conf. on Management of Data, ACM SIGMOD, Minneapolis, Minnesota, pp.419-429, May, 1994   DOI   ScienceOn
16 Chu, K. W. and Wong, M. H., 'Fast Time-Series Searching with Scaling and Shifting,' In Proc. the 15th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, Philadelphia, Pennsylvania, pp.237-248, June, 1999   DOI
17 Chan, K.-P., Fu, A. W. C., and Yu, C. T., 'Haar Wavelets for Efficient Similarity Search of Time-Series: With and Without Time Warping,' IEEE Trans. on Knowledge and Data Engineering, Vol.15, No.3, pp.686-705, Jan./Feb., 2003   DOI   ScienceOn