[KSCI] Korea Science Citation Index Service

Similarity Search in Time Series Databases based on the Normalized Distance

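Sangjun Lee (School of Electrical Engineering and Computer Science, Seoul National University)
Sukho Lee (School of Electrical Engineering and Computer Science, Seoul National University)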

Publication Information

Journal of KIISE: Databases / v.31, no.1, 2004, pp. 23-29

Abstract

In this paper, we propose a search method for time sequences that supports the normalized distance as a similarity measure. In many applications where the shape of a time sequence is the major consideration, the normalized distance is a more suitable similarity measure than the simple Lp distance. To support normalized distance queries, most previous work includes a preprocessing step for vertical shifting that normalizes each sequence by its mean. The proposed method is motivated by a property of sequences that is useful for feature extraction: the variation between two adjacent elements of a time sequence is invariant under vertical shifting. The extracted features are indexed by a spatial access method such as the R-tree. The proposed method can match time series of similar shape without vertical shifting and guarantees no false dismissals. Experiments on real data (stock price movements) verify the performance of the proposed method.
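
The invariance property described in the abstract can be illustrated with a short sketch. This is not the authors' implementation. It assumes the normalized distance is the Euclidean (L2) distance between mean-shifted sequences, and the function names are hypothetical. The lower bound of one half of the feature distance is an assumption of this sketch that follows from the triangle inequality; the bound actually used in the paper may differ.

```python
import math

def normalized_distance(x, y):
    """L2 distance between x and y after each is shifted by its own mean (assumed definition)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    return math.sqrt(sum(((a - mx) - (b - my)) ** 2 for a, b in zip(x, y)))

def adjacent_differences(s):
    """Feature vector: variation between each pair of adjacent elements."""
    return [b - a for a, b in zip(s, s[1:])]

def feature_distance(x, y):
    """L2 distance between the adjacent-difference features of x and y."""
    dx, dy = adjacent_differences(x), adjacent_differences(y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(dx, dy)))

prices = [10.0, 12.0, 11.0, 15.0]
shifted = [p + 100.0 for p in prices]  # same shape, shifted vertically
other = [10.0, 11.0, 13.0, 12.0]       # different shape

# The features ignore the vertical offset, so no mean subtraction is needed.
same_features = adjacent_differences(prices) == adjacent_differences(shifted)

# Illustrative filter step (assumption of this sketch): half the feature
# distance never exceeds the normalized distance, so pruning on it cannot
# discard a true match.
lower_bound = feature_distance(prices, other) / 2.0
exact = normalized_distance(prices, other)
print(same_features, lower_bound <= exact)
```

In a filter-and-refine search, the feature vectors would be stored in a spatial index, candidates whose lower bound exceeds the query threshold would be discarded, and the exact normalized distance would be computed only for the remaining candidates.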

Keywords

database; time series; similarity search; normalized distance

Citations & Related Records

References

1	Eamonn J. Keogh, Kaushik Chakrabarti, Sharad Mehrotra, Michael J. Pazzani, 'Locally Adaptive Dimensionality Reduction for Indexing Large Time Series Databases,' In Proceedings of ACM SIGMOD Conference, pp. 151-162, 2001
2	Yang-Sae Moon, Kyu-Young Whang, Woong-Kee Loh, 'Duality-Based Subsequence Matching in Time-Series Databases,' In Proceedings of ICDE, pp. 263-272, 2001
3	Sze Kin Lam, Man Hon Wong, 'A Fast Projection Algorithm for Sequence Data Searching,' DKE 28(3), pp. 321-339, 1998
4	Antonin Guttman, 'R-trees: A Dynamic Index Structure for Spatial Searching,' In Proceedings of ACM SIGMOD Conference, pp. 47-57, 1984
5	Norbert Beckmann, Hans-Peter Kriegel, Ralf Schneider, Bernhard Seeger, 'The R*-tree: An Efficient and Robust Access Method for Points and Rectangles,' In Proceedings of ACM SIGMOD Conference, pp. 322-331, 1990
6	Sangwook Kim, Sanghyun Park, W. Chu, 'An Index-based Approach for Similarity Search Supporting Time Warping in Large Sequence Databases,' In Proceedings of ICDE, pp. 607-614, 2001
7	Byoung-Kee Yi, Christos Faloutsos, 'Fast Time Sequence Indexing for Arbitrary Lp Norms,' In Proceedings of VLDB Conference, pp. 385-394, 2000
8	Eamonn J. Keogh, Michael J. Pazzani, 'An Enhanced Representation of Time Series Which Allows Fast and Accurate Classification, Clustering and Relevance Feedback,' In Proceedings of KDD Conference, pp. 239-243, 1998
9	Sangjun Lee, Dongseop Kwon, Sukho Lee, 'Efficient Similarity Search for Time Series Data Based on the Minimum Distance,' In Proceedings of CAiSE, pp. 377-391, 2002
10	Eamonn J. Keogh, 'Exact Indexing of Dynamic Time Warping,' In Proceedings of VLDB Conference, pp. 406-417, 2002
11	Flip Korn, H. V. Jagadish, Christos Faloutsos, 'Efficiently Supporting Ad Hoc Queries in Large Datasets of Time Sequences,' In Proceedings of ACM SIGMOD Conference, pp. 289-300, 1997
12	Kelvin Kam Wing Chu, Man Hon Wong, 'Fast Time-Series Searching with Scaling and Shifting,' In Proceedings of PODS, pp. 237-248, 1999
13	Byoung-Kee Yi, H. V. Jagadish, Christos Faloutsos, 'Efficient Retrieval of Similar Time Sequences Under Time Warping,' In Proceedings of ICDE, pp. 201-208, 1998
14	Rakesh Agrawal, King-Ip Lin, Harpreet S. Sawhney, Kyuseok Shim, 'Fast Similarity Search in the Presence of Noise, Scaling and Translation in Time-Series Databases,' In Proceedings of VLDB Conference, pp. 490-501, 1995
15	Chung-Sheng Li, Philip S. Yu, Vittorio Castelli, 'Hierarchy Scan: A Hierarchical Similarity Search Algorithm for Databases of Long Sequences,' In Proceedings of ICDE, pp. 546-553, 1996
16	Davood Rafiei, Alberto O. Mendelzon, 'Similarity Based Queries for Time Series Data,' In Proceedings of ACM SIGMOD Conference, pp. 12-25, 1997
17	Chang-Shing Perng, Haixun Wang, Sylvia R. Zhang, D. Stott Parker, 'Landmarks: A New Model for Similarity-based Pattern Querying in Time Series Databases,' In Proceedings of ICDE, pp. 33-42, 2000
18	M. H. Protter, C. B. Morrey, 'A First Course in Real Analysis,' Springer-Verlag, 1997
19	Rakesh Agrawal, Christos Faloutsos, Arun N. Swami, 'Efficient Similarity Search In Sequence Databases,' In Proceedings of FODO, pp. 69-84, 1993
20	Rakesh Agrawal, T. Imielinski, Arun N. Swami, 'Database Mining: A Performance Perspective,' IEEE TKDE, Special issue on Learning and Discovery in Knowledge-Based Databases 5(6), pp. 914-925, 1993
21	Kelvin Kam Wing Chu, Sze Kin Lam, Man Hon Wong, 'An Efficient Hash-Based Algorithm for Sequence Data Searching,' The Computer Journal 41(6), pp. 402-415, 1998
22	Usama M. Fayyad, Gregory Piatetsky-Shapiro, Padhraic Smyth, 'Knowledge Discovery and Data Mining: Towards a Unifying Framework,' In Proceedings of KDD Conference, pp. 82-88, 1996
23	Sanghyun Park, Wesley W. Chu, Jeehee Yoon, Chihcheng Hsu, 'Efficient Searches for Similar Subsequences of Different Lengths in Sequence Databases,' In Proceedings of ICDE, pp. 23-32, 2000
24	Christos Faloutsos, M. Ranganathan, Yannis Manolopoulos, 'Fast Subsequence Matching in Time-Series Databases,' In Proceedings of ACM SIGMOD Conference, pp. 419-429, 1994
25	Davood Rafiei, 'On Similarity-Based Queries for Time Series Data,' In Proceedings of ICDE, pp. 410-417, 1999
26	Kin-pong Chan, Ada Wai-chee Fu, 'Efficient Time Series Matching by Wavelets,' In Proceedings of ICDE, pp. 126-133, 1999
27	Eamonn J. Keogh, Michael J. Pazzani, 'A Simple Dimensionality Reduction Technique for Fast Similarity Search in Large Time Series Databases,' In Proceedings of PAKDD Conference, pp. 122-133, 2000

Original title (Korean): 정규 거리에 기반한 시계열 데이터베이스의 유사 검색 기법