• Title/Summary/Keyword: Multidimensional Data Sequences


A Study of Similarity Measures on Multidimensional Data Sequences Using Semantic Information (의미 정보를 이용한 다차원 데이터 시퀀스의 유사성 척도 연구)

  • Lee, Seok-Lyong;Lee, Ju-Hong;Chun, Seok-Ju
    • The KIPS Transactions:PartD / v.10D no.2 / pp.283-292 / 2003
  • One-dimensional time-series data have been studied in various database applications such as data mining and data warehousing. However, in the current complex business environment, multidimensional data sequences (MDSs) are becoming increasingly important in addition to one-dimensional time-series data. For example, a video stream can be modeled as an MDS in the multidimensional space with respect to color and texture attributes. In this paper, we propose effective similarity measures on which similar pattern retrieval is based. An MDS is partitioned into segments, each of which is represented by various geometric and semantic features. The similarity measures are defined on the basis of these segments. Using the measures, irrelevant segments are pruned from a database with respect to a given query. Both data sequences and query sequences are partitioned into segments, and query processing is based on comparing the features of data and query segments instead of scanning all data elements of entire sequences.

Clustering Technique for Sequence Data Sets in Multidimensional Data Space (다차원 데이타 공간에서 시퀀스 데이타 세트를 위한 클러스터링 기법)

  • Lee, Seok-Lyong;Lim, Tong-Hyeok;Chung, Chin-Wan
    • Journal of KIISE:Databases / v.28 no.4 / pp.655-664 / 2001
  • Continuous data such as video streams and voice analog signals can be modeled as multidimensional data sequences (MDSs) in the feature space. In this paper, we investigate a clustering technique for multidimensional data sequences. Each sequence is represented by a small number of hyper-rectangular clusters for subsequent storage and similarity search processing. We present a linear clustering algorithm that guarantees a predefined level of clustering quality and show its effectiveness via experiments on various video data sets.

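As a rough illustration of what a hyper-rectangular representation buys at search time, the sketch below computes the axis-aligned bounding box of a group of points and the minimum distance between two such boxes. It is not the clustering algorithm from the paper, only the geometric primitive such methods typically build on; the names `bounding_box` and `min_box_distance` and the sample points are hypothetical.

```python
import math

def bounding_box(points):
    """Return (low, high) corners of the smallest axis-aligned box containing points."""
    dims = range(len(points[0]))
    low = tuple(min(p[d] for p in points) for d in dims)
    high = tuple(max(p[d] for p in points) for d in dims)
    return low, high

def min_box_distance(box_a, box_b):
    """Minimum Euclidean distance between two axis-aligned boxes (0 if they overlap)."""
    (low_a, high_a), (low_b, high_b) = box_a, box_b
    gaps = []
    for la, ha, lb, hb in zip(low_a, high_a, low_b, high_b):
        # Gap along this axis; zero when the two intervals overlap.
        gaps.append(max(lb - ha, la - hb, 0.0))
    return math.sqrt(sum(g * g for g in gaps))

# Hypothetical 3-dimensional feature points (for example, color features of frames).
segment_a = [(0.1, 0.2, 0.3), (0.2, 0.1, 0.4), (0.15, 0.25, 0.35)]
segment_b = [(0.6, 0.2, 0.3), (0.7, 0.3, 0.5)]
box_a = bounding_box(segment_a)
box_b = bounding_box(segment_b)
box_dist = min_box_distance(box_a, box_b)
```

Because the box distance never exceeds the distance between any pair of points drawn from the two boxes, a segment whose box is farther from the query box than the search threshold can be skipped without inspecting its points.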

VDCluster : A Video Segmentation and Clustering Algorithm for Large Video Sequences (VDCluster : 대용량 비디오 시퀀스를 위한 비디오 세그멘테이션 및 클러스터링 알고리즘)

  • Lee, Seok-Ryong;Lee, Ju-Hong;Kim, Deok-Hwan;Jeong, Jin-Wan
    • Journal of KIISE:Databases / v.29 no.3 / pp.168-179 / 2002
  • In this paper, we investigate video representation techniques that form the foundation for subsequent video processing such as video storage and retrieval. A video data set is a collection of video clips, each of which is a sequence of video frames and is represented by a multidimensional data sequence (MDS). An MDS is partitioned into video segments considering the temporal relationship among frames, and then similar segments of the clip are grouped into video clusters. Thus, the video clip is represented by a small number of video clusters. The video segmentation and clustering algorithm proposed in this paper, VDCluster, guarantees clustering quality to such an extent that it satisfies predefined conditions. The experiments show that our algorithm performs very effectively on various video data sets.

Pattern Similarity Retrieval of Data Sequences for Video Retrieval System (비디오 검색 시스템을 위한 데이터 시퀀스 패턴 유사성 검색)

  • Lee, Seok-Lyong
    • The KIPS Transactions:PartD / v.13D no.3 s.106 / pp.347-356 / 2006
  • A video stream can be represented by a sequence of data points in a multidimensional space. In this paper, we introduce a trend vector that approximates the values of data points in a sequence and represents the moving trend of points in the sequence, and we present a pattern similarity matching method for data sequences using the trend vector. A sequence is partitioned into multiple segments, each of which is represented by a trend vector. Query processing is based on the comparison of these vectors instead of scanning the data elements of entire sequences. Using the trend vector, our method is designed to filter out irrelevant sequences from a database and to find sequences similar to a query. We have performed extensive experiments on synthetic sequences as well as video streams. Experimental results show that, compared with an existing method, the precision of our method is up to 2.1 times higher and the processing time is reduced by up to 45%.

Physical Database Design for DFT-Based Multidimensional Indexes in Time-Series Databases (시계열 데이터베이스에서 DFT-기반 다차원 인덱스를 위한 물리적 데이터베이스 설계)

  • Kim, Sang-Wook;Kim, Jin-Ho;Han, Byung-Il
    • Journal of Korea Multimedia Society / v.7 no.11 / pp.1505-1514 / 2004
  • Sequence matching in time-series databases is an operation that finds the data sequences whose changing patterns are similar to that of a query sequence. Typically, sequence matching employs a multi-dimensional index for efficient processing. In order to alleviate the dimensionality curse problem of the multi-dimensional index in high-dimensional cases, previous methods for sequence matching apply the Discrete Fourier Transform (DFT) to data sequences and take only the first two or three DFT coefficients as organizing attributes of the multi-dimensional index. This paper first points out the problems in such simple methods taking the first two or three coefficients, and proposes a novel solution to construct the optimal multi-dimensional index. The proposed method analyzes the characteristics of a target database and, based on the analysis, identifies the organizing attributes having the best discrimination power. It also determines the optimal number of organizing attributes for efficient sequence matching by using a cost model. To show the effectiveness of the proposed method, we perform a series of experiments. The results show that the proposed method outperforms the previous ones significantly.

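The baseline that this abstract criticizes, keeping only the first few DFT coefficients as index attributes, can be sketched as follows. With the unitary normalization (division by the square root of the sequence length), the Euclidean distance between truncated coefficient vectors never exceeds the distance between the original sequences, which is why filtering on it causes no false dismissals. The function names and sample sequences are hypothetical, and a naive O(n·k) transform is used to keep the sketch dependency-free; the paper's attribute selection and cost model are not reproduced here.

```python
import cmath
import math

def dft_features(seq, k):
    """First k coefficients of the unitary DFT of seq (scaled by 1/sqrt(n))."""
    n = len(seq)
    scale = 1.0 / math.sqrt(n)
    return [
        scale * sum(x * cmath.exp(-2j * math.pi * f * t / n) for t, x in enumerate(seq))
        for f in range(k)
    ]

def euclidean(a, b):
    """Euclidean distance; works for real or complex components."""
    return math.sqrt(sum(abs(x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical data and query sequences of equal length.
data = [1.0, 2.0, 4.0, 3.0, 2.0, 1.0, 0.0, 1.0]
query = [1.0, 3.0, 4.0, 2.0, 2.0, 0.0, 0.0, 1.0]

true_dist = euclidean(data, query)
feature_dist = euclidean(dft_features(data, 3), dft_features(query, 3))
```

Here `feature_dist` is a lower bound on `true_dist`, so any sequence whose feature distance already exceeds the search tolerance can be discarded before the full sequences are compared.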

A Subsequence Matching Technique that Supports Time Warping Efficiently (타임 워핑을 지원하는 효율적인 서브시퀀스 매칭 기법)

  • Park, Sang-Hyun;Kim, Sang-Wook;Cho, June-Suh;Lee, Hoen-Gil
    • Journal of Industrial Technology / v.21 no.A / pp.167-179 / 2001
  • This paper discusses an index-based subsequence matching that supports time warping in large sequence databases. Time warping enables finding sequences with similar patterns even when they are of different lengths. In earlier work, we suggested an efficient method for whole matching under time warping. This method constructs a multidimensional index on a set of feature vectors, which are invariant to time warping, from data sequences. For filtering in the feature space, it also applies a lower-bound function, which consistently underestimates the time warping distance and satisfies the triangular inequality. In this paper, we incorporate the prefix-querying approach based on sliding windows into the earlier approach. For indexing, we extract a feature vector from every subsequence inside a sliding window and construct a multi-dimensional index using the feature vector as indexing attributes. For query processing, we perform a series of index searches using the feature vectors of qualifying query prefixes. Our approach provides effective and scalable subsequence matching even with a large database. We also prove that our approach does not incur false dismissal. To verify the superiority of our method, we perform extensive experiments. The results reveal that our method achieves significant speedup with real-world S&P 500 stock data and with very large synthetic data.


An Index-Based Approach for Subsequence Matching Under Time Warping in Sequence Databases (시퀀스 데이터베이스에서 타임 워핑을 지원하는 효과적인 인덱스 기반 서브시퀀스 매칭)

  • Park, Sang-Hyeon;Kim, Sang-Uk;Jo, Jun-Seo;Lee, Heon-Gil
    • The KIPS Transactions:PartD / v.9D no.2 / pp.173-184 / 2002
  • This paper discusses an index-based subsequence matching that supports time warping in large sequence databases. Time warping enables finding sequences with similar patterns even when they are of different lengths. In earlier work, Kim et al. suggested an efficient method for whole matching under time warping. This method constructs a multidimensional index on a set of feature vectors, which are invariant to time warping, from data sequences. For filtering in the feature space, it also applies a lower-bound function, which consistently underestimates the time warping distance and satisfies the triangular inequality. In this paper, we incorporate the prefix-querying approach based on sliding windows into the earlier approach. For indexing, we extract a feature vector from every subsequence inside a sliding window and construct a multidimensional index using the feature vector as indexing attributes. For query processing, we perform a series of index searches using the feature vectors of qualifying query prefixes. Our approach provides effective and scalable subsequence matching even with a large database. We also prove that our approach does not incur false dismissal. To verify the superiority of our approach, we perform extensive experiments. The results reveal that our approach achieves significant speedup with real-world S&P 500 stock data and with very large synthetic data.
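A minimal sketch of the two ingredients mentioned in the abstract, a time warping distance and a cheap lower bound computed from warping-invariant features, might look like this. The abstract does not spell out the feature vector, so the choice of (first, last, maximum, minimum) is an assumption made for illustration, as are the function names and sample sequences. The step cost is the absolute difference, accumulated by summation.

```python
def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping with |x - y| as step cost."""
    inf = float("inf")
    prev = [inf] * (len(b) + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            # Extend the cheapest of: advance a only, advance both, advance b only.
            curr[j] = abs(x - y) + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return prev[len(b)]

def warping_features(seq):
    """Assumed warping-invariant features: first, last, maximum, minimum."""
    return (seq[0], seq[-1], max(seq), min(seq))

def lower_bound(a, b):
    """Largest feature difference; never exceeds dtw_distance(a, b) as defined above."""
    return max(abs(p - q) for p, q in zip(warping_features(a), warping_features(b)))

a = [1, 2, 3, 4, 3]
b = [1, 1, 2, 3, 4, 4, 3]   # same shape as a, stretched in time
c = [5, 6, 8, 7]            # similar shape, shifted upward
```

The bound holds because every warping path matches the first elements, the last elements, and pairs each sequence's extreme value with some element of the other sequence. A candidate whose `lower_bound` already exceeds the tolerance can therefore be rejected without running the quadratic `dtw_distance`.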

EMRQ: An Efficient Multi-keyword Range Query Scheme in Smart Grid Auction Market

  • Li, Hongwei;Yang, Yi;Wen, Mi;Luo, Hongwei;Lu, Rongxing
    • KSII Transactions on Internet and Information Systems (TIIS) / v.8 no.11 / pp.3937-3954 / 2014
  • With the increasing electricity consumption and the wide application of renewable energy sources, energy auction attracts a lot of attention due to its economic benefits. Many schemes have been proposed to support energy auction in smart grid. However, few of them can achieve range query, ranked search and personalized search. In this paper, we propose an efficient multi-keyword range query (EMRQ) scheme, which can support range query, ranked search and personalized search simultaneously. Based on the homomorphic Paillier cryptosystem, we use two super-increasing sequences to aggregate multidimensional keywords. The first one is used to aggregate one buyer's or seller's multidimensional keywords to an aggregated number. The second one is used to create a summary number by aggregating the aggregated numbers of all sellers. As a result, the comparison between the keywords of all sellers and those of one buyer can be achieved with only one calculation. Security analysis demonstrates that EMRQ can achieve confidentiality of keywords, authentication, data integrity and query privacy. Extensive experiments show that EMRQ is more efficient compared with the scheme in [3] in terms of computation and communication overhead.
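The aggregation idea can be illustrated on plain integers, leaving out the Paillier encryption entirely. In the sketch below, each weight exceeds the largest possible weighted sum of all earlier components, so several bounded values pack into one number and can be unpacked unambiguously. The names `super_increasing`, `aggregate`, and `recover`, and the bound of 10, are hypothetical; the actual EMRQ construction may choose its sequences differently.

```python
def super_increasing(n, bound):
    """n weights, each larger than (bound - 1) times the sum of all earlier weights."""
    weights, total = [], 0
    for _ in range(n):
        w = (bound - 1) * total + 1
        weights.append(w)
        total += w
    return weights

def aggregate(values, weights):
    """Pack values (each in 0..bound-1) into a single integer."""
    return sum(v * w for v, w in zip(values, weights))

def recover(number, weights):
    """Unpack the values again, peeling off the largest weight first."""
    values = []
    for w in reversed(weights):
        values.append(number // w)
        number %= w
    return values[::-1]

weights = super_increasing(3, bound=10)
packed = aggregate([3, 7, 2], weights)
unpacked = recover(packed, weights)
```

Since packing is a weighted sum, adding two packed numbers adds the underlying values component by component as long as each component sum stays below the bound, which is the property an additively homomorphic cryptosystem can exploit on ciphertexts.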

Hybrid Lower-Dimensional Transformation for Similar Sequence Matching (유사 시퀀스 매칭을 위한 하이브리드 저차원 변환)

  • Moon, Yang-Sae;Kim, Jin-Ho
    • The KIPS Transactions:PartD / v.15D no.1 / pp.31-40 / 2008
  • We generally use lower-dimensional transformations to convert high-dimensional sequences into low-dimensional points in similar sequence matching. These traditional transformations, however, show different indexing performance depending on the type of time-series data. This means that the choice of lower-dimensional transformation has a significant influence on the indexing performance in similar sequence matching. To solve this problem, in this paper we propose a hybrid approach that integrates multiple transformations and uses them in a single multidimensional index. We first propose a new notion of hybrid lower-dimensional transformation that exploits different lower-dimensional transformations for a sequence. We next define the hybrid distance to compute the distance between the transformed sequences. We then formally prove that the hybrid approach performs similar sequence matching correctly. We also present the index building and similar sequence matching algorithms that use the hybrid approach. Experimental results for various time-series data sets show that our hybrid approach outperforms the single transformation-based approach. These results indicate that the hybrid approach can be widely used for various time-series data with different characteristics.
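To make "lower-dimensional transformation" concrete, the sketch below uses piecewise aggregate approximation (PAA), a well-known transformation chosen here purely as an example; the abstract does not say which transformations the hybrid approach combines. PAA replaces each frame of a sequence by its mean, and a scaled distance between the reduced points lower-bounds the Euclidean distance between the original sequences. Function names and data are hypothetical, and the sequence length is assumed to be divisible by the number of frames.

```python
import math

def paa(seq, k):
    """Piecewise aggregate approximation: mean of each of k equal-length frames."""
    frame = len(seq) // k  # assumes len(seq) is a multiple of k
    return [sum(seq[i * frame:(i + 1) * frame]) / frame for i in range(k)]

def paa_distance(pa, pb, n):
    """Distance between PAA points, scaled so it lower-bounds the original distance."""
    k = len(pa)
    return math.sqrt(n / k) * math.sqrt(sum((x - y) ** 2 for x, y in zip(pa, pb)))

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Hypothetical sequences of length 8 reduced to 4-dimensional points.
data = [1.0, 2.0, 4.0, 3.0, 2.0, 1.0, 0.0, 1.0]
query = [1.0, 3.0, 4.0, 2.0, 2.0, 0.0, 0.0, 1.0]

reduced_dist = paa_distance(paa(data, 4), paa(query, 4), len(data))
true_dist = euclidean(data, query)
```

The lower-bounding property is what lets an index over the reduced points answer similarity queries correctly: candidates are filtered by `reduced_dist` and only the survivors are checked against `true_dist`.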

A Study on Comparing algorithms for Boxing Motion Recognition (권투 모션 인식을 위한 알고리즘 비교 연구)

  • Han, Chang-Ho;Kim, Soon-Chul;Oh, Choon-Suk;Ryu, Young-Kee
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.8 no.6 / pp.111-117 / 2008
  • In this paper, we describe boxing motion recognition, which is used in games and animation. To recognize the boxing motion, we use two algorithms: principal component analysis (PCA) and dynamic time warping (DTW). PCA is the simplest of the true eigenvector-based multivariate analyses and is often used to reduce multidimensional data sets to lower dimensions for analysis. DTW is an algorithm for measuring similarity between two sequences that may vary in time or speed. We introduce and compare the PCA and DTW algorithms. We implemented boxing motion recognition on the motion capture system developed in our research, and we also describe that system. The motion graph is created from boxing motion data acquired from the motion capture system and is normalized in the process. The approach was implemented in the motion recognition system with five actors, and the recognition performance is reported.
