• Title/Summary/Keyword: Subsequence

Time-Series Data Prediction using Hidden Markov Model and Similarity Search for CRM (CRM을 위한 은닉 마코프 모델과 유사도 검색을 사용한 시계열 데이터 예측)

  • Cho, Young-Hee;Jeon, Jin-Ho;Lee, Gye-Sung
    • Journal of the Korea Society of Computer and Information / v.14 no.5 / pp.19-28 / 2009
  • Predicting time-series data is a long-standing research problem, and many methods have been proposed in the literature. This paper proposes a method that uses a Hidden Markov Model (HMM) and likelihood to measure similarity among time-series data and to determine the future direction of data movement. The query sequence is modeled as an HMM, and the model is then evaluated over the pre-recorded time series to find the subsequence most similar to it, with similarity measured by likelihood. Once the best subsequence is chosen, the portion that follows it is used to predict the next phase of the data movement. Experiments with different parameters were conducted to confirm the validity of the method, using KOSPI data for verification.
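The search-and-predict loop described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the series has already been discretized into symbols (here 0 = down, 1 = up), that the HMM parameters `start`, `trans`, and `emit` are given rather than trained from the query sequence, and that the function names `log_likelihood` and `predict_next` are hypothetical.

```python
import math


def log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for a discrete HMM."""
    states = range(len(start))
    alpha = [start[i] * emit[i][obs[0]] for i in states]
    log_p = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate one step, then weight by the emission probability.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in states) * emit[j][obs[t]]
                for j in states
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this window
        log_p += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_p


def predict_next(history, window, horizon, start, trans, emit):
    """Find the most likely window and return (offset, symbols that follow it)."""
    best_start, best_score = None, float("-inf")
    # Leave room for `horizon` symbols after every candidate window.
    for s in range(len(history) - window - horizon + 1):
        score = log_likelihood(history[s:s + window], start, trans, emit)
        if score > best_score:
            best_start, best_score = s, score
    if best_start is None:
        return None, []
    end = best_start + window
    return best_start, history[end:end + horizon]


# Hypothetical two-state model that favours the alternating pattern 1, 0, 1, 0.
start = [0.9, 0.1]
trans = [[0.1, 0.9], [0.9, 0.1]]
emit = [[0.1, 0.9], [0.9, 0.1]]  # emit[state][symbol]
history = [0, 0, 1, 1, 1, 0, 1, 0, 1, 1, 0, 0]
best_start, forecast = predict_next(history, 4, 1, start, trans, emit)
```

In a fuller version the model would be fitted to the query sequence (for example with Baum-Welch) before the scan; that step is omitted here.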

Extracting Subsequence of Boolean Variables using SAT-solver (만족가능성 처리기를 이용한 이진 변수 서브시퀀스 추출)

  • Park, Sa-Choun;Kwon, Gi-Hwon
    • The KIPS Transactions:PartD / v.15D no.6 / pp.777-784 / 2008
  • In model checking, SAT-solver-based methods have recently been the main line of research for overcoming the state explosion problem. To use a SAT solver, the system to be verified is translated into CNF, and the Boolean cardinality constraint (BCC) is widely used in this translation. BCC deals with a set of Boolean variables, but no method exists for translating a sequence among Boolean variables. In this paper, we propose methods for translating the problem of extracting a subsequence of length k from a sequence of Boolean variables into CNF formulas. Experimental results show that our method is more efficient than using BCC alone.

Instance-Level Subsequence Matching Method based on a Virtual Window (가상 윈도우 기반 인스턴스 레벨 서브시퀀스 매칭 방안)

  • Ihm, Sun-Young;Park, Young-Ho
    • KIPS Transactions on Computer and Communication Systems / v.3 no.2 / pp.43-46 / 2014
  • Time-series data is a collection of real numbers observed over time intervals. One of the main tasks on time-series data is to efficiently find subsequences similar to a given query sequence. In this paper, we propose an efficient subsequence matching method called Instance-Match (I-Match). I-Match constructs a virtual window in order to reduce false alarms. Experiments with a real data set and query sets show that I-Match improves query processing time by up to 2.95 times and significantly reduces the number of candidates compared to Dual Match.

A Visualization Tool for Ranked Subsequence Matching in Time-Series Databases (시계열 데이터베이스에서 순위를 지원하는 서브시퀀스 매칭 방법을 위한 시각화 툴)

  • Lee, Sung-Jin;Lee, Jinsoo;Cho, Hune;Han, Wook-Shin
    • Proceedings of the Korea Information Processing Society Conference / 2009.11a / pp.787-788 / 2009
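  • Time-series data is a sequence of real values obtained by sampling continuous data at fixed time intervals. Examples include music and video data, electrocardiogram data, and stock charts. Time-series data is further classified into data sequences, which are stored in the database, and query sequences, which are given by the user. Ranked subsequence matching in a time-series database is the problem of finding, given a data sequence and a query sequence, the top k subsequences most similar to the query sequence among the subsequences of the data sequence whose length equals that of the query sequence. The goal of this paper is to develop a visualization tool that improves usability, so that users can use an existing console-based matching program more easily even if they have little awareness or understanding of the matching method. Specifically, we implemented a user interface that provides five visualization functions. The implemented user interface helps users use the existing matching program more easily and conveniently.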
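The definition of ranked subsequence matching given in this abstract can be illustrated with a brute-force sketch. It assumes Euclidean distance as the similarity measure (the abstract does not name one) and scans every offset instead of using an index; the function name `top_k_subsequences` is hypothetical.

```python
import heapq
import math


def top_k_subsequences(data, query, k):
    """Return the k (distance, offset) pairs closest to the query, nearest first."""
    m = len(query)
    # Every subsequence of the data sequence with the same length as the query.
    candidates = (
        (math.dist(data[s:s + m], query), s)
        for s in range(len(data) - m + 1)
    )
    return heapq.nsmallest(k, candidates)


matches = top_k_subsequences([1, 2, 3, 4, 5, 2, 3, 4], [2, 3, 4], 2)
```

Ties in distance are broken by the smaller offset, because the pairs are compared as tuples.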

A Single Index Approach for Time-Series Subsequence Matching that Supports Moving Average Transform of Arbitrary Order (단일 색인을 사용한 임의 계수의 이동평균 변환 지원 시계열 서브시퀀스 매칭)

  • Moon Yang-Sae;Kim Jinho
    • Journal of KIISE:Databases / v.33 no.1 / pp.42-55 / 2006
  • We propose a single-index approach for subsequence matching that supports moving average transform of arbitrary order in time-series databases. Using the single-index approach, we can reduce both storage space overhead and index maintenance overhead. Moving average transform is known to reduce the effect of noise and has been used in many areas such as econometrics since it is useful in finding overall trends. However, previous methods incur index overhead in both storage space and update maintenance, since they build several indexes to support arbitrary orders. In this paper, we first propose the concept of poly-order moving average transform, which extends the original definition of moving average transform by using a set of order values rather than one order value. That is, the poly-order transform makes a set of transformed windows from each original window, since it transforms each window not for just one order value but for a set of order values. We then present theorems that formally prove the correctness of the poly-order transform based subsequence matching methods. Moreover, we propose two different subsequence matching methods supporting moving average transform of arbitrary order by applying the poly-order transform to the previous subsequence matching methods. Experimental results show that, in all cases, the proposed methods improve performance significantly over the sequential scan. For real stock data, the proposed methods improve average performance by 22.4 to 33.8 times over the sequential scan. Compared with building a separate index for every moving average order, the proposed methods significantly reduce the storage space required for indexes at the cost of only a little performance degradation (with 7 orders, the methods reduce the space by up to 1/7.0 while the performance degradation is only 9% to 42% on average). In addition to their superiority in performance, index space, and index maintenance, the proposed methods have the advantage of generalizing to many other kinds of transforms besides the moving average transform. Therefore, we believe that our work can be widely and practically used in many kinds of transform-based subsequence matching methods.
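For orientation, the sequential-scan baseline that the abstract compares against might look like the sketch below. It assumes the usual definition of an order-k moving average (the mean of each run of k consecutive values) and Euclidean distance with a tolerance `epsilon`; the poly-order transform and the index itself are not reproduced, and the function names are hypothetical.

```python
import math


def moving_average(seq, order):
    """Order-`order` moving average: mean of each run of `order` consecutive values."""
    if order < 1 or order > len(seq):
        raise ValueError("order must be between 1 and len(seq)")
    window_sum = sum(seq[:order])
    out = [window_sum / order]
    for i in range(order, len(seq)):
        # Slide the window: add the new value, drop the oldest one.
        window_sum += seq[i] - seq[i - order]
        out.append(window_sum / order)
    return out


def scan_match(data, query, order, epsilon):
    """Offsets whose transformed subsequence lies within epsilon of the transformed query."""
    target = moving_average(query, order)
    m = len(query)
    hits = []
    for s in range(len(data) - m + 1):
        candidate = moving_average(data[s:s + m], order)
        if math.dist(candidate, target) <= epsilon:
            hits.append(s)
    return hits


# After order-2 smoothing, the zigzag query matches windows that are out of phase with it.
hits = scan_match([1, 3, 1, 3, 10, 1, 3, 1, 3], [3, 1, 3, 1], order=2, epsilon=0.0)
```

The example shows why the transform is useful for trend matching: the raw windows `[1, 3, 1, 3]` differ from the query at every position, yet their smoothed forms are identical.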

Implementation of Engine Generating Mutation Worm Signature Using LCSeq (LCSeq를 이용한 변형 웜 시그니쳐 생성 엔진 구현)

  • Ko, Joon-Sang;Lee, Jae-Kwang;Kim, Bong-Han
    • The Journal of the Korea Contents Association / v.7 no.11 / pp.94-101 / 2007
  • We introduce a way to detect mutated worms. We implemented a program that generates signatures using the LCSeq (Longest Common Subsequence) technique on a suffix tree, which has been studied as a pattern recognition algorithm. We also show the process of detecting mutations of the CodeRed and Nimda worms, and evaluate the signatures generated by Snort and by LCSeq.
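The idea of deriving a signature from what two worm variants have in common can be sketched with a longest common subsequence over bytes. Note that this sketch swaps in the textbook dynamic-programming LCS rather than the suffix-tree approach the abstract mentions, and the two payloads are made-up strings, not real worm traffic.

```python
def longest_common_subsequence(a, b):
    """Return one longest common subsequence of two byte strings."""
    n, m = len(a), len(b)
    # dp[i][j] = LCS length of the suffixes a[i:] and b[j:].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk forward through the table to recover the bytes themselves.
    out = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return bytes(out)


# Hypothetical payloads of two variants that differ only in their padding bytes.
variant_a = b"GET /index?pad=AAAA&cmd=run"
variant_b = b"GET /index?pad=BBBB&cmd=run"
signature = longest_common_subsequence(variant_a, variant_b)
```

The table costs O(n·m) time and memory, which is one reason an implementation aimed at long payloads might prefer a different data structure.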

A Comparative Analysis of GeneralMatch and DualGMatch in Time-Series Subsequence Matching (시계열 서브시퀀스 매칭에서 GeneralMatch와 DualGmatch의 비교 분석)

  • Lee, Sanghun;Moon, Yang-Sae
    • Proceedings of the Korea Information Processing Society Conference / 2015.04a / pp.751-754 / 2015
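  • Subsequence matching has recently been studied actively in various application areas based on time-series databases. FRM and DualMatch were the first solutions proposed for effective subsequence matching. GeneralMatch was later proposed as a generalization of them, and DualGMatch was recently proposed as a dual approach to GeneralMatch. In this paper, we present a comparative analysis of GeneralMatch and DualGMatch. To this end, we first evaluate GeneralMatch and DualGMatch from the viewpoint of window construction. Next, we theoretically compare the two solutions in terms of the maximum window size effect and index storage efficiency. Finally, we compare the number of index page accesses of GeneralMatch and DualGMatch using real time-series data. The analysis shows that GeneralMatch is superior to DualGMatch in terms of the window size effect and index storage efficiency.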

EQUIVALENT NORMS IN A BANACH FUNCTION SPACE AND THE SUBSEQUENCE PROPERTY

  • Calabuig, Jose M.;Fernandez-Unzueta, Maite;Galaz-Fontes, Fernando;Sanchez-Perez, Enrique A.
    • Journal of the Korean Mathematical Society / v.56 no.5 / pp.1387-1401 / 2019
  • Consider a finite measure space $(\Omega, \Sigma, \mu)$ and a Banach space $X(\mu)$ consisting of (equivalence classes of) real measurable functions defined on $\Omega$ such that $f\chi_A \in X(\mu)$ and $\|f\chi_A\| \leq \|f\|$ for all $f \in X(\mu)$, $A \in \Sigma$. We prove that if it satisfies the subsequence property, then it is an ideal of measurable functions and has an equivalent norm under which it is a Banach function space. As an application we characterize norms that are equivalent to a Banach function space norm.

Sequence-based Similar Music Retrieval Scheme (시퀀스 기반의 유사 음악 검색 기법)

  • Jun, Sang-Hoon;Hwang, Een-Jun
    • Journal of IKEEE / v.13 no.2 / pp.167-174 / 2009
  • Music evokes human emotions or creates musical moods through various low-level musical features. A typical music clip consists of one or more moods, and this can be used as an important criterion for determining the similarity between music clips. In this paper, we propose a new music retrieval scheme based on the mood change patterns of music clips. For this, we first divide music clips into segments based on low-level musical features. Then, we apply the K-means clustering algorithm to group them into clusters with similar features. By assigning a unique mood symbol to each cluster, we can represent each music clip as a sequence of mood symbols. Finally, to estimate the similarity of music clips, we measure the similarity of their mood sequences using the Longest Common Subsequence (LCS) algorithm. To evaluate the performance of our scheme, we carried out various experiments and collected user evaluations. We report some of the results.

A Dynamic Hand Gesture Recognition System Incorporating Orientation-based Linear Extrapolation Predictor and Velocity-assisted Longest Common Subsequence Algorithm

  • Yuan, Min;Yao, Heng;Qin, Chuan;Tian, Ying
    • KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.9 / pp.4491-4509 / 2017
  • The present paper proposes a novel dynamic system for hand gesture recognition. The approach comprises three main steps: detection, tracking, and recognition. First, the gesture contour captured by a 2D camera is detected by combining the three-frame difference method and a skin-color elliptic boundary model. Then, the trajectory of the hand gesture is extracted via a gesture-tracking algorithm based on an occlusion-direction oriented linear extrapolation predictor, where the gesture coordinate in the next frame is predicted by judging the current occlusion direction. Finally, to overcome the interference of insignificant trajectory segments, the longest common subsequence (LCS) is employed with the aid of velocity information. In addition, to tackle the subgesture problem, i.e., that some gestures may also be a part of others, the most probable gesture category is identified by comparing the relative LCS length of each gesture, i.e., the ratio of the LCS length to the total length of each template, rather than the raw LCS length for each gesture. The gesture dataset for the system performance test contains digits ranging from 0 to 9, and experimental results demonstrate the robustness and effectiveness of the proposed approach.
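The relative LCS length used against the subgesture problem can be illustrated as follows. This is only a sketch of that one scoring rule: it assumes trajectories have already been encoded as strings of direction codes (a hypothetical encoding, with `R` for right and `D` for down), it leaves out the velocity information the abstract mentions, and the templates are invented for the example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence, keeping only one table row."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def classify(trajectory, templates):
    """Pick the template with the largest ratio of LCS length to template length."""
    return max(
        templates,
        key=lambda name: lcs_length(trajectory, templates[name]) / len(templates[name]),
    )


# Hypothetical templates: the stroke for "one" is a sub-stroke of "seven".
templates = {"seven": "RRRDDDD", "one": "DDDD"}
label = classify("DDDD", templates)
```

With the raw LCS length both templates would score 4 for the input `"DDDD"`; dividing by the template length separates them (4/7 against 4/4).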