• Title/Summary/Keyword: Sequence Matching


Effectiveness Evaluations of Subsequence Matching Methods Using KOSPI Data (한국 주식 데이터를 이용한 서브시퀀스 매칭 방법의 효과성 평가)

  • Yoo Seung Keun;Lee Sang Ho
    • The KIPS Transactions:PartD / v.12D no.3 s.99 / pp.355-364 / 2005
  • Previous research on subsequence matching has focused on building indexes to speed up matching and has not addressed the effectiveness of subsequence matching methods. This paper considers that effectiveness and proposes two metrics for evaluating subsequence matching algorithms. We applied the proposed metrics to Korean stock data and five known matching algorithms. The analysis of the empirical data shows that two methods (the method supporting normalization, and the method supporting scaling and shifting) outperform the others in terms of matching effectiveness.
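The abstract above does not define its normalization, so the following is only a minimal sketch of why a normalizing matcher can find shapes that raw distance misses. It assumes z-normalization (zero mean, unit standard deviation) per window and a plain sliding-window scan; the function names, the toy data, and the threshold `epsilon` are hypothetical and are not taken from the paper.

```python
import math

def z_normalize(seq):
    """Rescale a sequence to zero mean and unit standard deviation."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        return [0.0] * n  # a flat window carries no shape information
    return [(x - mean) / std for x in seq]

def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def normalized_matches(data, query, epsilon):
    """Return (offset, distance) for each window of `data` whose
    normalized shape lies within `epsilon` of the normalized query."""
    q = z_normalize(query)
    m = len(query)
    hits = []
    for start in range(len(data) - m + 1):
        window = z_normalize(data[start:start + m])
        d = euclidean(window, q)
        if d <= epsilon:
            hits.append((start, d))
    return hits

# Toy data (assumed): the window at offset 3 equals 10 * query + 2,
# so it has the same shape as the query at a different scale and level.
query = [1, 2, 1, 3]
data = [5, 5, 5, 12, 22, 12, 32, 7]
hits = normalized_matches(data, query, epsilon=1e-9)
print(hits)
```

In this toy series the only hit is the scaled and shifted copy of the query, which a raw Euclidean comparison would reject because the values themselves are far apart.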

An Analysis System for Whole Genomic Sequence Using String B-Tree (스트링 B-트리를 이용한 게놈 서열 분석 시스템)

  • Choe, Jeong-Hyeon;Jo, Hwan-Gyu
    • The KIPS Transactions:PartA / v.8A no.4 / pp.509-516 / 2001
  • As a result of many genome projects, the genomic sequences of many organisms have been revealed. Various methods, such as global alignment and local alignment, are used to analyze these sequences, and k-mer analysis is one of them. k-mer analysis explores the frequencies of all k-mers, or their symmetry, where a k-mer is a sequence of bases of length k. However, existing in-memory algorithms are not applicable to k-mer analysis because a whole genomic sequence is usually a very large text, so efficient data structures and algorithms are needed. The string B-tree is a data structure that supports external memory and is well suited to pattern matching. In this paper, we improve the string B-tree so that it can be applied efficiently to k-mer analysis, and we show the results of k-mer analysis for C. elegans and 30 other genomic sequences. We present a visualization system that lets users investigate the distribution and symmetry of the frequencies of all k-mers using CGR (Chaos Game Representation). We also describe a method for finding the signature, the part of a sequence that is similar to the whole genomic sequence.

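To make the two notions in the abstract above concrete, here is a small in-memory sketch of k-mer frequency counting and of the chaos game iteration behind CGR. It is not the string B-tree approach of the paper, which targets external memory; the corner assignment for A, C, G, T is one common convention and is assumed here, as are the function names.

```python
from collections import Counter

def kmer_counts(sequence, k):
    """Count every overlapping k-mer in a DNA string."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))

# Assumed corner layout of the unit square; other layouts are also in use.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}

def cgr_points(sequence):
    """Chaos game: start at the center and, for each base, move halfway
    from the current point toward that base's corner."""
    x, y = 0.5, 0.5
    points = []
    for base in sequence:
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points

counts = kmer_counts("ACGTACGT", 2)
points = cgr_points("ACGT")
print(counts.most_common())
print(points)
```

Each point lands in the sub-square determined by the most recent bases, so plotting the points of a long sequence gives a picture of how often each k-mer occurs.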

MBR-Safe Transform: Lower-Dimensional Transformation of High-Dimensional MBRs in Similar Sequence Matching (MBR-Safe 변환 : 유사 시퀀스 매칭에서 고차원 MBR의 저차원 변환)

  • Moon, Yang-Sae
    • Journal of KIISE:Databases / v.33 no.7 / pp.693-707 / 2006
  • To improve performance using a multidimensional index in similar sequence matching, we transform a high-dimensional sequence to a low-dimensional sequence, and then construct a low-dimensional MBR that contains multiple transformed sequences. In this paper we propose a formal method that transforms a high-dimensional MBR itself to a low-dimensional MBR, and show that this method significantly reduces the number of lower-dimensional transformations. To achieve this goal, we first formally define the new notion of MBR-safe. We say that a transform is MBR-safe if the low-dimensional MBR to which a high-dimensional MBR is transformed contains every low-dimensional sequence obtained by transforming a high-dimensional sequence inside the original MBR. We then propose two MBR-safe transforms based on DFT and DCT, the most representative lower-dimensional transformations. For this, we prove that the traditional DFT and DCT are not MBR-safe, and define new transforms, called mbrDFT and mbrDCT, by extending DFT and DCT, respectively. We also formally prove that mbrDFT and mbrDCT are MBR-safe. Moreover, we show that mbrDFT (or mbrDCT) is optimal among the DFT-based (or DCT-based) MBR-safe transforms that directly convert a high-dimensional MBR itself into a low-dimensional MBR. Analytical and experimental results show that the proposed mbrDFT and mbrDCT reduce the number of lower-dimensional transformations drastically, and improve performance significantly compared with the naïve transforms. These results indicate that our MBR-safe transforms provide a useful framework for a variety of applications that require the lower-dimensional transformation of high-dimensional MBRs.
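The MBR-safe property defined above can be illustrated with a small sketch. It uses an unnormalized DCT-II and bounds each coefficient over the box by interval arithmetic, which satisfies the containment condition as stated in the abstract; whether this coincides with the paper's mbrDCT is not confirmed by the abstract, and all function names here are hypothetical. The example also shows a case where simply transforming the two corner points of the box fails to contain a transformed sequence.

```python
import math

def dct_row(k, n):
    """Weights of the k-th unnormalized DCT-II basis vector of length n."""
    return [math.cos(math.pi / n * (i + 0.5) * k) for i in range(n)]

def dct(seq, f):
    """First f unnormalized DCT-II coefficients of seq."""
    n = len(seq)
    return [sum(c * x for c, x in zip(dct_row(k, n), seq)) for k in range(f)]

def safe_dct_mbr(low, high, f):
    """Bound each of the first f coefficients over the box [low, high].
    A non-negative weight is minimized at low[i], a negative one at high[i]."""
    n = len(low)
    new_low, new_high = [], []
    for k in range(f):
        row = dct_row(k, n)
        new_low.append(sum(c * (lo if c >= 0 else hi)
                           for c, lo, hi in zip(row, low, high)))
        new_high.append(sum(c * (hi if c >= 0 else lo)
                            for c, lo, hi in zip(row, low, high)))
    return new_low, new_high

low, high = [0.0, 0.0], [1.0, 1.0]   # a 2-dimensional MBR
x = [1.0, 0.0]                       # a sequence inside that MBR
coeffs = dct(x, 2)

# Transforming only the two corners of the box loses containment.
naive_low, naive_high = dct(low, 2), dct(high, 2)
naive_ok = all(l <= c <= h for l, c, h in zip(naive_low, coeffs, naive_high))

# The interval-arithmetic bound keeps every transformed sequence inside.
safe_low, safe_high = safe_dct_mbr(low, high, 2)
safe_ok = all(l <= c <= h for l, c, h in zip(safe_low, coeffs, safe_high))
print(naive_ok, safe_ok)
```

The failure of the corner-based box comes from the negative basis weights: the coefficient is largest at a mixed corner of the box, not at `high`.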

Conceptual Pattern Matching of Time Series Data using Hidden Markov Model (은닉 마코프 모델을 이용한 시계열 데이터의 의미기반 패턴 매칭)

  • Cho, Young-Hee;Jeon, Jin-Ho;Lee, Gye-Sung
    • The Journal of the Korea Contents Association / v.8 no.5 / pp.44-51 / 2008
  • Pattern matching and pattern searching in time series data have been active topics in a number of disciplines. This paper suggests a novel pattern matching technique that can be used in stock market analysis as well as in forecasting stock market trends. First, we define conceptual patterns and extract the data forming each pattern from a given time series, and then we generate a learning model using a Hidden Markov Model. The results show that context-based pattern matching makes the matching more accountable and that the method could be used effectively in real-world applications. This is because the pattern for a new data sequence carries not only the matching itself but also the context that the data implies.

A Block Matching using the Motion Information of Previous Frame and the Predictor Candidate Point on each Search Region (이전 프레임의 움직임 정보와 탐색 구간별 예측 후보점을 이용하는 블록 정합)

  • 곽성근;위영철;김하진
    • Journal of KIISE:Computing Practices and Letters / v.10 no.3 / pp.273-281 / 2004
  • In a video sequence, the motion vector of the current block is temporally correlated with the motion vector of the previous block. In this paper, we propose a prediction search algorithm for block matching that uses this temporal correlation and the center-biased property of motion vectors. The proposed algorithm determines a better starting point for the search of an exact motion vector by taking the point with the smallest SAD (sum of absolute differences) value among the motion vector predicted from the same block of the previous frame and the predictor candidate point in each search region. Simulation results show that PSNR (peak signal-to-noise ratio) values improve by up to 1.06 dB, depending on the video sequence, and by about 0.19 to 0.46 dB on average, excluding the full search (FS) algorithm.
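The starting-point selection described above can be sketched as follows. This is an illustration under assumptions, not the authors' algorithm: frames are plain lists of rows, the candidate list stands in for the predicted motion vector plus one predictor point per search region, and the helper names are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in `cur`
    and the block displaced by (dx, dy) in `ref`."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
    return total

def best_start(cur, ref, bx, by, size, candidates):
    """Return the candidate motion vector with the smallest SAD,
    ignoring candidates that would leave the reference frame."""
    h, w = len(ref), len(ref[0])
    valid = [(dx, dy) for dx, dy in candidates
             if 0 <= bx + dx and bx + dx + size <= w
             and 0 <= by + dy and by + dy + size <= h]
    return min(valid, key=lambda mv: sad(cur, ref, bx, by, mv[0], mv[1], size))

# Toy frames (assumed): `cur` is `ref` displaced by (dx, dy) = (2, 1).
ref = [[10 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[min(y + 1, 7)][min(x + 2, 7)] for x in range(8)] for y in range(8)]

# Hypothetical candidates: zero vector, a neighbor, the vector of the
# co-located block in the previous frame, and one further predictor point.
candidates = [(0, 0), (1, 0), (2, 1), (-1, -1)]
start = best_start(cur, ref, bx=2, by=2, size=2, candidates=candidates)
print(start)
```

A local refinement search would then begin from `start` rather than from the zero vector, which is where the saving over a full search comes from.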

Analysis of Gendered Job Sequence through Optimal Matching (최적일치법을 이용한 남녀간 직업 배열의 분석)

  • Han, Joon
    • Journal of Labour Economics / v.24 no.1 / pp.149-176 / 2001
  • This paper analyzes job sequences of men and women using optimal matching in order to find patterns of intra-generational mobility in Korean society. Men and women differ in their job careers: men show long-lasting job sequences with few gaps, while women either have short careers or have interrupted careers with long gaps. Long gaps in men's careers are limited to cases in which men move from agricultural jobs to other jobs. The results from a combination of optimal matching and cluster analysis show that men's job sequences cluster around major occupations, while women's cluster in terms of sequence length. The interrupted careers characteristic of women are considered to result from the burdens of housekeeping and child raising, together with discrimination against women pursuing careers.

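Optimal matching, as used in the abstract above, measures how costly it is to turn one sequence of states into another by insertions, deletions, and substitutions. The sketch below is a generic dynamic-programming version with constant costs; the cost values, the state codes ("M" for a manufacturing job, "-" for a gap), and the function name are assumptions for illustration and do not come from the paper.

```python
def optimal_matching_distance(a, b, indel=1.0, sub=2.0):
    """Minimum total cost of insertions, deletions, and substitutions
    needed to turn state sequence `a` into state sequence `b`."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0.0] * cols for _ in range(rows)]
    for i in range(1, rows):
        d[i][0] = i * indel          # delete everything in a
    for j in range(1, cols):
        d[0][j] = j * indel          # insert everything in b
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0.0 if a[i - 1] == b[j - 1] else sub
            d[i][j] = min(d[i - 1][j] + indel,      # deletion
                          d[i][j - 1] + indel,      # insertion
                          d[i - 1][j - 1] + cost)   # match or substitution
    return d[-1][-1]

continuous = ["M", "M", "M", "M"]    # long career without gaps
interrupted = ["M", "-", "-", "M"]   # career with a two-period gap
short = ["M", "M"]                   # short career

print(optimal_matching_distance(continuous, interrupted))
print(optimal_matching_distance(continuous, short))
```

The pairwise distances computed this way form the matrix that a cluster analysis would then group, which is how sequence length and gaps can end up driving the clusters.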

Gene Sequences Clustering for the Prediction of Functional Domain (기능 도메인 예측을 위한 유전자 서열 클러스터링)

  • Han Sang-Il;Lee Sung-Gun;Hou Bo-Kyeng;Byun Yoon-Sup;Hwang Kyu-Suk
    • Journal of Institute of Control, Robotics and Systems / v.12 no.10 / pp.1044-1049 / 2006
  • Multiple sequence alignment is a method for comparing two or more DNA or protein sequences. Most multiple sequence alignment tools rely on pairwise alignment and the Smith-Waterman algorithm to generate an alignment hierarchy. Therefore, in existing multiple alignment methods, the runtime increases exponentially as the number of sequences increases. To remedy this problem, we adopted a parallel-processing suffix tree algorithm that can search for common subsequences at one time without pairwise alignment. Because cross-matching subsequences, which trigger inexact matching, may be produced among the common subsequences found, this paper also proposes a cross-matching masking process. To identify the function of the clusters generated by suffix tree clustering, BLAST and CDD (Conserved Domain Database) searches were combined with a clustering tool. Our clustering and annotating tool consists of constructing the suffix tree, overlapping common subsequences, clustering gene sequences, and annotating gene clusters by BLAST and CDD search. The system was successfully evaluated with 36 gene sequences in the pentose phosphate pathway, producing 10 clusters, finding representative common subsequences, and finally identifying functional domains by searching the CDD.

Recovering the Elevation Map by Stereo Modeling of the Aerial Image Sequence (연속 항공영상의 스테레오 모델링에 의한 지형 복원)

  • 강민석;김준식;박래홍;이쾌희
    • Journal of the Korean Institute of Telematics and Electronics B / v.30B no.9 / pp.64-75 / 1993
  • This paper proposes a technique for recovering the elevation map by stereo modeling of an aerial image sequence that is transformed based on the aircraft situation. The area-based stereo matching method is simulated, and the various parameters are chosen experimentally. In the depth extraction step, the depth is determined by solving a vector equation that is suitable for stereo modeling of aerial images that do not satisfy the epipolar constraint. The performance of the conventional feature-based matching scheme is also compared. Finally, techniques for analyzing the accuracy of the recovered elevation map (REM) are described. The analysis includes error estimation for both height and contour lines, where accuracy is measured as the deviation from estimates obtained manually. The experimental results show the efficiency of the proposed technique.


Moving Object Edge Extraction from Sequence Image Based on the Structured Edge Matching (구조화된 에지정합을 통한 영상 열에서의 이동물체 에지검출)

  • 안기옥;채옥삼
    • Proceedings of the IEEK Conference / 2003.11a / pp.425-428 / 2003
  • Recently, intrusion detection systems (IDS) using a video camera have become an important part of home security systems, which are gaining popularity. However, video intruder detection has not been widely used in home surveillance systems because of its unreliable performance in environments with abrupt illumination changes. In this paper, we propose an effective algorithm for extracting moving edges from an image sequence. The proposed algorithm extracts edge segments from the current image and eliminates the background edge segments by matching them against a reference edge list, which is updated at every frame, to find the moving edge segments. The test results show that it can detect the contour of a moving object in a noisy environment with abrupt illumination changes.


Ground Plane Detection Using Homography Matrix (호모그래피행렬을 이용한 노면검출)

  • Lee, Ki-Yong;Lee, Joon-Woong
    • Journal of Institute of Control, Robotics and Systems / v.17 no.10 / pp.983-988 / 2011
  • This paper presents a robust method for ground plane detection in vision-based applications that use a monocular image sequence from a non-stationary camera. The proposed method, which is based on reliable estimation of the homography between two frames taken from the sequence, aims at a practical system for detecting the road surface in traffic scenes. The homography is computed using a feature matching approach, which often gives rise to inaccurate matches or undesirable matches from outside the ground plane. Hence, the proposed homography estimation minimizes the effects of erroneous feature matching by evaluating the difference between the predicted and the observed matrices. The method is successfully demonstrated on road surface detection in experiments that fill the information-void areas produced when a geometric transformation is applied to images captured by an in-vehicle camera system.