• Title/Summary/Keyword: Sequence Matching

Quantum-based exact pattern matching algorithms for biological sequences

  • Soni, Kapil Kumar;Rasool, Akhtar
    • ETRI Journal
    • /
    • v.43 no.3
    • /
    • pp.483-510
    • /
    • 2021
  • In computational biology, desired patterns are searched for in large text databases, and an exact match is preferable. Classical benchmark algorithms solve pattern matching in $O(N)$ time, whereas quantum algorithm design is based on Grover's method, which completes the search in $O(\sqrt{N})$ time. This paper briefly explains existing quantum algorithms and defines their processing limitations. Our initial work overcomes existing algorithmic constraints by proposing the quantum-based combined exact (QBCE) algorithm for the pattern-matching problem to process exact patterns. Next, quantum random access memory (QRAM) processing is discussed, and based on it, we propose the QRAM processing-based exact (QPBE) pattern-matching algorithm. We show that to find all $t$ occurrences of a pattern, the best-case time complexities of the QBCE and QPBE algorithms are $O(\sqrt{t})$ and $O(\sqrt{N})$, respectively, and the exceptional worst case is bounded by $O(t)$ and $O(N)$. Thus, the proposed quantum algorithms achieve computational speedup. Our work is proved mathematically and validated with simulation, and complexity analysis demonstrates that our quantum algorithms are better than existing pattern-matching methods.
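
For orientation, the classical linear-time baseline that the abstract contrasts with Grover-style search can be sketched as follows. This is a minimal Knuth-Morris-Pratt sketch that reports every occurrence of a pattern; the abstract does not say which classical algorithm it benchmarks against, so the choice of KMP and the names `build_failure` and `find_all` are assumptions made for illustration. It is not the QBCE or QPBE algorithm.

```python
def build_failure(pattern):
    """fail[i] = length of the longest proper prefix of pattern[:i + 1]
    that is also a suffix of it."""
    fail = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = fail[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        fail[i] = k
    return fail


def find_all(text, pattern):
    """Return the start index of every exact (possibly overlapping) match."""
    if not pattern:
        return []
    fail = build_failure(pattern)
    hits = []
    k = 0  # number of pattern characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != pattern[k]:
            k = fail[k - 1]
        if ch == pattern[k]:
            k += 1
        if k == len(pattern):
            hits.append(i - k + 1)
            k = fail[k - 1]  # keep going so overlapping matches are found
    return hits


print(find_all("ACGTACGTAC", "ACG"))  # start positions of all matches
```

Each text character is examined a bounded number of times, so the scan is linear in the text length plus the pattern length.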

Context Prediction based on Sequence Matching for Contexts with Discrete Attribute (이산 속성 컨텍스트를 위한 시퀀스 매칭 기반 컨텍스트 예측)

  • Choi, Young-Hwan;Lee, Sang-Yong
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.21 no.4
    • /
    • pp.463-468
    • /
    • 2011
  • Context prediction methods have been developed in two ways - one is a prediction for discrete context and the other is for continuous context. As most of the prediction methods have been used with prediction algorithms in specific domains suitable to the environment and characteristics of contexts, it is difficult to conduct a prediction for a user's context which is based on various environments and characteristics. This study suggests a context prediction method available for both discrete and continuous contexts without being limited to the characteristics of a specific domain or context. For this, we conducted a context prediction based on sequence matching by generating sequences from contexts in consideration of association rules between context attributes and by applying variable weights according to each context attribute. Simulations for discrete and continuous contexts were conducted to evaluate proposed methods and the results showed that the methods produced a similar performance to existing prediction methods with a prediction accuracy of 80.12% in discrete context and 81.43% in continuous context.

Fragment Combination From DNA Sequence Data Using Fuzzy Reasoning Method (퍼지 추론기법을 이용한 DNA 염기 서열의 단편결합)

  • Kim, Kwang-Baek;Park, Hyun-Jung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.10 no.12
    • /
    • pp.2329-2334
    • /
    • 2006
  • In this paper, we propose a method that remedies failures in combining DNA fragments, a defect of conventional contig assembly programs. In the proposed method, very long DNA sequence data are divided into prototype fragments of about 700 bases, the length that an automatic sequence analyzer can analyze at one time, and a matching ratio is then calculated by comparing a standard prototype with three fragmented clones of about 700 bases generated by the PCR method. In this process, the time needed to calculate the matching ratio is reduced by the Compute Agreement algorithm. Two candidate combined fragments are extracted for every prototype according to the degree of overlap of the calculated fragment pairs, and the degree of combination is then decided using a fuzzy reasoning method that utilizes the matching ratio of each extracted fragment, the A, C, G, T membership degrees of each DNA sequence, and the previous frequencies of A, C, G, and T. DNA sequence combination is completed by iterating the process of combining the selected optimal test fragments until no fragment remains. For the experiments, fragments of about 700 bases were generated from sequences of 10,000 bases and 100,000 bases extracted from 'PCC6803', a complete protein genome. In experiments applying random notations to these fragments, the proposed method was faster than the FAP program, and combination failure, a defect of conventional contig assembly programs, did not occur.

Indoor Location Positioning System for Image Recognition based LBS (영상인식 기반의 위치기반서비스를 위한 실내위치인식 시스템)

  • Kim, Jong-Bae
    • Journal of Korea Spatial Information System Society
    • /
    • v.10 no.2
    • /
    • pp.49-62
    • /
    • 2008
  • This paper proposes an indoor location positioning system for image recognition based LBS. The proposed system is a vision-based location positioning system that implements augmented reality by overlaying the location results on the user's view. The system uses pattern matching and a location model to recognize the user's location from image sequences taken by a wearable mobile PC with a camera. It estimates the user's location by image sequence matching and marker detection, and recognizes the location by using the pre-defined location model. To detect markers in image sequences, the system applies an adaptive thresholding method, and by using the location model to recognize a location, it obtains more accurate and efficient results. Experimental results show that the proposed system has both the quality and the performance to be used as an indoor location-based service (LBS) for visitors in various environments.

A Study on Clustering and Identifying Gene Sequences using Suffix Tree Clustering Method and BLAST (서픽스트리 클러스터링 방법과 블라스트를 통합한 유전자 서열의 클러스터링과 기능검색에 관한 연구)

  • Han, Sang-Il;Lee, Sung-Gun;Kim, Kyung-Hoon;Lee, Ju-Yeong;Kim, Young-Han;Hwang, Kyu-Suk
    • Journal of Institute of Control, Robotics and Systems
    • /
    • v.11 no.10
    • /
    • pp.851-856
    • /
    • 2005
  • DNA and protein data of diverse species are discovered daily and deposited in public archives according to each established format. Database systems in the public archives provide not only an easy-to-use, flexible interface to the public, but also in silico analysis tools for unidentified sequence data. Among such in silico analysis tools, multiple sequence alignment [1] methods relying on pairwise alignment and the Smith-Waterman algorithm [2] enable us to identify unknown DNA and protein sequences or phylogenetic relations among several species. However, in the existing multiple alignment method, the runtime increases exponentially as the number of sequences increases. To remedy this problem, we adopted a parallel processing suffix tree algorithm that can search for common subsequences at one time without pairwise alignment. In addition, cross-matching subsequences that trigger inexact matching may be produced among the searched common subsequences, so a cross-matching masking process is suggested in this paper. To identify the function of the clusters generated by suffix tree clustering, BLAST was combined with a clustering tool. Our clustering and annotating tool is summarized in the following steps: (1) construction of the suffix tree; (2) masking of cross-matching pairs; (3) clustering of gene sequences; and (4) annotating gene clusters by BLAST search. The system was successfully evaluated with 22 gene sequences in the pyruvate pathway of bacteria, producing 7 clusters and finding representative common subsequences of each cluster.
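
To make "searching for common subsequences without pairwise alignment" concrete, the sketch below finds the longest substring shared by all sequences in a set. It is a brute-force stand-in, not a suffix tree: a generalized suffix tree would answer the same question far faster, and the paper's parallel construction, cross-matching masking, and BLAST annotation are not modeled. The function name `longest_common_substring` and the toy sequences are hypothetical.

```python
def longest_common_substring(seqs):
    """Return the longest string that occurs contiguously in every sequence.
    Brute force: try substrings of the shortest sequence, longest first."""
    if not seqs:
        return ""
    shortest = min(seqs, key=len)
    for length in range(len(shortest), 0, -1):
        for start in range(len(shortest) - length + 1):
            candidate = shortest[start:start + length]
            if all(candidate in s for s in seqs):
                return candidate
    return ""


genes = ["GATTACAGG", "TTTACAGC", "CCTACAGA"]
print(longest_common_substring(genes))
```

Sequences that share a long enough common substring could then be grouped into one cluster, which is the role the abstract assigns to suffix tree clustering.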

Image Mosaic from a Video Sequence using Block Matching Method (블록매칭을 이용한 비디오 시퀀스의 이미지 모자익)

  • 이지근;정성태
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.7 no.8
    • /
    • pp.1792-1801
    • /
    • 2003
  • These days, image mosaic is attracting interest in fields such as advertising, tourism, games, and medical imaging, driven by the development of internet technology and the performance of personal computers. The main problem of image mosaic is searching for corresponding points correctly in the overlapped area between images. However, previous methods require a lot of CPU time and data processing to find corresponding points, and they need repeated recording with a 360-degree revolution around objects or the background. This paper presents a new image mosaic method that generates a panorama image from a video sequence recorded by a general video camera. Our method finds the corresponding points between two successive images by using a new direction-oriented 3-step block matching method. Experimental results show that the suggested method is more efficient than methods based on existing block matching algorithms, such as the full search and K-step search algorithms.
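
The full search baseline named in the abstract can be sketched in a few lines. This is a minimal sketch assuming grayscale frames stored as lists of rows and the sum of absolute differences (SAD) as the matching cost; the abstract does not state the cost function, and the names `sad`, `full_search`, `size`, and `radius` are hypothetical. It is the exhaustive baseline, not the direction-oriented 3-step method the paper proposes.

```python
def sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between a size x size block of `cur`
    at (cx, cy) and a block of `ref` at (rx, ry)."""
    total = 0
    for dy in range(size):
        for dx in range(size):
            total += abs(ref[ry + dy][rx + dx] - cur[cy + dy][cx + dx])
    return total


def full_search(ref, cur, cx, cy, size, radius):
    """Test every displacement within +/- radius and return
    ((mx, my), cost) for the lowest-cost match in the reference frame."""
    height, width = len(ref), len(ref[0])
    best_cost, best_mv = None, (0, 0)
    for my in range(-radius, radius + 1):
        for mx in range(-radius, radius + 1):
            rx, ry = cx + mx, cy + my
            # Skip candidate blocks that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = sad(ref, cur, rx, ry, cx, cy, size)
            if best_cost is None or cost < best_cost:
                best_cost, best_mv = cost, (mx, my)
    return best_mv, best_cost


# Toy frames: `cur` is `ref` shifted by 2 pixels in x and 1 pixel in y.
ref = [[y * 8 + x for x in range(8)] for y in range(8)]
cur = [[ref[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0
        for x in range(8)] for y in range(8)]
print(full_search(ref, cur, cx=2, cy=2, size=2, radius=2))
```

Full search evaluates $(2r+1)^2$ candidates per block for a search radius $r$, which is the cost that step-based search patterns try to cut down.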

A Fast Block Matching Motion Estimation Algorithm by using the Enhanced Cross-Hexagonal Search Pattern (개선된 크로스-육각 패턴을 이용한 고속 블록 정합 움직임 추정 알고리즘)

  • Nam Hyeon-Woo
    • Journal of the Korea Society of Computer and Information
    • /
    • v.11 no.4 s.42
    • /
    • pp.77-85
    • /
    • 2006
  • There is spatial correlation in a video sequence between the motion vectors of current blocks. In this paper, we propose an enhanced fast block matching algorithm using the spatial correlation of the video sequence and the center-biased property of motion vectors. The proposed algorithm determines an exact motion vector using the motion vector predicted from the adjacent macroblocks of the current frame and the Cross-Hexagonal search pattern. From the experimental results, we can see that our proposed algorithm outperforms both the prediction search algorithm (NNS) and the fast block matching algorithm (CHS) in terms of search speed and coded video quality. Using our algorithm, we can improve the search speed by $0.1{\sim}38\%$ and also diminish the PSNR (Peak Signal-to-Noise Ratio) by at most $0.05{\sim}2.5$ dB, thereby improving the video quality.

A Memory-Efficient Two-Stage String Matching Engine Using both Content-Addressable Memory and Bit-split String Matchers for Deep Packet Inspection (CAM과 비트 분리 문자열 매처를 이용한 DPI를 위한 2단의 문자열 매칭 엔진의 개발)

  • Kim, HyunJin;Choi, Kang-Il
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.39B no.7
    • /
    • pp.433-439
    • /
    • 2014
  • This paper proposes an architecture for a two-stage string matching engine with content-addressable memory (CAM) and parallel bit-split string matchers for deep packet inspection (DPI). Each long signature is divided into subpatterns of the same length, and the subpatterns are mapped onto the CAM in the first stage. The long pattern is matched in the second stage using the sequence of matching indexes from the CAM. By adopting CAM and bit-split string matchers, the memory requirements can be greatly reduced in heterogeneous string matching environments.
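
The two-stage flow described above can be mimicked in software. This is a rough sketch under stated assumptions: a Python dictionary stands in for the CAM, every signature length is a multiple of the subpattern length, and the second stage is a direct comparison of index sequences rather than the paper's bit-split string matchers. The names `build_cam`, `two_stage_match`, and `sub_len` are hypothetical.

```python
def build_cam(signatures, sub_len):
    """Split each signature into fixed-length subpatterns and assign each
    distinct subpattern an index (the dict plays the role of the CAM)."""
    cam = {}
    index_seqs = {}
    for sig in signatures:
        if len(sig) % sub_len != 0:
            raise ValueError("this sketch needs len(signature) % sub_len == 0")
        seq = []
        for i in range(0, len(sig), sub_len):
            sub = sig[i:i + sub_len]
            seq.append(cam.setdefault(sub, len(cam)))
        index_seqs[sig] = tuple(seq)
    return cam, index_seqs


def two_stage_match(text, signatures, sub_len):
    """Return sorted (start, signature) pairs for every exact match."""
    cam, index_seqs = build_cam(signatures, sub_len)
    # Stage 1: look up the subpattern starting at every text position.
    hits = [cam.get(text[i:i + sub_len])
            for i in range(len(text) - sub_len + 1)]
    # Stage 2: a signature matches where its index sequence appears
    # at positions spaced sub_len apart.
    found = []
    for sig, seq in index_seqs.items():
        for start in range(len(text) - len(sig) + 1):
            if all(hits[start + k * sub_len] == idx
                   for k, idx in enumerate(seq)):
                found.append((start, sig))
    return sorted(found)


print(two_stage_match("zzABCDEFGHABCDzz", ["ABCDEFGH", "EFGHABCD"], 4))
```

Because the two signatures share the subpatterns `ABCD` and `EFGH`, the lookup table stores only two entries, which hints at where the memory saving comes from.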

S2-Net: Machine reading comprehension with SRU-based self-matching networks

  • Park, Cheoneum;Lee, Changki;Hong, Lynn;Hwang, Yigyu;Yoo, Taejoon;Jang, Jaeyong;Hong, Yunki;Bae, Kyung-Hoon;Kim, Hyun-Ki
    • ETRI Journal
    • /
    • v.41 no.3
    • /
    • pp.371-382
    • /
    • 2019
  • Machine reading comprehension is the task of understanding a given context and finding the correct response in that context. A simple recurrent unit (SRU) is a model that, like a gated recurrent unit (GRU) and long short-term memory (LSTM), solves the vanishing gradient problem in a recurrent neural network (RNN) using a neural gate; moreover, it removes the previous hidden state from the input gate to improve the speed compared to GRU and LSTM. A self-matching network, used in R-Net, can have a similar effect to coreference resolution because the self-matching network can obtain context information of a similar meaning by calculating the attention weight for its own RNN sequence. In this paper, we construct a dataset for Korean machine reading comprehension and propose an $S^2$-Net model that adds a self-matching layer to an encoder RNN using multilayer SRU. The experimental results show that the proposed $S^2$-Net model achieves 68.82% EM and 81.25% F1 as a single model and 70.81% EM and 82.48% F1 as an ensemble on the Korean machine reading comprehension test dataset, and 71.30% EM and 80.37% F1 as a single model and 73.29% EM and 81.54% F1 as an ensemble on the SQuAD dev dataset.

A study on fault diagnosis for chemical processes using hybrid approach of quantitative and qualitative method (정성적, 정량적 기법의 혼합 전략을 통한 화학공정의 이상진단에 관한 연구)

  • 오영석;윤종한;윤인섭
    • 제어로봇시스템학회:학술대회논문집
    • /
    • 1996.10b
    • /
    • pp.714-717
    • /
    • 1996
  • This paper presents a fault detection and diagnosis methodology based on a weighted symptom model and pattern matching between the incoming fault propagation trend and the simulated one. In the first step, backward chaining is used to find the possible cause candidates for the faults. The weighted symptom model (WSM) is used to generate those candidates, with the weights determined from dynamic simulation. Using WSMs, the methodology can generate the cause candidates and rank them according to probability. In the second step, the fault propagation trends identified from the partial or complete sequence of measurements are compared to the standard fault propagation trends, which are generated by dynamic simulation and stored a priori. A pattern matching algorithm based on a number of triangular episodes is used to match those trends effectively. The proposed methodology has been illustrated using two case studies and showed satisfactory diagnostic resolution.
