• Title/Abstract/Keywords: Sequence Pattern

807 search results (processing time: 0.027 s)

Mining Maximal Frequent Contiguous Sequences in Biological Data Sequences

  • Kang, Tae-Ho;Yoo, Jae-Soo;Kim, Hak-Yong;Lee, Byoung-Yup
    • International Journal of Contents
    • /
    • Vol. 3, No. 2
    • /
    • pp.18-24
    • /
    • 2007
  • Biological sequences such as DNA and amino acid sequences typically contain a large number of items. They have contiguous sequences that ordinarily consist of hundreds of frequent items or more. In biological sequence analysis (BSA), searching for frequent contiguous sequences is one of the most important operations. Many studies have addressed efficient sequential pattern mining, and most of the existing methods are based on the Apriori algorithm. In particular, the prefixSpan algorithm is one of the most efficient sequential pattern mining schemes based on the Apriori algorithm. However, because it expands sequential patterns starting from frequent patterns of length 1, it is not suitable for biological datasets with long frequent contiguous sequences. More recently, the MacosVSpan algorithm was proposed, building on the idea of prefixSpan, to significantly reduce its recursive process. However, that algorithm is still inefficient for mining frequent contiguous sequences from long biological data sequences. In this paper, we propose an efficient method to mine maximal frequent contiguous sequences in large biological data sequences by constructing a spanning tree with a fixed length. To verify the proposed method, we perform experiments in various environments. The experiments show that the proposed method is much more efficient than MacosVSpan in terms of retrieval performance.
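
The notion of a maximal frequent contiguous sequence in the abstract above can be illustrated with a minimal Python sketch. It is a naive level-wise search, not the fixed-length spanning-tree method the paper proposes, and it assumes that support means the number of input sequences containing a pattern. The function names and the sample data are hypothetical.

```python
from collections import defaultdict


def frequent_contiguous(sequences, min_support, length):
    """Return contiguous substrings of `length` found in >= min_support sequences."""
    support = defaultdict(set)
    for idx, seq in enumerate(sequences):
        for start in range(len(seq) - length + 1):
            # Record which sequences contain this substring (each counted once).
            support[seq[start:start + length]].add(idx)
    return {s for s, ids in support.items() if len(ids) >= min_support}


def maximal_frequent_contiguous(sequences, min_support):
    """Naive level-wise search for maximal frequent contiguous substrings."""
    levels = []
    length = 1
    while True:
        level = frequent_contiguous(sequences, min_support, length)
        if not level:
            # No frequent pattern of this length implies none of any greater length.
            break
        levels.append(level)
        length += 1

    maximal = set()
    for i, level in enumerate(levels):
        longer = levels[i + 1] if i + 1 < len(levels) else set()
        for pattern in level:
            # A pattern is maximal if no frequent pattern one item longer contains it.
            if not any(pattern in other for other in longer):
                maximal.add(pattern)
    return maximal


if __name__ == "__main__":
    data = ["ACGTAC", "TACGTA", "GGACGT"]  # hypothetical toy sequences
    print(maximal_frequent_contiguous(data, min_support=3))
```

Checking only the next length is enough here because every contiguous piece of a frequent pattern is itself frequent. The sketch enumerates every length from 1 upward, which is the kind of cost that becomes prohibitive on long biological sequences.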

HPV-type Prediction System using SVM and Partial Sequential Pattern

  • 김진수
    • Journal of Digital Convergence
    • /
    • Vol. 12, No. 12
    • /
    • pp.365-370
    • /
    • 2014
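  • Existing systems generate patterns from whole sequences or from unaligned sequences, so the number of patterns grows exponentially and consumes considerable time and cost. Rather than searching for patterns in the whole protein sequence, this paper proposes a system that uses multiple sequence alignment to generate partial sequence segments of proteins, generates sequential patterns for each segment, merges the generated patterns into a complete set of motif candidates, and selects and learns from that set as SVM training data. Finally, the system predicts the HPV type of unknown or known protein sequences by applying the information learned by the SVM. At a minimum support of 30%, the proposed system showed improved accuracy and recall compared with existing systems.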

Some Considerations on the Problems of PSA (Pulse Sequence Analysis) as a Partial Discharge Analysis Method

  • 김정태;이호근
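    • Korean Institute of Electrical and Electronic Material Engineers: Conference Proceedings
    • /
    • Proceedings of the Korean Institute of Electrical and Electronic Material Engineers 2004 Autumn Conference, Vol. 17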
    • /
    • pp.327-330
    • /
    • 2004
  • Because of its effectiveness for PD (partial discharge) pattern recognition, PSA (Pulse Sequence Analysis) has been considered as a new analytic method in place of conventional PRPDA (Phase Resolved Partial Discharge Analysis). However, because PSA analyses the correlation between sequential pulses, it can misanalyze patterns when data are missing as a result of poor sensitivity, which makes practitioners hesitant to apply it on site. Therefore, in this paper, the problems of PSA in data-missing and noise-adding cases were investigated. For this purpose, PD data obtained from various defects, including data with added noise, were used and analysed. The results showed that both cases can cause fatal errors in recognizing PD patterns. In the case of missing data, the error depends on the kind of defect and the degree of degradation. It was also observed that the error due to added noise was larger than that due to missing data.

Some Considerations on the On-site Applicability of PSA (Pulse Sequence Analysis) as a Partial Discharge Analysis Method

  • 김정태;이호근
    • Journal of the Korean Institute of Electrical and Electronic Material Engineers
    • /
    • Vol. 18, No. 5
    • /
    • pp.484-489
    • /
    • 2005
  • Because of its effectiveness for PD (Partial Discharge) pattern recognition, PSA (Pulse Sequence Analysis) has been considered as a new analytic method in place of conventional PRPDA (Phase Resolved Partial Discharge Analysis). However, because PSA analyses the correlation between sequential pulses, it is generally thought that it may misjudge patterns when data are missing as a result of poor sensitivity, which makes practitioners hesitant to apply it on site. Therefore, in this paper, the problems of PSA in data-missing and noise-adding cases were investigated. For this purpose, PD data obtained from various defects, including data with added noise, were used and analyzed. The results showed that both cases could cause fatal errors in recognizing PD patterns. In the case of missing data, the error was dependent on the kind of defect and the degree of degradation. It was also observed that the error due to added noise was larger than that due to missing data.

Analysis of Eukaryotic Sequence Pattern using GenScan

  • 정용규;임이슬;차병헌
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • Vol. 11, No. 4
    • /
    • pp.113-118
    • /
    • 2011
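  • Sequence homology analysis aligns and indexes the substances involved in biological phenomena and stores them in databases; it is a field that demonstrates the usefulness of bioinformatics. In this paper, we use the GenScan program, which employs a hidden Markov model, to convert the structurally complex sequence patterns of eukaryotes into protein sequences. Within sequence homology analysis, the minimum-distance search problem cannot be computed exactly once the problem grows large, because the amount of computation increases exponentially. Therefore, a scoring table that differentiates substitution scores between similar amino acids from those between dissimilar amino acids is applied, together with a refined transition probability model based on hidden Markov models and related techniques. Analyzing the converted sequences with blastp, which is used for sequence homology analysis, shows that introducing the hidden Markov model provides excellent capability for conversion into protein structure sequences.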

Turbo FLASH MRI Using Optimized Flip Angle Pattern: Application to Inversion-Recovery T1-Weighted Imaging

  • 오창현;최환준;양윤정;이덕래;류연철;현정호;김사라;이윤;정관진;안창범
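    • Korean Society of Medical and Biological Engineering: Conference Proceedings
    • /
    • Korean Society of Medical and Biological Engineering 1998 Autumn Conference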
    • /
    • pp.55-56
    • /
    • 1998
  • The 3-D Fast Gradient Echo (Turbo FLASH, Turbo Fast Low Angle Shot) sequence is optimized to achieve good T1 contrast using variable excitation flip angles. In the Turbo FLASH sequence, various types of image contrast can be established depending on the contrast preparation scheme. While proton density contrast is obtained when using a short repetition time with a short echo time and small flip angles, T1 or T2 weighting can be obtained by applying proper contrast preparation sequences before the proton density Turbo FLASH sequence. To maximize the contrast-to-noise ratio while retaining a sharp impulse response (smooth frequency domain response), the excitation flip-angle pattern is optimized through simulation and experiments. The TI (the delay after the preparation sequence, which is a 180-degree inversion RF pulse in the IR T1-weighted imaging case), TD (the delay time between the Turbo FLASH sequence and the next preparation), and TR are also optimized for the best image quality. The proposed 3-D Turbo FLASH provides high-resolution images of $1\,\mathrm{mm}\times1\,\mathrm{mm}\times1.5\,\mathrm{mm}$ within a reasonable imaging time of 5-8 minutes. The proposed imaging sequence has been implemented on a Medison Magnum 1.0T system and verified through simulations as well as human volunteer imaging. The experimental results show the utility of the proposed method.

The Design of ASIC Chip for Memory Tester

  • 정지원;강창헌;최창;박종식
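    • Korean Institute of Electrical Engineers: Conference Proceedings
    • /
    • Korean Institute of Electrical Engineers 2004 Symposium Proceedings, Information and Control Section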
    • /
    • pp.153-155
    • /
    • 2004
  • In this paper, we design a memory tester chip that plays a central role in a memory tester. The memory tester has sixteen internal instructions to control the test sequence and the address and data signals sent to the DUT. These instructions are stored in memory for each block, such as the sequencer and the pattern generator. The sequencer controls the test sequence according to the instructions stored in memory, and the pattern generator likewise generates the address and data signals according to the instructions stored in memory. These chips can be used for various functional tests of memory.

Negative Selection Algorithm for DNA Pattern Classification

  • Lee, Dong-Wook;Sim, Kwee-Bo
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • Institute of Control, Robotics and Systems, ICCAS 2004
    • /
    • pp.190-195
    • /
    • 2004
  • We propose a pattern classification algorithm that uses the self-nonself discrimination principle of immune cells and apply it to the DNA pattern classification problem. Pattern classification is an important and frequently encountered problem in bioinformatics. In this paper, we propose a classification algorithm based on the negative selection of the immune system to classify DNA patterns. Negative selection is the process of determining antigenic receptors that recognize antigens, that is, nonself cells. Immune cells use these antigen receptors to judge whether a cell is self or not. If one composes ${\eta}$ groups of antigenic receptors for ${\eta}$ different patterns, these receptor groups can classify inputs into ${\eta}$ patterns. We propose a pattern classification algorithm based on negative selection at the nucleotide base level and the amino acid level. To show the validity of our algorithm, experimental results of RNA group classification are also presented.
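
The negative selection idea in the abstract above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' algorithm: detectors ("receptors") are taken to be fixed-length k-mers, a detector matches a sequence when it occurs in it exactly, and candidates are enumerated exhaustively rather than generated at random. All function names and the sample sequences are hypothetical.

```python
from itertools import product


def kmers(seq, k):
    """Return the set of all contiguous substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_detectors(self_seqs, k, alphabet="ACGT"):
    """Negative selection: keep every candidate k-mer that matches no self sequence."""
    self_kmers = set()
    for seq in self_seqs:
        self_kmers |= kmers(seq, k)
    candidates = {"".join(p) for p in product(alphabet, repeat=k)}
    # Candidates that react to self are discarded; the survivors are the detectors.
    return candidates - self_kmers


def is_nonself(seq, detectors, k):
    """A sequence is flagged as nonself if any detector occurs in it."""
    return bool(kmers(seq, k) & detectors)


if __name__ == "__main__":
    self_set = ["ACGTACGT"]  # hypothetical self sequences
    detectors = build_detectors(self_set, k=3)
    print(is_nonself("ACGTAC", detectors, 3))  # built only from self k-mers
    print(is_nonself("ACGGAC", detectors, 3))  # contains a k-mer unseen in self
```

Under these assumptions, building one detector set per pattern group and asking which groups do not flag an input would give a simple multi-class variant of the scheme the abstract describes.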

Emerging Topic Detection Using Text Embedding and Anomaly Pattern Detection in Text Streaming Data

  • 최세목;박정희
    • Journal of Korea Multimedia Society
    • /
    • Vol. 23, No. 9
    • /
    • pp.1181-1190
    • /
    • 2020
  • Detecting an anomaly pattern that deviates from the normal data distribution in streaming data is an important technique in many application areas. In this paper, a method for detecting a newly emerging pattern in text streaming data, which is an ordered sequence of texts, is proposed based on text embedding and anomaly pattern detection. The detection performance of the proposed method is compared across text embedding methods such as BOW (Bag of Words), Word2Vec, and BERT. Experimental results show that anomaly pattern detection using BERT embedding gave an average F1 value of 0.85 and an F1 value of 1 in three of the five test cases.

Quasi-static cyclic displacement pattern for seismic evaluation of reinforced concrete columns

  • Yuksel, E.;Surmeli, M.
    • Structural Engineering and Mechanics
    • /
    • Vol. 37, No. 3
    • /
    • pp.267-283
    • /
    • 2011
  • Although earthquakes generate random cyclic lateral loading on structures, a quasi-static cyclic loading pattern with gradually increasing amplitude has been commonly used in laboratory tests because of its relatively low cost and simplicity compared with pseudo-dynamic and shake table tests. The number, amplitudes and sequence of cycles are important parameters of a quasi-static cyclic loading pattern and must be chosen appropriately in order to account for cumulative damage. This paper aims to develop a new cyclic displacement pattern to be used in quasi-static tests of well-confined, flexure-dominated reinforced concrete (RC) columns. The main parameters of the study are sectional dimensions, percentage of longitudinal reinforcement, axial force intensity and earthquake type, namely far-fault and near-fault.