• Title/Abstract/Keywords: sequence analysis

Search results: 6,368 items (processing time: 0.037 s)

Subband PRI analysis algorithm

  • 윤원식
    • 한국통신학회논문지 / Vol. 21, No. 6 / pp.1425-1429 / 1996
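  • When missing pulses occur, the conventional sequence search algorithm for PRI analysis suffers from a harmonic problem. This paper proposes a PRI analysis algorithm that resolves this harmonic problem. The full PRI range to be analyzed is divided into subbands so that no harmonics arise within a subband, and sequence search is then performed in both the forward and backward directions. The proposed algorithm outperforms the conventional sequence search algorithm.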
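The abstract above does not give the subband boundaries or the exact search procedure, so the following is only a minimal sketch under stated assumptions: subbands are taken to be octave-wide half-open intervals (so a PRI and its double never fall in the same band), and the search simply counts pulses near `start + k * pri` in the forward and then backward direction. The names `split_into_subbands`, `has_pulse_near`, `sequence_search`, and the `max_missing` parameter are hypothetical.

```python
from bisect import bisect_left


def split_into_subbands(pri_min, pri_max):
    """Split [pri_min, pri_max) into half-open bands [lo, hi) with hi <= 2 * lo.

    Assumption: octave-wide bands, so 2 * p is never in the same band as p.
    """
    if pri_min <= 0:
        raise ValueError("pri_min must be positive")
    bands = []
    lo = pri_min
    while lo < pri_max:
        hi = min(2 * lo, pri_max)
        bands.append((lo, hi))
        lo = hi
    return bands


def has_pulse_near(toas, t, tol):
    """Return True if the sorted list of arrival times has a pulse within tol of t."""
    i = bisect_left(toas, t - tol)
    return i < len(toas) and toas[i] <= t + tol


def sequence_search(toas, start, pri, tol, max_missing=1):
    """Count pulses matching start + k * pri, searching forward and then backward.

    The search in each direction stops after more than max_missing
    consecutive expected pulses are absent.
    """
    hits = 1  # the starting pulse itself
    for step in (pri, -pri):
        t, missing = start + step, 0
        while toas[0] - tol <= t <= toas[-1] + tol and missing <= max_missing:
            if has_pulse_near(toas, t, tol):
                hits += 1
                missing = 0
            else:
                missing += 1
            t += step
    return hits


# Pulse train with PRI 10 and one missing pulse at t = 30.
toas = [0, 10, 20, 40, 50, 60]
print(split_into_subbands(5, 40))
print(sequence_search(toas, start=20, pri=10, tol=0.5))
```

Restricting candidate PRIs to one band at a time is what keeps a multiple of the true PRI from competing with it in the same search pass.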

A Pattern Summary System Using BLAST for Sequence Analysis

  • Choi, Han-Suk;Kim, Dong-Wook;Ryu, Tae-W.
    • Genomics & Informatics / Vol. 4, No. 4 / pp.173-181 / 2006
  • Pattern finding is one of the important tasks in protein or DNA sequence analysis. Alignment is a widely used technique for finding patterns in sequence analysis. BLAST (Basic Local Alignment Search Tool) is one of the most popular tools in bioinformatics for exploring available DNA or protein sequence databases. BLAST may generate a huge output for large sequence data that contains various sequence patterns. However, BLAST does not provide a tool to summarize and analyze the patterns or matched alignments in its output file, and it lacks general and robust parsing tools to extract the essential information from that output. This paper presents a pattern summary system, a powerful and comprehensive tool for discovering pattern structures in large amounts of sequence data in BLAST output. The pattern summary system can identify clusters of patterns, extract the cluster pattern sequences from the BLAST subject database, and display the clusters graphically to show their distribution in the subject database.

Linear-Time Korean Morphological Analysis Using an Action-based Local Monotonic Attention Mechanism

  • Hwang, Hyunsun;Lee, Changki
    • ETRI Journal / Vol. 42, No. 1 / pp.101-107 / 2020
  • For Korean language processing, morphological analysis is a critical component that requires extensive work. This morphological analysis can be conducted in an end-to-end manner, without a complicated feature design, using a sequence-to-sequence model. However, the sequence-to-sequence model has a time complexity of O(n²) for an input of length n when it uses the attention mechanism for high performance. In this study, we propose a linear-time Korean morphological analysis model using a local monotonic attention mechanism that relies on monotonic alignment, which is a characteristic of Korean morphological analysis. The proposed model shows an extreme improvement in a single-threaded environment and a high morphometric F1-measure, even for a hard attention model in which the attention mechanism formula is eliminated.

A Reranking Model for Korean Morphological Analysis Based on Sequence-to-Sequence Model

  • 최용석;이공주
    • 정보처리학회논문지:소프트웨어 및 데이터공학 / Vol. 7, No. 4 / pp.121-128 / 2018
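  • The sequence-to-sequence (Seq2seq) model can be applied even when the input and output sequences differ in length, and it is widely used in Korean morphological analysis. In general, Seq2seq-based Korean morphological analysis processes the source text at the syllable level and outputs morphemes and part-of-speech tags at the syllable level. Syllable-level morphological analysis has the advantage of handling out-of-vocabulary words easily, but the disadvantage that it cannot reflect morpheme-level dictionary information. In this study, we propose a model that adds a reranking model as a post-processing step to the Seq2seq model in order to improve the final performance of morphological analysis. Beam search is applied to the Seq2seq model to generate K morphological analysis results, and a reranking model is then applied to readjust the ranking of these results. The reranking model uses morpheme-level embedding information and n-gram context information that the existing syllable-level processing could not reflect. The proposed reranking model improves the F1 score by about 1.17% over the existing Seq2seq model.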
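The rerank-after-beam-search step described above can be illustrated with a toy sketch. This is not the paper's reranker, which uses morpheme-level embeddings and n-gram context; here a simple bigram log-probability stands in for the morpheme-level score, and the linear interpolation, the `weight` parameter, and the function names `bigram_log_prob` and `rerank` are all assumptions made for illustration.

```python
import math


def bigram_log_prob(morphemes, bigram_probs, floor=1e-4):
    """Toy morpheme-level score: sum of log bigram probabilities.

    Unseen bigrams receive the small probability `floor` (an assumption).
    """
    padded = ["<s>"] + list(morphemes)
    return sum(
        math.log(bigram_probs.get(pair, floor))
        for pair in zip(padded, padded[1:])
    )


def rerank(candidates, morpheme_score, weight=0.5):
    """Reorder K beam-search candidates by a combined score.

    candidates: list of (morphemes, seq2seq_log_prob) pairs.
    morpheme_score: callable mapping a morpheme sequence to a log score.
    """
    def combined(item):
        morphemes, seq2seq_log_prob = item
        return (1 - weight) * seq2seq_log_prob + weight * morpheme_score(morphemes)

    return sorted(candidates, key=combined, reverse=True)


# Hypothetical beam output: the Seq2seq model slightly prefers the first candidate.
bigram_probs = {("<s>", "a/N"): 0.5, ("a/N", "b/J"): 0.5}
beam = [(("a/N", "c/V"), -1.0), (("a/N", "b/J"), -1.2)]
best = rerank(beam, lambda m: bigram_log_prob(m, bigram_probs))[0]
print(best[0])
```

In this toy case the second candidate moves to the top because its morpheme bigrams are known, which mirrors the idea of injecting morpheme-level information that syllable-level decoding cannot see.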

Correlation Analysis between Regulatory Sequence Motifs and Expression Profiles by Kernel CCA

  • Rhee, Je-Keun;Joung, Je-Gun;Chang, Jeong-Ho;Zhang, Byoung-Tak
    • 한국생물정보학회:학술대회논문집 / 한국생물정보시스템생물학회, BIOINFO 2005 / pp.63-68 / 2005
  • Transcription factors regulate gene expression by binding to the upstream region of a gene. Each transcription factor has a specific binding site in the promoter region. The analysis of gene upstream sequences is therefore necessary for understanding the regulatory mechanisms of genes, under the plausible assumption that DNA sequence motif profiles are closely related to the expression behaviors of the corresponding genes. Here, we present an effective approach to analyzing the relation between gene expression profiles and gene upstream sequences on the basis of kernel canonical correlation analysis (kernel CCA). Kernel CCA is a useful method for finding underlying relationships between two different data sets. In an application to a yeast cell cycle data set, it is shown that the gene upstream sequence profile is closely related to gene expression patterns in terms of canonical correlation scores. By further analysis of the contributing values, or weights, of sequence motifs in the construction of a pair of sequence motif profiles and expression profiles, we show that the proposed method can identify significant DNA sequence motifs involved in specific gene expression patterns, including both well-known and putative motifs, in the yeast cell cycle.

A Study on the Sequence Impedance Modeling of Underground Transmission Systems

  • 황영록;김경철
    • 조명전기설비학회논문지 / Vol. 28, No. 6 / pp.60-67 / 2014
  • Power system fault analysis is commonly based on the well-known symmetrical component method, which describes power system elements by positive, negative, and zero sequence impedances. The majority of faults in transmission lines are unbalanced faults, such as line-to-ground faults, so both positive and zero sequence impedances are required for fault analysis. When an unbalanced fault occurs, zero sequence current flows through earth and ground wires in overhead transmission systems and through cable sheaths and earth in underground transmission systems. Since the zero sequence current distribution between cable sheath and earth depends on both sheath bonding and grounding configurations, care must be taken in calculating the zero sequence impedance of underground cable transmission lines. In this paper, an EMTP-based sequence impedance calculation method is described and applied to 345 kV cable transmission systems. The calculation results show that detailed circuit analysis is desirable to avoid possible errors in sequence impedance calculation resulting from the various configurations of cable sheath bonding and grounding in underground cable transmission systems.
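For orientation, the symmetrical component method mentioned above transforms a 3x3 phase impedance matrix into sequence quantities via Z012 = A⁻¹ · Zabc · A. The sketch below is the textbook transformation for an idealized, fully symmetric element with self impedance `zs` and mutual impedance `zm`; it ignores sheath bonding and grounding, which is exactly what the paper argues must be modeled in detail, and it is not the EMTP-based method. The function names and the numeric values are hypothetical.

```python
import cmath

# Fortescue operator a = 1 at 120 degrees.
A_OP = cmath.exp(2j * cmath.pi / 3)

# Symmetrical component transformation matrix and its inverse.
A = [
    [1, 1, 1],
    [1, A_OP**2, A_OP],
    [1, A_OP, A_OP**2],
]
A_INV = [
    [x / 3 for x in row]
    for row in ([1, 1, 1], [1, A_OP, A_OP**2], [1, A_OP**2, A_OP])
]


def matmul(x, y):
    """Multiply two 3x3 matrices given as nested lists."""
    return [
        [sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def sequence_impedance_matrix(z_abc):
    """Return Z012 = A^-1 * Zabc * A (rows/columns ordered zero, positive, negative)."""
    return matmul(matmul(A_INV, z_abc), A)


# Illustrative values only (ohm/km), not taken from the paper.
zs, zm = 0.3 + 1.0j, 0.1 + 0.4j
z_abc = [[zs, zm, zm], [zm, zs, zm], [zm, zm, zs]]
z012 = sequence_impedance_matrix(z_abc)
print("Z0 =", z012[0][0])  # expected to equal zs + 2 * zm up to rounding
print("Z1 =", z012[1][1])  # expected to equal zs - zm up to rounding
```

For this symmetric case the result is diagonal, with Z0 = zs + 2·zm and Z1 = Z2 = zs − zm; an unsymmetric phase matrix would leave nonzero off-diagonal coupling terms.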

INSTABILITY OF THE BETTI SEQUENCE FOR PERSISTENT HOMOLOGY AND A STABILIZED VERSION OF THE BETTI SEQUENCE

  • JOHNSON, MEGAN;JUNG, JAE-HUN
    • Journal of the Korean Society for Industrial and Applied Mathematics / Vol. 25, No. 4 / pp.296-311 / 2021
  • Topological Data Analysis (TDA), a relatively new field of data analysis, has proved very useful in a variety of applications. The main persistence tool from TDA is persistent homology, in which data structure is examined at many scales. Representations of persistent homology include persistence barcodes and persistence diagrams, neither of which is straightforward to reconcile with traditional machine learning algorithms, as they are sets of intervals or multisets. The problem of faithfully representing barcodes and persistence diagrams has been pursued along two main avenues: kernel methods and vectorizations. One vectorization is the Betti sequence, or Betti curve, derived from the persistence barcode. While the Betti sequence has been used in classification problems in various applications, to our knowledge, the stability of the sequence has never before been discussed. In this paper we show that the Betti sequence is unstable under the 1-Wasserstein metric with regard to small perturbations in the barcode from which it is calculated. In addition, we propose a novel stabilized version of the Betti sequence based on the Gaussian smoothing seen in the Stable Persistence Bag of Words for persistent homology. We then introduce the normalized cumulative Betti sequence and provide numerical examples that support the main statement of the paper.
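As a small illustration of the vectorization discussed above, a Betti sequence can be sampled by counting, at each grid value, how many barcode intervals contain it. This sketch assumes half-open intervals [birth, death) and a user-chosen grid; the function name `betti_sequence` is hypothetical, and the paper's stabilized and normalized cumulative variants are not reproduced here.

```python
def betti_sequence(barcode, grid):
    """Count, for each grid value t, the intervals [birth, death) that contain t."""
    return [
        sum(1 for birth, death in barcode if birth <= t < death)
        for t in grid
    ]


barcode = [(0.0, 1.0), (0.5, 2.0)]
grid = [0.0, 0.5, 1.0, 1.5, 2.0]
print(betti_sequence(barcode, grid))  # [1, 2, 1, 1, 0]
```

Because each entry is an integer count, moving an interval endpoint slightly across a grid value changes that entry by a full unit.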

Construction Sequence Analysis for Checking Stability in High-Rise Building under Construction

  • 김재요
    • 한국전산구조공학회:학술대회논문집 / 한국전산구조공학회, 2008 Annual Conference / pp.618-623 / 2008
  • Due to recent trends toward atypical plan shapes and zoning construction in high-rise buildings, building stability during construction is emerging as an important issue for design and construction planning. To ensure stability during construction, the differential column shortening and lateral movements under unbalanced distributions of the self-weight of structural members, and the load flows before member connections and the lateral load resisting system are completed, should be checked by construction sequence analysis. This paper presents a scheme of zone-based construction sequence analysis to check the stability of a high-rise building under construction. The scheme is applied to the construction sequence analysis of a real high-rise building under construction.

Protein Sequence Search based on N-gram Indexing

  • Hwang, Mi-Nyeong;Kim, Jin-Suk
    • Bioinformatics and Biosystems / Vol. 1, No. 1 / pp.46-50 / 2006
  • With the advancement of experimental techniques in molecular biology, genomic and protein sequence databases are growing exponentially in size, and mean sequence lengths are also increasing. As these databases grow, it becomes difficult to search biological databases for sequences with significant homology to a query sequence. In this paper, we present an N-gram indexing method to retrieve similar sequences quickly, precisely, and comparably. This method regards a protein sequence as text written in a language of 20 amino acid codes and adopts fixed-length N-gram tokens as its indexing scheme for sequence strings. After such tokens are indexed for all the sequences in the database, sequences can be searched with information retrieval algorithms. Using this new method, we have developed a protein sequence search system named ProSeS (PROtein Sequence Search). ProSeS is a protein sequence analysis system that provides overall analysis results such as similar sequences with significant homology, predicted subcellular locations of the query sequence, and major keywords extracted from the annotations of similar sequences. We show experimentally that the N-gram indexing approach significantly reduces retrieval time and that it is as accurate as the popular search tool BLAST.
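The indexing idea above can be sketched as an inverted index from fixed-length N-gram tokens to sequence identifiers. This is a minimal illustration, not the ProSeS implementation: ranking by the number of shared distinct N-grams is an assumption (the abstract only says information retrieval algorithms are used), and the names `ngrams`, `build_index`, `search`, and the toy sequences are hypothetical.

```python
from collections import Counter, defaultdict


def ngrams(seq, n=3):
    """Return all overlapping fixed-length tokens of a sequence string."""
    return [seq[i:i + n] for i in range(len(seq) - n + 1)]


def build_index(sequences, n=3):
    """Build an inverted index: N-gram token -> set of sequence ids containing it."""
    index = defaultdict(set)
    for seq_id, seq in sequences.items():
        for token in ngrams(seq, n):
            index[token].add(seq_id)
    return index


def search(index, query, n=3, top_k=5):
    """Rank indexed sequences by how many distinct query N-grams they share."""
    counts = Counter()
    for token in set(ngrams(query, n)):
        for seq_id in index.get(token, ()):
            counts[seq_id] += 1
    return counts.most_common(top_k)


# Toy database of amino acid strings (illustrative only).
database = {"p1": "MKTAYIAK", "p2": "MKTWWWWW", "p3": "GGGGGGGG"}
index = build_index(database)
print(search(index, "MKTAYI"))  # [('p1', 4), ('p2', 1)]
```

Only sequences sharing at least one token with the query are touched at search time, which is where the retrieval-time saving over a full scan of the database comes from.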

Analysis of Sequence Impedances of 345kV Cable Transmission Systems

  • 최종기;안용호;윤용범;오세일;곽양호;이명희
    • 전기학회논문지 / Vol. 62, No. 7 / pp.905-912 / 2013
  • Power system fault analysis is commonly based on the well-known symmetrical component method, which describes power system elements by positive, negative, and zero sequence impedances. In the case of a balanced fault, such as a three-phase short circuit, a transmission line can be represented by its positive sequence impedance only. The majority of faults in transmission lines, however, are unbalanced faults, such as line-to-ground faults, so both positive and zero sequence impedances are required for fault analysis. When an unbalanced fault occurs, zero sequence current flows through earth and skywires in overhead transmission systems and through cable sheaths and earth in cable transmission systems. Since the zero sequence current distribution between cable sheath and earth depends on both sheath bonding and grounding configurations, care must be taken in calculating the zero sequence impedance of underground cable transmission lines. In this paper, conventional and EMTP-based sequence impedance calculation methods are described and applied to 345 kV cable transmission systems (4-circuit, OF 2000 mm²). The calculation results show that detailed circuit analysis is desirable to avoid possible errors in sequence impedance calculation resulting from the various configurations of cable sheath bonding and grounding in underground cable transmission systems.