• Title/Summary/Keyword: DNA sequence comparison (DNA 서열 비교)


Fast Matching Method for DNA Sequences (DNA 서열을 위한 빠른 매칭 기법)

  • Kim, Jin-Wook;Kim, Eun-Sang;Ahn, Yoong-Ki;Park, Kun-Soo
    • Journal of KIISE:Computer Systems and Theory / v.36 no.4 / pp.231-238 / 2009
  • DNA sequences are the fundamental information for each species and a comparison between DNA sequences of different species is an important task. Since DNA sequences are very long and there exist many species, not only fast matching but also efficient storage is an important factor for DNA sequences. Thus, a fast string matching method suitable for encoded DNA sequences is needed. In this paper, we present a fast string matching method for encoded DNA sequences which does not decode DNA sequences while matching. We use four-characters-to-one-byte encoding and combine a suffix approach and a multi-pattern matching approach. Experimental results show that our method is about 5 times faster than AGREP and the fastest among known algorithms.
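
The four-characters-to-one-byte encoding mentioned in this abstract can be illustrated with a minimal sketch. The 2-bit code table (`A=0, C=1, G=2, T=3`), the bit order, and the function names `pack` and `unpack` are assumptions for illustration; the abstract does not specify them, and the matching step that works directly on the encoded bytes is not reproduced here.

```python
# Hypothetical 2-bit code table; the abstract does not give the paper's actual mapping.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte (first base in the high bits)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a short final chunk so that base positions stay fixed.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Decode packed bytes back to a DNA string of the given original length."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    # Drop the padding bases introduced by the final partial byte.
    return "".join(bases[:length])
```

Because the final byte is padded, the original sequence length has to be stored alongside the packed data, which is why `unpack` takes it as an argument.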

Phylogenetic Relationship Among Four Species of Korean Oysters Based on Mitochondrial 16S rDNA and COI Gene (미토콘드리아 16S rDNA와 COI유전자에 근거한 한국산 굴류 4종의 유연관계)

  • 이상엽;박두원;안혜숙;김상해
    • Animal Systematics, Evolution and Diversity / v.16 no.2 / pp.203-211 / 2000
  • Partial mitochondrial 16S rDNA and COI genes were amplified by PCR and sequenced for four species of oysters in Korea. Phylogenetic relationships among them were inferred from the aligned sequences by the neighbor-joining method. The sequence comparison of the two mitochondrial genes showed a clear genetic distinction between the two oyster genera (Crassostrea and Ostrea). Phylogenetic analysis based on the nucleotide sequences and A+T percentage of the two genes indicates that C. gigas and C. nippona form a strongly supported sister group, with C. ariakensis clustering with that clade, although the neighbor-joining tree based on the amino acid sequences of the COI gene showed a different topology.


A DNA Sequence Generation Algorithm for Traveling Salesman Problem using DNA Computing with Evolution Model (DNA 컴퓨팅과 진화 모델을 이용하여 Traveling Salesman Problem를 해결하기 위한 DNA 서열 생성 알고리즘)

  • Kim, Eun-Gyeong;Lee, Sang-Yong
    • Journal of the Korean Institute of Intelligent Systems / v.16 no.2 / pp.222-227 / 2006
  • Recently, the Traveling Salesman Problem (TSP) has been studied using DNA computing with massive parallelism. However, real biological experiments were difficult because the conventional method did not reflect the precise characteristics of DNA when expressing a graph. Therefore, a DNA sequence generation algorithm is needed that reflects DNA features and reduces biological experiment error. In this paper, we propose a DNA sequence generation algorithm that applies the DNA coding method of an evolution model to DNA computing. The algorithm was applied to the TSP and compared with a simple genetic algorithm. As a result, the algorithm generated good sequences that minimize error and reduce the biological experiment error rate.

Suffix Tree Constructing Algorithm for Large DNA Sequences Analysis (대용량 DNA서열 처리를 위한 서픽스 트리 생성 알고리즘의 개발)

  • Choi, Hae-Won
    • Journal of Korea Society of Industrial Information Systems / v.15 no.1 / pp.37-46 / 2010
  • A suffix tree is an efficient data structure that exposes the internal structure of a string and allows efficient solutions to a wide range of complex string problems, particularly in computational biology. However, as the volume of biological information explodes, it becomes impossible to construct suffix trees in main memory, so an efficient technique for constructing the trees in secondary storage is needed. In this paper, we present a method for constructing a suffix tree on disk for a large set of DNA strings using a new index scheme. We also show a typical application example with a suffix tree on disk.
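
To illustrate the kind of string query a suffix index answers, here is a minimal in-memory sketch. It swaps in a naively built suffix array rather than a suffix tree, and it does not attempt the disk-based construction or the index scheme of the paper; the function names `build_suffix_array` and `find_occurrences` are hypothetical.

```python
def build_suffix_array(text: str) -> list:
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive construction (sorting whole suffixes); fine for short examples only.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text: str, sa: list, pattern: str) -> list:
    """Find all start positions of pattern in text by binary search over the suffix array."""
    m = len(pattern)

    # Lower bound: first suffix whose m-character prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo

    # Upper bound: first suffix whose m-character prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid

    return sorted(sa[start:lo])
```

For example, with `text = "GATTACATTA"` and `sa = build_suffix_array(text)`, the call `find_occurrences(text, sa, "TTA")` locates both occurrences of `TTA` without scanning the whole text for each query.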

Sequence Alignment Algorithm using Quality Information (품질 정보를 이용한 서열 배치 알고리즘)

  • Na, Joong-Chae;Roh, Kang-Ho;Park, Kun-Soo
    • Journal of KIISE:Computer Systems and Theory / v.32 no.11_12 / pp.578-586 / 2005
  • In this paper, we consider the problem of sequence alignment with quality scores. DNA sequences produced by a base-calling program (as part of sequencing) have quality scores that represent the confidence level of individual bases. However, previous sequence alignment algorithms do not consider such quality scores. To solve sequence alignment with quality scores, we propose a measure for an alignment of two sequences with quality scores. We show that an optimal alignment under this measure can be found by dynamic programming.
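
The abstract does not state the proposed measure, so the following is only a sketch of how quality scores could enter a dynamic-programming alignment. It assumes Phred-style quality values (error probability 10^(-Q/10)) and a hypothetical weighting in which each match or mismatch score is scaled by the probability that both bases were called correctly; neither assumption comes from the paper.

```python
from typing import Sequence


def error_prob(q: int) -> float:
    """Phred-style conversion from a quality value to an error probability (assumed)."""
    return 10 ** (-q / 10)


def align_score(a: str, qa: Sequence[int], b: str, qb: Sequence[int],
                match: float = 1.0, mismatch: float = -1.0, gap: float = -1.0) -> float:
    """Global alignment score where substitution scores are weighted by base confidence.

    The weighting below is a hypothetical illustration, not the paper's measure.
    """
    n, m = len(a), len(b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Probability that both aligned bases were called correctly.
            w = (1 - error_prob(qa[i - 1])) * (1 - error_prob(qb[j - 1]))
            s = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + w * s,  # align the two bases
                           dp[i - 1][j] + gap,        # gap in b
                           dp[i][j - 1] + gap)        # gap in a
    return dp[n][m]
```

Under this weighting, two identical reads score lower when their bases have low quality, which is the behavior a quality-aware measure is meant to capture.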

DNA Sequence Design using $\varepsilon$ -Multiobjective Evolutionary Algorithm ($\varepsilon$-다중목적함수 진화 알고리즘을 이용한 DNA 서열 디자인)

  • Shin Soo-Yong;Lee In-Hee;Zhang Byoung-Tak
    • Journal of KIISE:Software and Applications / v.32 no.12 / pp.1217-1228 / 2005
  • As DNA computing has been widely studied for various applications, DNA sequence design, the most basic and important step in DNA computing, has received increasing attention. In previous work, DNA sequence design was formulated as a multi-objective optimization task and solved by the elitist non-dominated sorting genetic algorithm (NSGA-II). However, NSGA-II required a large amount of computation time. In this paper, we therefore use an $\varepsilon$-multiobjective evolutionary algorithm ($\varepsilon$-MOEA) to overcome the drawbacks of NSGA-II. To compare the performance of the two algorithms in detail, we apply both to the DTLZ2 benchmark function. $\varepsilon$-MOEA outperformed NSGA-II in both convergence and diversity, by $70\%$ and $73\%$, respectively. In particular, $\varepsilon$-MOEA finds optimal solutions in a small amount of computation time. Based on these results, we redesign the DNA sequences generated by previous DNA sequence design tools and the DNA sequences for the 7-travelling salesman problem (TSP). The experimental results show that $\varepsilon$-MOEA performs better in most cases. In particular, for 7-TSP, $\varepsilon$-MOEA achieves comparable results two times faster, and finds final solutions with $22\%$ better diversity and $92\%$ better convergence when given the same time.
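
The abstract does not describe how $\varepsilon$-MOEA compares solutions. As a rough sketch of the general $\varepsilon$-dominance idea, the code below divides the objective space into boxes of side $\varepsilon$ and applies ordinary Pareto dominance to the box indices. It assumes all objectives are minimized; the function names and the example values are hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence, Tuple


def box_index(objectives: Sequence[float], eps: Sequence[float],
              mins: Sequence[float]) -> Tuple[int, ...]:
    """Map an objective vector to the index of the epsilon-box that contains it."""
    return tuple(math.floor((f - lo) / e) for f, e, lo in zip(objectives, eps, mins))


def dominates(u: Sequence[float], v: Sequence[float]) -> bool:
    """Ordinary Pareto dominance for minimization: u is no worse everywhere, better somewhere."""
    return all(x <= y for x, y in zip(u, v)) and any(x < y for x, y in zip(u, v))


def eps_dominates(f1: Sequence[float], f2: Sequence[float],
                  eps: Sequence[float], mins: Sequence[float]) -> bool:
    """f1 epsilon-dominates f2 if f1's box dominates f2's box."""
    return dominates(box_index(f1, eps, mins), box_index(f2, eps, mins))
```

Solutions that fall in the same box do not $\varepsilon$-dominate each other, so an archive built on this test holds at most one representative per box, which is what keeps it small and spread out.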

Gene Rearrangement through 151 bp Repeated Sequence in Rice Chloroplast DNA (벼 엽록체 DNA내의 151 bp 반복염기서열에 의한 유전자 재배열)

  • Nahm, Baek-Hie;Kim, Han-Jip
    • Applied Biological Chemistry / v.36 no.3 / pp.208-214 / 1993
  • To investigate gene rearrangement via short repeated sequences in chloroplast DNA, the pattern of heterologous gene clusters containing the 151 bp repeated sequence was compared across stages of plastid development in rice, and homologous gene clusters from various plant sources were searched for comparative analysis. Southern blot analysis of rice DNA, using the rp12 gene containing the 151 bp repeated sequence as a probe, showed the presence of heterologous gene clusters. These heterologous gene clusters varied with plastid development and were observed in all of the rice cultivars used in this work. Finally, comparative analysis of the DNA sequences of the homologous gene clusters from various plants showed evolutionary gene rearrangement via short repeated sequences among plants. These results suggest a possible relationship between plastid development and gene rearrangement through short repeated sequences.


The 18S rDNA Sequences of the Basidiocarps of Tricholoma matsutake in Korea (한국산 송이버섯에서의 18S ribosomal DNA 서열)

  • Lee, Sang-Sun;Hong, Sung-Woon
    • The Korean Journal of Mycology / v.26 no.2 s.85 / pp.256-264 / 1998
  • The 18S rDNA sequences of Tricholoma matsutake (TM = T. caligatum var. nauseosum) collected in Korea were analyzed as part of a study of the ectomycorrhizal fungi in the roots of Pinus densiflora. A 514-base-pair rDNA region was synthesized with the UF-5 and UR-6 primers, and the base pairs were double-checked. The sequences of the four strains synthesized in this work were all identical, but differed from those reported by previous workers. The basidiocarps collected in this work were identified as T. matsutake after searching the 18S rDNA with BLAST at NCBI. The 18S rDNA analyzed from other related basidiocarps differed from ours by only a few base pairs. A dendrogram was constructed from the 514 bp 18S rDNA sequences using the CLUSTAL-X alignment program. The species were grouped well at the genus level in the dendrogram.


A Sequence Similarity Algorithm Irrelevant to Sequence Length (서열의 길이에 무관한 유사도 측정 알고리즘)

  • Kim, Jae-Kwang;Lee, Jee-Hyong
    • Proceedings of the Korean Institute of Intelligent Systems Conference / 2008.04a / pp.13-16 / 2008
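  • Sequence comparison algorithms based on dynamic programming (DP) have been widely used to compare DNA, RNA, and protein sequences and to measure the similarity of program source code. Such an algorithm builds a matrix using DP and then measures the similarity of two sequences from the last value computed in the matrix. However, this final value depends heavily on the lengths of the compared sequences, so it is unsuitable for assessing the similarity of diverse sequences. In this paper, we propose a sequence similarity (S2) algorithm that is independent of sequence length. With the proposed algorithm, sequences can be compared fairly without being affected by their lengths. To validate the proposed algorithm, we measure the similarity of program source code.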
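
The length dependence described in this abstract can be seen in a minimal sketch. The code uses the longest common subsequence (LCS) as a stand-in DP score and a simple normalization by the combined length; both are assumptions for illustration and are not the S2 algorithm, whose definition the abstract does not give.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, computed row by row with DP."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            if x == y:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    # The last cell of the DP matrix is the raw, length-dependent score.
    return prev[-1]


def normalized_similarity(a: str, b: str) -> float:
    """Scale the raw LCS score into [0, 1] so that sequence length does not dominate.

    This normalization is a hypothetical illustration, not the S2 measure.
    """
    if not a and not b:
        return 1.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))
```

Two identical 4-base sequences and two identical 8-base sequences get raw scores of 4 and 8, although both pairs are equally similar; after normalization both pairs score 1.0.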


Determination of Nucleotide Sequences of cDNA from Cucumber Mosaic Virus-As RNA4 (As계의 오이 모자이크 바이러스 RNA4의 염기서열 결정)

  • 김상현;박원목;이세영;박영인
    • Korean Journal Plant Pathology / v.12 no.2 / pp.176-181 / 1996
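  • A full-length cDNA was synthesized from RNA4 of a cucumber mosaic virus isolated from Aster yomena (CMV-As), and its complete nucleotide sequence (1,043 nt) was determined. CMV-As RNA4 was found to consist of a 5' leader region of 73 nucleotides, a coat protein gene region of 657 nucleotides, and a 3' untranslated region of 312 nucleotides. Comparison with other CMV strains showed that the nucleotide sequence of the coat protein gene region was conserved, whereas the remaining regions varied. In particular, a 61-nucleotide region at the 3' end (959-1019) was highly similar among the other CMV strains, and CMV-As, like other CMVs, was also found to form a tRNA-like structure. When the RNA4 nucleotide sequence of CMV-As was compared with those of other CMV strains, it was most similar to CMV-I17F (91.9%) and showed the lowest identity with the S-type CMV-M (71.1%). Based on these sequence comparisons and the presence of an EcoRI restriction site, CMV-As is classified as WT type.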
