• Title/Summary/Keyword: whole genome (전체 유전체)

The Analysis of Genome Database Compaction based on Sequence Similarity (시퀀스 유사도에 기반한 유전체 데이터베이스 압축 및 영향 분석)

  • Kwon, Sunyoung;Lee, Byunghan;Park, Seunghyun;Jo, Jeonghee;Yoon, Sungroh
    • KIISE Transactions on Computing Practices / v.23 no.4 / pp.250-255 / 2017
  • Given the explosion of genomic data and the expansion of applications such as precision medicine, the importance of efficient genome-database management continues to grow. Traditional compression techniques may reduce the size of a database effectively, but they make operations such as comparison and search on the compressed database harder. Based on the observations that many genome databases contain numerous duplicated or similar sequences, and that the runtime of genome analyses is normally proportional to the number of sequences in a database, we propose a technique that compresses a genome database by eliminating similar entries. Our experiments show that we can remove approximately 84% of sequences with a 1% similarity threshold, accelerating downstream classification tasks by approximately 10 times. We also confirm that our compression method does not significantly affect the accuracy of taxonomy diversity assessments or classification.
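
The idea of compacting a database by dropping near-duplicate entries can be sketched as a greedy filter. This is only an illustration: the abstract does not say which similarity measure or procedure the authors use, so the k-mer Jaccard distance, the function names, and the `max_distance` parameter below are all assumptions.

```python
def kmers(seq, k=4):
    """Return the set of k-mers in seq (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def compact(sequences, max_distance=0.01, k=4):
    """Greedy pass: keep a sequence only if every already-kept sequence
    is farther away than max_distance (hypothetical threshold)."""
    kept = []
    kept_kmers = []
    for seq in sequences:
        ks = kmers(seq, k)
        if all(jaccard_distance(ks, other) > max_distance for other in kept_kmers):
            kept.append(seq)
            kept_kmers.append(ks)
    return kept


database = ["ACGTACGT", "ACGTACGT", "TTTTGGGG"]
print(compact(database))
```

Because each candidate is compared only against the retained entries, downstream tasks whose cost scales with the number of sequences see a smaller database, which is the effect the abstract describes.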

Development of Workbench for Analysis and Visualization of Whole Genome Sequence (전유전체(Whole genome) 서열 분석과 가시화를 위한 워크벤치 개발)

  • Choe, Jeong-Hyeon;Jin, Hui-Jeong;Kim, Cheol-Min;Jang, Cheol-Hun;Jo, Hwan-Gyu
    • The KIPS Transactions:PartA / v.9A no.3 / pp.387-398 / 2002
  • As the whole genome sequences of many organisms have been revealed by small-scale genome projects, intensive research on individual genes and their functions has been carried out. However, in-memory algorithms are inefficient for analyzing whole genome sequences, since the size of an individual whole genome ranges from several million base pairs to hundreds of billions of base pairs. To manipulate such huge sequence data effectively, an indexed data structure for external memory is necessary. In this paper, we introduce a workbench system for the analysis and visualization of whole genome sequences using the string B-tree, which is suitable for analyzing huge data. The system consists of two parts: an analysis query part and a visualization part. The query system supports various operations such as sequence search, k-occurrence, and k-mer analysis. The visualization system helps biologists easily understand the overall structure and specificity of a genome through many kinds of visualization, such as whole genome sequence, annotation, CGR (Chaos Game Representation), k-mer, and RWP (Random Walk Plot). With our workbench, one can find relations among organisms, predict the genes in a genome, and study the function of junk DNA.
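
Two of the analyses named in the abstract, k-mer counting and the Chaos Game Representation, can be sketched in a few lines. This is an in-memory illustration only and does not use a string B-tree; the corner assignment for A, C, G, and T is one common convention and is an assumption here, as are the function names.

```python
from collections import Counter

# Assumed corner layout of the unit square (one common convention).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cgr_points(seq):
    """Chaos Game Representation: start at the centre of the square and,
    for each base, move halfway toward that base's corner."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # skip ambiguous symbols such as N
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


print(kmer_counts("ACGAC", 2))
print(cgr_points("AG"))
```

Plotting the points returned by `cgr_points` for a long sequence gives the CGR image; dense regions correspond to over-represented k-mers.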

A Survey of Sequence Alignment Algorithms (서열 정렬 알고리즘의 연구 동향)

  • 성종희;김동규
    • Proceedings of the Korea Multimedia Society Conference / 2003.05b / pp.571-574 / 2003
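  • Sequence alignment is widely used in molecular biology and related fields because it facilitates the functional, structural, and evolutionary analysis of new sequences. Research on sequence alignment algorithms has been active to date. In particular, with the exponential growth of biological data and the increasing number of species whose whole genome sequences have been analyzed, algorithms that perform sequence alignment faster and more accurately have become necessary. This paper surveys research trends in sequence alignment algorithms, from dynamic programming approaches to algorithms for whole genome sequences.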
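
As a concrete instance of the dynamic programming approach that such surveys start from, the following is a minimal sketch of Needleman-Wunsch global alignment scoring. The scoring values (`match`, `mismatch`, `gap`) are illustrative assumptions, not values taken from the paper.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of a and b
    under a linear gap penalty."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and row: aligning a prefix against nothing costs gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]


print(needleman_wunsch_score("ACGT", "AGT"))
```

The table has (len(a)+1) x (len(b)+1) cells, so time and memory grow with the product of the sequence lengths, which is why whole-genome comparison motivates faster methods.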

A Study on a tool to generate polymorphic genome and metagenome sequences (다염기변이 및 메타유전체 염기서열 생성도구에 관한 연구)

  • Kim, Jonghyun;Kim, Woocheol;Park, Sanghyun
    • Proceedings of the Korea Information Processing Society Conference / 2007.11a / pp.262-263 / 2007
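  • The most fundamental basis of genomics is determining nucleotide sequences accurately. Many eukaryotes carry two homologous chromosomes, and the sequences of the two chromosomes differ. With current genome sequencing methods, it is difficult to determine a genome sequence when many nucleotide variations are present. Simultaneously determining the genome sequences of the countless microorganisms inhabiting a particular site is also recognized as an important problem in microbiology, but the degree of nucleotide variation among microorganisms is more complex than in a single organism, which makes effective sequence determination difficult. The development of methodologies that can determine the sequences of highly polymorphic organisms and of microbial communities is therefore urgently needed. This paper aims to develop a tool that generates genome sequences and simulation-based reads for evaluating the accuracy of assembled polymorphic genome and metagenome sequences.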
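
A read generator of the kind described can be sketched as follows. This is a simplified assumption-laden illustration, not the authors' tool: it models polymorphism as independent SNPs at a fixed rate, draws error-free reads uniformly from the haplotypes, and all names and parameters are hypothetical.

```python
import random


def mutate(reference, snp_rate, rng):
    """Create a second haplotype by substituting each base with
    probability snp_rate (substitutions only, no indels)."""
    out = []
    for base in reference:
        if rng.random() < snp_rate:
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


def sample_reads(haplotypes, read_length, n_reads, rng):
    """Draw error-free reads uniformly at random from the haplotypes."""
    reads = []
    for _ in range(n_reads):
        hap = rng.choice(haplotypes)
        start = rng.randrange(len(hap) - read_length + 1)
        reads.append(hap[start:start + read_length])
    return reads


rng = random.Random(0)          # fixed seed for reproducibility
reference = "ACGT" * 25
alternate = mutate(reference, 0.05, rng)
reads = sample_reads([reference, alternate], 20, 10, rng)
print(len(reads), len(reads[0]))
```

Because the true haplotypes are known, an assembly built from the simulated reads can be compared against them to measure accuracy.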

Advances of Genome Research in Livestock Animals (경제동물 유전체학 연구의 최근 연구 동향)

  • Song, Ki-Duk;Cho, Byung-Wook
    • Journal of Life Science / v.18 no.4 / pp.572-579 / 2008
  • Genome research in economic animals has progressed rapidly in recent years, moving from primitive genome maps to quantitative/qualitative trait maps that are indispensable to gene discovery. These advances have benefited from animal genome sequencing projects and from functional genomics, which has been applied extensively in livestock research following the development of large collections of expressed sequence tags (ESTs). Genome sequencing efforts will inform QTL studies through larger-scale single nucleotide polymorphism (SNP) association studies. Comparative genomics, which applies information from human genome research as well as rodent models, has contributed to important discoveries in economic animal genome research. These efforts will speed the development of much denser QTL maps for phenotypic traits that are difficult to measure and to identify by quantitative genetics [20], and will lead to convincing markers associated with economically important traits, which will eventually be applied in the livestock industry. Beyond practical applications, animal genome research will enrich the understanding of human physiology in terms of genome biology.

A Design Method for Dielectric-slab Waveguide Polarizer With A Low Return Loss over A Wideband Frequencies (광대역에서 저반사 손실을 가지는 도파관 유전체 편파기 설계 기법)

  • 김도균;이석곤;이종대;안병철
    • Proceedings of the Korea Electromagnetic Engineering Society Conference / 2001.11a / pp.161-164 / 2001
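  • This paper presents a design method for a circular-waveguide dielectric polarizer for dual-band antenna feeds that has good circular polarization characteristics in a given frequency band and a low return loss over a wide band. A method for determining the dielectric slab thickness and the circular waveguide diameter needed for good polarization characteristics is presented. A dielectric shape design method for achieving a low return loss over a wide band, for use in dual-band antenna feeds, is also presented.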

A Genomics Tool for Microbial Genome Comparison Using BLAST/FASTA (BLAST/FASTA를 활용한 미생물 유전체 비교용 도구의 개발)

  • Tae, Hongseok;Lee, Daesang;Park, Wan;Park, Kiejung
    • Korean Journal of Microbiology / v.38 no.4 / pp.267-275 / 2002
  • We have developed GComp as an analysis tool for microbial genome comparison. The tool uses BLAST or FASTA as a preprocessing program for local alignments to detect homologous regions, parses the homology search results, and generates tables and files that show the homology relationship between two genomes at a glance. An interface for graphical representation of the comparative genomic analysis has also been implemented. Our test cases show that the program can be useful in practice for intuitive and quantitative comparison of microbial genome sequence pairs as well as for self-genome analysis. A few additional features have been devised and designed and will be added in further development.

Chloroplast genome sequence and PCR-based markers for S. cardiophyllum (감자 근연야생종 Solanum cardiophyllum의 엽록체 전장유전체 구명 및 이를 이용한 S. cardiophyllum 특이적 분자마커의 개발)

  • Tae-Ho Park
    • Journal of Plant Biotechnology / v.50 / pp.45-55 / 2023
  • The diploid Solanum cardiophyllum, a wild tuber-bearing species from Mexico, is one of the relatives of the potato, S. tuberosum. It has been identified as a source of resistance to important pathogens and insects such as Phytophthora infestans, Potato virus Y, and the Colorado potato beetle, and is widely used in potato breeding. However, sexual hybridization between S. cardiophyllum and S. tuberosum is limited by their incompatibility. Somatic hybridization can therefore be used to introduce beneficial traits from this wild species into the potato. After somatic hybridization, selecting fusion products using molecular markers is essential. In the current study, the chloroplast genome of S. cardiophyllum was sequenced by next-generation sequencing technology and compared with those of other Solanum species to develop S. cardiophyllum-specific markers. The total length of the S. cardiophyllum chloroplast genome was 155,570 bp, and its size, gene content, order, and orientation were similar to those of the other Solanum species. Phylogenetic analysis with 32 other Solanaceae species revealed that S. cardiophyllum grouped, as expected, with other Solanum species and was most closely related to S. bulbocastanum. Through detailed comparisons of the chloroplast genome sequences of eight Solanum species, we identified 13 SNPs specific to S. cardiophyllum. Further, four SNP-specific PCR markers were developed for discriminating S. cardiophyllum from other Solanum species. The results of this study should help to explore the evolutionary aspects of Solanum species and accelerate breeding using S. cardiophyllum.
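
Finding positions where one species differs from all others can be sketched as a column scan over an alignment. This is a hypothetical illustration that assumes the sequences are already aligned to equal length with `-` for gaps; the abstract does not describe the authors' actual comparison pipeline, and the function name is invented here.

```python
def specific_positions(target, others):
    """Return 0-based alignment columns where the target's base is not a gap
    and is not shared by any other sequence (gap columns are skipped)."""
    positions = []
    for i, base in enumerate(target):
        column = [seq[i] for seq in others]
        if base != "-" and "-" not in column and base not in column:
            positions.append(i)
    return positions


# Toy aligned sequences, not real chloroplast data.
target = "ACGTAC"
others = ["ACGAAC", "ACGAAC", "ACGCAC"]
print(specific_positions(target, others))
```

Columns returned this way are candidate species-specific SNPs; designing PCR markers around them is a separate step that the sketch does not cover.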