• Title/Summary/Keyword: Haplotype phasing

Short Reads Phasing to Construct Haplotypes in Genomic Regions That Are Associated with Body Mass Index in Korean Individuals

  • Lee, Kichan;Han, Seonggyun;Tark, Yeonjeong;Kim, Sangsoo
    • Genomics & Informatics / v.12 no.4 / pp.165-170 / 2014
  • Genome-wide association (GWA) studies have found many important genetic variants that affect various traits. Because these studies can be used to investigate untyped but causal variants through linkage disequilibrium (LD), it is worthwhile to explore the haplotypes of single-nucleotide polymorphisms (SNPs) within the same LD block as significant associations, based on high-density variants from population references. Here, we attempted to build a catalog of haplotypes affecting body mass index (BMI) through an integrative analysis of previously published whole-genome next-generation sequencing (NGS) data from 7 representative Korean individuals and previously known Korean GWA signals. We selected 435 SNPs that were significantly associated with BMI in the GWA analysis and searched 53 LD ranges near those SNPs. With the NGS data, the haplotypes were phased within the LD ranges. A total of 44 possible haplotype blocks for Korean BMI were cataloged. Although the current result rests on limited data, this study provides new insights that may help to identify important haplotypes for traits and low variants near significant SNPs. Furthermore, we can build a more comprehensive catalog as a larger dataset becomes available.

A Comparative Analysis of the Illumina Truseq Synthetic Long-read Haplotyping Sequencing Platform versus the 10X Genomics Chromium Genome Sequencing Platform for Haplotype Phasing and the Identification of Single-nucleotide variants (SNVs) in Hanwoo (Korean Native Cattle)

  • Park, Woncheoul;Srikanth, Krishnamoorthy;Park, Jong-Eun;Shin, Donghyun;Ko, Haesu;Lim, Dajeong;Cho, In-Cheol
    • Journal of Life Science / v.29 no.1 / pp.1-8 / 2019
  • In Hanwoo cattle (Korean native cattle), there is a scarcity of comparative analysis papers using high-depth sequencing and haplotype phasing, particularly a comparative analysis of the Truseq Synthetic Long-Read Haplotyping sequencing platform serviced by Illumina (TSLRH) versus the Chromium Genome Sequencing platform serviced by 10X Genomics (10XG). DNA was extracted from the sperm of a Hanwoo breeding bull (ID: TN1505D2184/27214) provided by the Hanwoo research center and used to generate sequence data on both sequencing platforms. We then identified SNVs using an analysis pipeline tailored to each platform. The TSLRH and 10XG platforms generated a total of 355,208,304 and 1,632,772,004 reads, respectively, with a Q30 of 89.04% and 88.60%, respectively, of which 351,992,768 (99.09%) and 1,526,641,824 (93.50%) were successfully mapped. For the TSLRH and 10XG platforms, respectively, the mean sequencing depth was 13.04X and 74.3X, the longest phase block was 1,982,706 bp and 1,480,081 bp, the N50 phase block was 57,637 bp and 114,394 bp, the total number of SNVs identified was 4,534,989 and 8,496,813, and the total phased rate was 72.29% and 87.67%. Moreover, for each chromosome, we identified the SNVs that were unique to one platform and those common to both. The number of SNVs was directly proportional to the length of the chromosome. Based on our results, we recommend the 10XG platform for haplotype phasing and SNV identification, as it generated a longer N50 phase block, as well as a higher mean depth, total number of reads, total number of SNVs, and phased rate, than the TSLRH platform.
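The abstract above compares platforms by their N50 phase block. As a reminder of what that statistic measures, here is a minimal sketch in Python. It assumes the common definition of N50 (the block length at which blocks of that size or larger cover at least half of the total phased length); the function name `n50` and the example lengths are hypothetical and are not taken from the paper.

```python
def n50(block_lengths):
    """Return the N50 of a collection of phase-block lengths (in bp).

    Assumed definition: sort blocks from longest to shortest and return
    the length of the block at which the running total first reaches
    half of the summed length of all blocks.
    """
    if not block_lengths:
        raise ValueError("need at least one block length")
    total = sum(block_lengths)
    running = 0
    for length in sorted(block_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical block lengths, for illustration only.
example_blocks = [100, 80, 50, 30, 20, 10]
print(n50(example_blocks))  # 80
```

Because N50 is weighted by length, a platform can have a shorter longest block yet a larger N50, which is the pattern reported for 10XG here.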

Difference in Haplotype Phasing According to the Use of Quality Information

  • Lee, Jong-Chan;Na, Joong Chae
    • Proceedings of the Korea Information Processing Society Conference / 2017.04a / pp.13-16 / 2017
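  • Haplotype phasing, which infers haplotypes from the SNP sequence information of human genes, is an important research area in biotechnology. As the volume of SNP data has grown in recent years, many haplotype phasing algorithms have been proposed. In this paper, we discuss the limitations of haplotype phasing caused by errors in SNP data and the issues involved in using quality information to overcome them, and then examine through experiments how quality information affects the results of haplotype phasing. The experiments used an existing haplotype phasing algorithm and compared its results with and without quality information. The results show that using quality information during haplotype phasing gave better results than not using it.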
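One way quality information can enter a phasing objective is by weighting each read-versus-haplotype mismatch by how reliable the base call is. The sketch below illustrates that idea with a minimum error correction (MEC) style score. It is an assumption-laden illustration, not the algorithm or scoring used in the paper: the fragment representation, the function names, and the weight `1 - error probability` are all hypothetical choices.

```python
def phred_to_error(q):
    """Convert a Phred quality score to an error probability."""
    return 10 ** (-q / 10)


def mismatch_cost(fragment, haplotype, use_quality):
    """Cost of assigning one fragment to one haplotype.

    fragment: dict mapping SNP index -> (allele, phred_quality),
    with alleles coded 0/1. Each mismatch costs 1 without quality,
    or (1 - error probability) with quality (an assumed weighting).
    """
    cost = 0.0
    for pos, (allele, qual) in fragment.items():
        if allele != haplotype[pos]:
            cost += (1.0 - phred_to_error(qual)) if use_quality else 1.0
    return cost


def mec(fragments, haplotype, use_quality=False):
    """MEC-style score: each fragment goes to the closer of the two
    complementary haplotypes, and the costs are summed."""
    complement = [1 - a for a in haplotype]
    return sum(
        min(mismatch_cost(f, haplotype, use_quality),
            mismatch_cost(f, complement, use_quality))
        for f in fragments
    )


# Hypothetical data: one high-quality fragment and two low-quality ones.
fragments = [
    {0: (0, 40), 1: (0, 40)},
    {0: (0, 2), 1: (1, 2)},
    {0: (0, 2), 1: (1, 2)},
]
candidate_a = [0, 0]
candidate_b = [0, 1]

for name, hap in (("A", candidate_a), ("B", candidate_b)):
    print(name, mec(fragments, hap), mec(fragments, hap, use_quality=True))
```

In this toy input, the unweighted score prefers candidate B (two low-quality fragments outvote one high-quality fragment), while the quality-weighted score prefers candidate A. This shows only that the two objectives can disagree; which one is closer to the truth on real data is what the paper's experiments address.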

A Greedy Algorithm for Haplotype Phasing

  • Kim, Eun-Kwang;Na, Joong Chae
    • Proceedings of the Korea Information Processing Society Conference / 2015.10a / pp.41-44 / 2015
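  • Haplotype phasing refers to reading only the heterozygous SNPs from the DNA sequences of homologous chromosomes to form fragments, and then combining those fragments to determine the haplotypes. The number of possible fragment combinations is enormous, so an efficient algorithm is needed to determine the haplotypes. In this paper, using the datasets provided for comparing the accuracy of haplotype phasing algorithms, we analyzed how close to the correct answer the haplotypes obtained with a greedy algorithm are. In the experiments, the algorithm achieved a fairly high accuracy of about 80% on all datasets. We further analyze the causes of the low accuracy observed in some regions.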
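To make the fragment-combining idea concrete, the following is a minimal greedy sketch in Python. It is not the algorithm evaluated in the paper, whose details the abstract does not give. The assumptions here: each fragment is a dict mapping a heterozygous SNP index to an allele coded 0/1, the longest fragment seeds the haplotype, and each subsequent fragment is oriented by majority agreement on shared sites. The name `greedy_phase` and the input data are hypothetical.

```python
def greedy_phase(fragments):
    """Greedily merge fragments into one haplotype (dict: SNP index -> allele).

    The second haplotype is the complement of the returned one.
    Fragments that never overlap the growing haplotype are left out,
    since they would belong to a separate phase block.
    """
    if not fragments:
        return {}
    # Seed with the longest fragment.
    remaining = sorted(fragments, key=len, reverse=True)
    hap = dict(remaining.pop(0))
    while remaining:
        # Greedy choice: the fragment sharing the most sites with hap.
        best = max(remaining, key=lambda f: len(f.keys() & hap.keys()))
        shared = best.keys() & hap.keys()
        if not shared:
            break  # nothing left that connects to this block
        remaining.remove(best)
        agree = sum(best[p] == hap[p] for p in shared)
        # If most shared sites disagree, the fragment comes from the
        # complementary haplotype, so flip its alleles before merging.
        flip = agree < len(shared) - agree
        for pos, allele in best.items():
            hap.setdefault(pos, 1 - allele if flip else allele)
    return hap


# Hypothetical fragments over five heterozygous SNPs.
fragments = [
    {0: 0, 1: 1, 2: 0},
    {2: 1, 3: 0},
    {3: 1, 4: 1},
    {1: 1, 2: 0, 3: 1},
]
print(greedy_phase(fragments))  # {0: 0, 1: 1, 2: 0, 3: 1, 4: 1}
```

A greedy merge like this never revisits an earlier decision, so a single erroneous fragment placed early can propagate a wrong orientation to everything attached after it. That is one plausible source of low-accuracy regions, though the abstract does not state what causes the paper identified.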

Birth of an 'Asian cool' reference genome: AK1

  • Kim, Changhoon
    • BMB Reports / v.49 no.12 / pp.653-654 / 2016
  • The human reference genome, maintained by the Genome Reference Consortium, is conceivably the most complete genome assembly ever, since its first construction. It has continually been improved by incorporating corrections made to the previous assemblies, thanks to various technological advances. Many currently-ongoing population sequencing projects have been based on this reference genome, heightening hopes of the development of useful medical applications of genomic information, thanks to the recent maturation of high-throughput sequencing technologies. However, just one reference genome does not fit all the populations across the globe, because of the large diversity in genomic structures and technical limitations inherent to short read sequencing methods. The recent success in de novo construction of the highly contiguous Asian diploid genome AK1, by combining single molecule technologies with routine sequencing data without resorting to traditional clone-by-clone sequencing and physical mapping, reveals the nature of genomic structure variation by detecting thousands of novel structural variations and by finally filling in some of the prior gaps which had persistently remained in the current human reference genome. Now it is expected that the AK1 genome, soon to be paired with more upcoming de novo assembled genomes, will provide a chance to explore what it is really like to use ancestry-specific reference genomes instead of hg19/hg38 for population genomics. This is a major step towards the furthering of genetically-based precision medicine.