• Title/Summary/Keyword: haplotype assembly

Haplotype Assembly from Weighted SNP Fragments and Related Genotype Information (신뢰도를 가진 SNP 단편들과 유전자형으로부터 일배체형 조합)

  • Kang, Seung-Ho; Jeong, In-Seon; Choi, Mun-Ho; Lim, Hyeong-Seok
    • Journal of KIISE: Computer Systems and Theory / v.35 no.11 / pp.509-516 / 2008
  • The Minimum Letter Flips (MLF) model and the Weighted Minimum Letter Flips (WMLF) model are used to solve the haplotype assembly problem, but both are effective only when the error rate in the SNP fragments is low. In this paper, we first establish a new computational model that improves on the WMLF model by employing the related genotype information, and show that it is NP-hard. We then propose an efficient genetic algorithm to solve the haplotype assembly problem. Experiments on a random data set and a real data set indicate that introducing genotype information into the WMLF model is quite effective in improving the reconstruction rate, especially when the error rate in the SNP fragments is high. The results also show that genotype information increases the convergence speed of the genetic algorithm.
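
As a rough illustration of the kind of objective a weighted letter-flip model with genotype information optimizes, the sketch below scores a candidate haplotype pair. It is not the authors' formulation or their genetic algorithm. The encoding (alleles as 0/1, `None` for gaps, genotype codes 0/1/2) and all function names are assumptions made for this example.

```python
from typing import List, Optional, Sequence

Fragment = List[Optional[int]]   # allele 0/1 per SNP site, None marks a gap
Weights = List[float]            # per-entry confidence (unused at gaps)


def consistent_with_genotype(h1: Sequence[int], h2: Sequence[int],
                             genotype: Sequence[int]) -> bool:
    """Assumed genotype codes: 0 = homozygous 0, 1 = homozygous 1, 2 = heterozygous."""
    for a, b, g in zip(h1, h2, genotype):
        if g == 2 and a == b:
            return False            # heterozygous site needs two different alleles
        if g in (0, 1) and not (a == b == g):
            return False            # homozygous site needs both alleles equal to g
    return True


def weighted_flip_cost(fragment: Fragment, weights: Weights,
                       hap: Sequence[int]) -> float:
    """Sum of the confidences of the entries that must be flipped to match hap."""
    return sum(w for a, w, h in zip(fragment, weights, hap)
               if a is not None and a != h)


def wmlf_cost(fragments: List[Fragment], weights: List[Weights],
              h1: Sequence[int], h2: Sequence[int]) -> float:
    """Each fragment is charged against whichever haplotype is cheaper to match."""
    return sum(min(weighted_flip_cost(f, w, h1), weighted_flip_cost(f, w, h2))
               for f, w in zip(fragments, weights))


# Toy data (made up for illustration only).
fragments = [[0, 1, None], [1, 0, 1], [0, 1, 0]]
weights = [[0.9, 0.8, 0.0], [0.7, 0.9, 0.6], [0.9, 0.9, 0.5]]
genotype = [2, 2, 1]
h1, h2 = [0, 1, 1], [1, 0, 1]

feasible = consistent_with_genotype(h1, h2, genotype)
total = wmlf_cost(fragments, weights, h1, h2)
print(feasible, total)
```

In this sketch the genotype acts as a hard feasibility filter on candidate pairs, which is one plausible way such information could shrink the search space; the paper may incorporate it differently.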

Solving the Haplotype Assembly Problem for Human Using the Improved Branch and Bound Algorithm (개선된 분기한정 알고리즘을 이용한 인간 유전체의 일배체형 조합문제 해결)

  • Choi, Mun-Ho; Kang, Seung-Ho; Lim, Hyeong-Seok
    • KIPS Transactions on Software and Data Engineering / v.2 no.10 / pp.697-704 / 2013
  • The identification of haplotypes, which encode the SNPs on a single chromosome, makes it possible to perform haplotype-based association tests with diseases. The Minimum Error Correction model, one of the models for computationally assembling a pair of haplotypes for a given organism from Single Nucleotide Polymorphism fragments, is known to be NP-hard even for gapless cases. In previous work, an improved branch and bound algorithm was proposed and shown, through experiments on an Apis mellifera (honeybee) data set, to be more efficient than the naive branch and bound algorithm. In this paper, to show that the algorithm extends to other organisms, we apply the improved branch and bound algorithm to a human data set and confirm its efficiency.
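
To make the Minimum Error Correction objective concrete, here is a minimal sketch that evaluates it and finds an optimum by plain exhaustive search. This is not the improved branch and bound algorithm from the paper. It assumes every site is heterozygous, so the second haplotype is the complement of the first. The 0/1/`None` encoding and the function names are likewise assumptions for illustration.

```python
from itertools import product
from typing import List, Optional, Sequence, Tuple

Fragment = List[Optional[int]]   # allele 0/1 per SNP site, None marks a gap


def mismatches(fragment: Fragment, hap: Sequence[int]) -> int:
    """Number of non-gap entries that would have to be corrected to match hap."""
    return sum(1 for a, h in zip(fragment, hap) if a is not None and a != h)


def mec_cost(fragments: List[Fragment], hap: Sequence[int]) -> int:
    """MEC cost of the pair (hap, complement of hap)."""
    comp = [1 - x for x in hap]
    return sum(min(mismatches(f, hap), mismatches(f, comp)) for f in fragments)


def exhaustive_mec(fragments: List[Fragment], n_sites: int) -> Tuple[List[int], int]:
    """Try every haplotype; exponential in n_sites, so only usable on toy inputs."""
    best_hap: List[int] = []
    best_cost = -1
    # Fixing the first allele to 0 is enough: the complement covers the other half.
    for bits in product((0, 1), repeat=n_sites - 1):
        hap = [0, *bits]
        cost = mec_cost(fragments, hap)
        if best_cost < 0 or cost < best_cost:
            best_hap, best_cost = hap, cost
    return best_hap, best_cost


# Toy data (made up): the fourth fragment carries one sequencing error.
fragments = [
    [0, 0, 1, None],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [None, 1, 0, 0],
]
best_hap, best_cost = exhaustive_mec(fragments, n_sites=4)
print(best_hap, best_cost)
```

A branch and bound method explores the same search space but prunes partial assignments whose cost already exceeds the best complete solution found so far. The exponential loop above is the baseline that such pruning improves on.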

The Correctness Comparison of MCIH Model and WMLF/GI Model for the Individual Haplotyping Reconstruction (일배체형 재조합을 위한 MCIH 모델과 WMLF/GI 모델의 정확도 비교)

  • Jeong, In-Seon; Kang, Seung-Ho; Lim, Hyeong-Seok
    • The KIPS Transactions: Part B / v.16B no.2 / pp.157-161 / 2009
  • By introducing the related genotype information, Minimum Letter Flips (MLF) and Weighted Minimum Letter Flips (WMLF) can reconstruct haplotypes more accurately from SNP fragments that contain many errors and gaps. It is also known that WMLF is more accurate in haplotype reconstruction than models based on MLF. In this paper, we analyze the two models under different rates of homozygous sites in the genotype information and different confidence levels according to the sequencing quality. We compare the performance of the two models using a neural network and a genetic algorithm. The experimental results indicate that, when the rate of homozygous sites is high and the sequencing quality is good, WMLF/GI reconstructs haplotypes more accurately than MCIH, especially when the error rate and gap rate of the SNP fragments are high.

Birth of an 'Asian cool' reference genome: AK1

  • Kim, Changhoon
    • BMB Reports / v.49 no.12 / pp.653-654 / 2016
  • The human reference genome, maintained by the Genome Reference Consortium, is conceivably the most complete genome assembly ever, since its first construction. It has continually been improved by incorporating corrections made to the previous assemblies, thanks to various technological advances. Many currently-ongoing population sequencing projects have been based on this reference genome, heightening hopes of the development of useful medical applications of genomic information, thanks to the recent maturation of high-throughput sequencing technologies. However, just one reference genome does not fit all the populations across the globe, because of the large diversity in genomic structures and technical limitations inherent to short read sequencing methods. The recent success in de novo construction of the highly contiguous Asian diploid genome AK1, by combining single molecule technologies with routine sequencing data without resorting to traditional clone-by-clone sequencing and physical mapping, reveals the nature of genomic structure variation by detecting thousands of novel structural variations and by finally filling in some of the prior gaps which had persistently remained in the current human reference genome. Now it is expected that the AK1 genome, soon to be paired with more upcoming de novo assembled genomes, will provide a chance to explore what it is really like to use ancestry-specific reference genomes instead of hg19/hg38 for population genomics. This is a major step towards the furthering of genetically-based precision medicine.

Polymorphism of NLRP3 Gene and Association with Susceptibility to Digestive Disorders in Rabbit

  • Yang, Yu; Zhang, Gong-Wei; Chen, Shi-Yi; Peng, Jin; Lai, Song-Jia
    • Asian-Australasian Journal of Animal Sciences / v.26 no.4 / pp.455-462 / 2013
  • NLR family pyrin domain containing 3 (NLRP3) is a key component of the inflammasome, whose assembly is a crucial part of the innate immune response. The aim of the present study was to evaluate the association between exon 3 polymorphisms of NLRP3 and susceptibility to digestive disorders in rabbits. In total, five coding single-nucleotide polymorphisms (cSNPs) were identified, all of which are synonymous. Among them, c.456 C> and c.594 G> were further genotyped for association analysis based on a case-control design (n = 162 vs. n = 102). In addition, digestive disorders were experimentally induced in growing rabbits by feeding a fiber-deficient diet, and the rabbits were then subjected to mRNA expression analysis. Association analysis revealed that haplotype H1 (the two cSNPs: GT) played a potential protective role against digestive disorders (p<0.001). The expression of NLRP3 in the group $H1HX_1$ ($H1HX_1$ is composed of H1H1, H1H3 and H1H4) was the lowest among the four groups, which were classified by diplotype. These results suggest that the NLRP3 gene is significantly associated with susceptibility to digestive disorders in rabbits.

Identification of Compound Heterozygous Alleles in a Patient with Autosomal Recessive Limb-Girdle Muscular Dystrophy (상염색체 열성 지대형 근이영양증 환자로부터 TTN 유전자의 복합 이형접합성 대립유전자의 분리)

  • Choi, Hee Ji; Lee, Soo Bin; Kwon, Hye Mi; Choi, Byung-Ok; Chung, Ki Wha
    • Journal of Life Science / v.31 no.10 / pp.913-921 / 2021
  • Limb-girdle muscular dystrophy (LGMD), which is characterized by progressive weakening of the hip and shoulder muscles, shows both dominant and recessive inheritance and has many pathogenic genes, including TTN. This study was performed to identify the genetic causes in a male patient with late-onset (45 years old) autosomal recessive LGMD and atrial flutter. By whole exome sequencing, we identified bi-allelic variants of the TTN gene in the patient. One allele had a single missense variant [c.24124G>T (p.V8042F)], while the other allele consisted of three missense variants [c.29222G>C (p.R9741P) + c.67490A>G (p.H22497R) + c.75376C>T (p.R25126C)]. The p.V8042F allele was transmitted from his mother, while the other haplotype allele was putatively transmitted from his father. His two unaffected sons had only the p.R9741P variant. These variants have not been reported or have rarely been reported in the public human genome databases (1000 Genomes, gnomAD, and KRGDB). Most variants were located in the highly conserved immunoglobulin or fibronectin domains and were predicted to be pathogenic by in silico analyses. The giant protein TTN plays a key role in muscle assembly, force transmission at the Z-line, and maintenance of resting tension in the I-band. In conclusion, we think that these bi-allelic compound heterozygous mutations may play a role as the genetic causes of the LGMD phenotype.