• Title/Summary/Keyword: Haplotype Assembly Problem

Haplotype Assembly from Weighted SNP Fragments and Related Genotype Information (신뢰도를 가진 SNP 단편들과 유전자형으로부터 일배체형 조합)

  • Kang, Seung-Ho; Jeong, In-Seon; Choi, Mun-Ho; Lim, Hyeong-Seok
    • Journal of KIISE: Computer Systems and Theory, v.35 no.11, pp.509-516, 2008
  • The Minimum Letter Flips (MLF) model and the Weighted Minimum Letter Flips (WMLF) model are used to solve the haplotype assembly problem, but both are effective only when the error rate in the SNP fragments is low. In this paper, we first establish a new computational model that improves on the WMLF model by incorporating related genotype information, and we show that it is NP-hard. We then propose an efficient genetic algorithm to solve the haplotype assembly problem. Experiments on a random data set and a real data set indicate that introducing genotype information into the WMLF model is quite effective in improving the reconstruction rate, especially when the error rate in the SNP fragments is high. The results also show that genotype information speeds up the convergence of the genetic algorithm.
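
The abstract does not spell out the objective function, so the following is only a minimal sketch of how a weighted letter-flip cost with a genotype constraint is commonly set up, not the authors' formulation. All names (`fragment_cost`, `wmlf_cost`, `compatible_with_genotype`) and the genotype encoding (0 or 1 for a homozygous site, 2 for a heterozygous site) are assumptions made for illustration.

```python
from typing import Optional, Sequence

# A fragment holds one allele (0 or 1) per SNP site, or None where it has a gap.
Fragment = Sequence[Optional[int]]


def fragment_cost(frag: Fragment, weights: Sequence[float], hap: Sequence[int]) -> float:
    """Sum the confidence weights of the sites where frag disagrees with hap."""
    return sum(w for a, w, h in zip(frag, weights, hap) if a is not None and a != h)


def wmlf_cost(frags, weights, h1, h2) -> float:
    """Charge each fragment against whichever haplotype it is closer to."""
    return sum(
        min(fragment_cost(f, w, h1), fragment_cost(f, w, h2))
        for f, w in zip(frags, weights)
    )


def compatible_with_genotype(h1, h2, genotype) -> bool:
    """Assumed encoding: genotype[j] is 0 or 1 (homozygous) or 2 (heterozygous)."""
    for a, b, g in zip(h1, h2, genotype):
        if g == 2:
            if a == b:  # a heterozygous site needs two different alleles
                return False
        elif a != g or b != g:  # a homozygous site fixes both alleles
            return False
    return True


frags = [[0, 1, None], [1, 0, 0], [0, 1, 1]]
weights = [[0.9, 0.9, 0.0], [0.8, 0.7, 0.9], [0.9, 0.8, 0.3]]
genotype = [2, 2, 0]
h1, h2 = [0, 1, 0], [1, 0, 0]
cost = wmlf_cost(frags, weights, h1, h2)
ok = compatible_with_genotype(h1, h2, genotype)
```

In this toy input only the low-confidence third entry of the last fragment has to be flipped. A search procedure such as a genetic algorithm would minimize `wmlf_cost` over candidate pairs that pass `compatible_with_genotype`.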

Solving the Haplotype Assembly Problem for Human Using the Improved Branch and Bound Algorithm (개선된 분기한정 알고리즘을 이용한 인간 유전체의 일배체형 조합문제 해결)

  • Choi, Mun-Ho; Kang, Seung-Ho; Lim, Hyeong-Seok
    • KIPS Transactions on Software and Data Engineering, v.2 no.10, pp.697-704, 2013
  • Identifying haplotypes, which encode the SNPs on a single chromosome, makes it possible to perform haplotype-based association tests with diseases. The Minimum Error Correction (MEC) model, one of the models for computationally assembling a pair of haplotypes for a given organism from Single Nucleotide Polymorphism (SNP) fragments, is known to be NP-hard even for gapless cases. In previous work, an improved branch and bound algorithm was proposed and shown, through experiments on an Apis mellifera (honeybee) data set, to be more efficient than a naive branch and bound algorithm. In this paper, to show that the algorithm extends to other organisms, we apply the improved branch and bound algorithm to a human data set and confirm its efficiency.
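
To make the MEC objective concrete, here is a small sketch of a naive branch and bound over fragment bipartitions. It is not the improved algorithm from the paper, whose details the abstract does not give. The function names are hypothetical. The bound relies on one fact: for a fixed group of fragments, the cheapest haplotype is the per-site majority, so the cost of a partial assignment can only grow as more fragments are placed.

```python
from typing import Optional, Sequence


def partition_cost(counts) -> int:
    """counts[g][j] = [zeros, ones] at site j in group g; minority entries must be flipped."""
    return sum(min(c0, c1) for group in counts for c0, c1 in group)


def mec_branch_and_bound(frags: Sequence[Sequence[Optional[int]]]) -> int:
    """Return the minimum number of corrections over all splits into two groups."""
    if not frags:
        return 0
    n_sites = len(frags[0])
    counts = [[[0, 0] for _ in range(n_sites)] for _ in range(2)]
    best = [float("inf")]

    def place(i: int) -> None:
        cost = partition_cost(counts)
        if cost >= best[0]:  # bound: the partial cost never decreases
            return
        if i == len(frags):
            best[0] = cost
            return
        # The first fragment always goes to group 0, which removes mirrored splits.
        for g in ((0,) if i == 0 else (0, 1)):
            for j, a in enumerate(frags[i]):
                if a is not None:
                    counts[g][j][a] += 1
            place(i + 1)
            for j, a in enumerate(frags[i]):
                if a is not None:
                    counts[g][j][a] -= 1

    place(0)
    return best[0]


example = [[0, 0, 1], [0, 0, 1], [1, 1, 0], [1, 1, 0], [0, 0, 0]]
mec = mec_branch_and_bound(example)
```

The search is still exponential in the number of fragments in the worst case, which is consistent with the NP-hardness noted above. It is meant only for tiny inputs.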

The Correctness Comparison of MCIH Model and WMLF/GI Model for the Individual Haplotyping Reconstruction (일배체형 재조합을 위한 MCIH 모델과 WMLF/GI 모델의 정확도 비교)

  • Jeong, In-Seon; Kang, Seung-Ho; Lim, Hyeong-Seok
    • The KIPS Transactions: Part B, v.16B no.2, pp.157-161, 2009
  • Minimum Letter Flips (MLF) and Weighted Minimum Letter Flips (WMLF) can reconstruct haplotypes more accurately from SNP fragments that contain many errors and gaps when related genotype information is introduced. It is also known that WMLF-based methods reconstruct haplotypes more accurately than those based on MLF. In this paper, we analyze two models under different rates of homozygous sites in the genotype information and different confidence levels according to the sequencing quality. We compare the performance of the two models using a neural network and a genetic algorithm. The experimental results indicate that, when the rate of homozygous sites is high and the sequencing quality is good, WMLF/GI reconstructs haplotypes more accurately than MCIH, especially when the error rate and gap rate of the SNP fragments are high.
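
The abstract does not say how accuracy is measured. One measure widely used in the haplotype assembly literature is the reconstruction rate: the fraction of alleles recovered correctly under the better of the two possible pairings of true and estimated haplotypes. The sketch below assumes that definition; the function name is hypothetical.

```python
from typing import Sequence, Tuple

Haplotype = Sequence[int]


def hamming(a: Haplotype, b: Haplotype) -> int:
    """Count the sites where two equal-length haplotypes differ."""
    return sum(x != y for x, y in zip(a, b))


def reconstruction_rate(
    true_pair: Tuple[Haplotype, Haplotype],
    est_pair: Tuple[Haplotype, Haplotype],
) -> float:
    """Assumed definition: 1 - (fewest mismatches over both pairings) / (2 * n)."""
    h1, h2 = true_pair
    e1, e2 = est_pair
    n = len(h1)
    mismatches = min(
        hamming(h1, e1) + hamming(h2, e2),
        hamming(h1, e2) + hamming(h2, e1),
    )
    return 1.0 - mismatches / (2 * n)


truth = ([0, 1, 0, 1], [1, 0, 1, 0])
estimate = ([1, 0, 1, 1], [0, 1, 0, 1])  # haplotypes in swapped order, one wrong allele
rate = reconstruction_rate(truth, estimate)
```

Taking the minimum over both pairings matters because an assembler has no way to know which reconstructed haplotype corresponds to which true one.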