http://dx.doi.org/10.6109/jkiice.2013.17.12.3009

Genotype-Calling System for Somatic Mutation Discovery in Cancer Genome Sequence  

Park, Su-Young (Department of Computer Science & Statistics, Chosun University)
Jung, Chai-Yeoung (Department of Computer Science & Statistics, Chosun University)
Abstract
Next-generation sequencing (NGS) has enabled whole-genome and transcriptome single nucleotide variant (SNV) discovery in cancer. The most fundamental step in this process is determining an individual's genotype from multiple aligned short read sequences at a given position. The Bayesian algorithm estimates parameters using posterior genotype probabilities, whereas the alternative, the EM algorithm, estimates parameters by maximum likelihood from the observed data. Here, we propose a novel genotype-calling system and compare and analyze the effect of sample size (S = 50, 100, and 500) on the posterior estimates of sequencing error rate, somatic mutation status, and genotype probability. The results show that, even with a small sample size of 50, the estimates obtained with the Bayesian algorithm approached the true parameter values more closely than those obtained with the EM algorithm.
Keywords
NGS (Next-Generation Sequencing); Bayesian algorithm; EM algorithm; genotype-calling system
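
To make the notion of posterior genotype probability in the abstract concrete, the following is a minimal sketch and not the authors' system. It assumes a single site with three genotypes (homozygous reference, heterozygous, homozygous variant), a binomial read-count model, and illustrative values for the sequencing error rate and the genotype priors; the function name `genotype_posteriors` and all default values are hypothetical.

```python
from math import comb


def genotype_posteriors(k, n, error_rate=0.01, priors=(0.98, 0.015, 0.005)):
    """Return P(genotype | data) for genotypes (RR, RV, VV).

    k: number of reads showing the variant allele (assumed input)
    n: total number of aligned reads at the position (assumed input)
    error_rate, priors: illustrative values, not taken from the paper
    """
    # Probability that one read shows the variant allele under each genotype:
    # RR reads show it only through error, RV about half the time,
    # VV always except when an error occurs.
    p_variant = (error_rate, 0.5, 1.0 - error_rate)

    # Binomial likelihood of observing k variant reads out of n.
    likelihoods = [comb(n, k) * p**k * (1.0 - p) ** (n - k) for p in p_variant]

    # Bayes' rule: posterior is proportional to prior times likelihood.
    joint = [prior * lik for prior, lik in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


if __name__ == "__main__":
    for k in (0, 10, 20):
        post = genotype_posteriors(k, 20)
        print(k, [round(p, 4) for p in post])
```

Under these assumptions, a site with roughly half of its reads carrying the variant is called heterozygous, while sites with almost none or almost all variant reads are called homozygous. An EM-based caller would instead treat the error rate and genotype frequencies as unknowns and re-estimate them iteratively from the observed reads.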