http://dx.doi.org/10.5808/GI.2015.13.2.31

Effect of Next-Generation Exome Sequencing Depth for Discovery of Diagnostic Variants  

Kim, Kyung (Department of Biomedical Informatics, Ajou University School of Medicine)
Seong, Moon-Woo (Department of Laboratory Medicine, Seoul National University Hospital College of Medicine)
Chung, Won-Hyong (Korean Bioinformation Center, Korea Research Institute of Bioscience and Biotechnology)
Park, Sung Sup (Department of Laboratory Medicine, Seoul National University Hospital College of Medicine)
Leem, Sangseob (Department of Biomedical Informatics, Ajou University School of Medicine)
Park, Won (Department of Functional Genomics, Korea University of Science and Technology)
Kim, Jihyun (Department of Biomedical Informatics, Ajou University School of Medicine)
Lee, KiYoung (Department of Biomedical Informatics, Ajou University School of Medicine)
Park, Rae Woong (Department of Biomedical Informatics, Ajou University School of Medicine)
Kim, Namshin (Department of Functional Genomics, Korea University of Science and Technology)
Abstract
Sequencing depth, which is directly related to the cost and time required to generate, process, and maintain next-generation sequencing data, is an important factor in the practical use of such data in clinical settings. However, identifying an exome sequencing depth adequate for clinical use remains a challenge that has not been addressed extensively. Here, we investigate the effect of exome sequencing depth on the discovery of sequence variants for clinical use. To this end, we sequenced ten germline blood samples from breast cancer patients on the Illumina GAII(x) platform at a high depth of ${\sim}200{\times}$. We observed that most of the diverse function-related variants in human exonic regions could be detected at a sequencing depth of $120{\times}$. Furthermore, analysis of a diagnostic gene set showed that the number of clinical variants identified by exome sequencing reached a plateau at an average sequencing depth of about $120{\times}$. These observations were consistent across the breast cancer samples.
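The depth comparison summarized in the abstract can be illustrated with a small sketch. This is not the authors' pipeline. It only shows, under stated assumptions, how one might compute the fraction of reads to keep when downsampling a ${\sim}200{\times}$ data set to a lower mean depth, and how one might locate the depth at which a variant count stops growing. The function names `downsample_fraction` and `plateau_depth`, the 1% tolerance, and the example counts are all hypothetical and are not taken from the study.

```python
def downsample_fraction(target_depth: float, full_depth: float) -> float:
    """Fraction of reads to keep so that mean depth drops from
    full_depth to target_depth (assumes depth scales linearly with reads)."""
    if full_depth <= 0:
        raise ValueError("full_depth must be positive")
    if not 0 < target_depth <= full_depth:
        raise ValueError("target_depth must be in (0, full_depth]")
    return target_depth / full_depth


def plateau_depth(counts_by_depth: dict, tolerance: float = 0.01) -> float:
    """Smallest depth whose variant count is within `tolerance`
    (relative) of the count observed at the highest depth.
    The tolerance is an assumed, arbitrary cutoff."""
    if not counts_by_depth:
        raise ValueError("counts_by_depth is empty")
    depths = sorted(counts_by_depth)
    final_count = counts_by_depth[depths[-1]]
    for depth in depths:
        if counts_by_depth[depth] >= (1 - tolerance) * final_count:
            return depth
    return depths[-1]


if __name__ == "__main__":
    # Keeping 60% of reads takes a 200x data set to roughly 120x.
    print(downsample_fraction(120, 200))

    # Made-up counts, chosen only to exercise the function;
    # they are NOT results from the study.
    example_counts = {20: 700, 40: 850, 80: 960, 120: 995, 160: 999, 200: 1000}
    print(plateau_depth(example_counts))
```

In practice the read subsampling itself would be done with an alignment tool, and the plateau criterion would need to be justified for the clinical question at hand; the relative tolerance used here is only one possible choice.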
Keywords
clinical application; diagnostic variant; exome sequencing; genetic variation; high-throughput nucleotide sequencing; sequence variant