Effect of Combining Multiple CNV Defining Algorithms on the Reliability of CNV Calls from SNP Genotyping Data

  • Kim, Soon-Young (Integrated Research Center for Genome Polymorphism, The Catholic University of Korea School of Medicine) ;
  • Kim, Ji-Hong (Integrated Research Center for Genome Polymorphism, The Catholic University of Korea School of Medicine) ;
  • Chung, Yeun-Jun (Integrated Research Center for Genome Polymorphism, The Catholic University of Korea School of Medicine)
  • Received : 2012.08.10
  • Accepted : 2012.08.23
  • Published : 2012.09.30

Abstract

In addition to single-nucleotide polymorphisms (SNPs), copy number variation (CNV) is a major component of human genetic diversity. Among the many whole-genome analysis platforms, SNP arrays have been commonly used for genomewide CNV discovery. A number of algorithms for defining CNVs from SNP genotyping data have recently been developed; however, because SNP genotyping data are fundamentally limited as a measure of signal intensity, concerns remain about false discovery and low sensitivity in CNV detection. In this study, we aimed to verify the effect of combining multiple CNV calling algorithms and to set up the most reliable pipeline for CNV calling with Affymetrix Genomewide SNP 5.0 data. For this purpose, we selected the 3 most commonly used algorithms for CNV segmentation from SNP genotyping data: PennCNV, QuantiSNP, and BirdSuite. After defining the CNV loci with each of the 3 algorithms, we assessed how many of the calls overlapped with each other, and we validated the CNVs by genomic quantitative PCR. Based on this analysis, we propose that, for a reliable CNV-based genomewide association study using SNP array data, CNV calling must be performed with at least 3 different algorithms and that the CNVs consistently called from more than 2 algorithms must be used for association analysis, because they are more reliable than the CNVs called from a single algorithm. Our results will be helpful for setting up CNV analysis protocols for Affymetrix Genomewide SNP 5.0 genotyping data.
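The idea of keeping only CNVs that several algorithms agree on can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Call` record, the 50% reciprocal-overlap rule (`min_frac`), the function names, and the toy coordinates are all assumptions made for illustration, and the abstract does not state which overlap criterion was used. The number of supporting algorithms required is left as an explicit argument.

```python
from collections import namedtuple

# Hypothetical record for one CNV call; field names are illustrative only.
Call = namedtuple("Call", "sample chrom start end kind")


def overlaps(a, b, min_frac=0.5):
    """True if a and b share sample, chromosome, and type, and their shared
    span covers at least min_frac of each call (reciprocal overlap)."""
    if (a.sample, a.chrom, a.kind) != (b.sample, b.chrom, b.kind):
        return False
    shared = min(a.end, b.end) - max(a.start, b.start)
    if shared <= 0:
        return False
    return (shared >= min_frac * (a.end - a.start)
            and shared >= min_frac * (b.end - b.start))


def supporting_algorithms(call, calls_by_algorithm, min_frac=0.5):
    """Names of the algorithms that have at least one call overlapping `call`."""
    return sorted(
        name
        for name, calls in calls_by_algorithm.items()
        if any(overlaps(call, other, min_frac) for other in calls)
    )


def consensus_calls(calls_by_algorithm, min_algorithms, min_frac=0.5):
    """Return (algorithm, call, supporters) for every call that is supported
    by at least min_algorithms algorithms, counting the caller itself."""
    kept = []
    for name, calls in calls_by_algorithm.items():
        for call in calls:
            supporters = supporting_algorithms(call, calls_by_algorithm, min_frac)
            if len(supporters) >= min_algorithms:
                kept.append((name, call, supporters))
    return kept


# Toy data (invented coordinates, not results from the study).
example_calls = {
    "PennCNV": [Call("S1", "chr1", 1000, 5000, "loss"),
                Call("S1", "chr2", 100, 900, "gain")],
    "QuantiSNP": [Call("S1", "chr1", 1200, 5200, "loss")],
    "BirdSuite": [Call("S1", "chr1", 900, 4800, "loss"),
                  Call("S1", "chr3", 10, 50, "gain")],
}

for name, call, supporters in consensus_calls(example_calls, min_algorithms=2):
    print(name, call.chrom, call.start, call.end, call.kind, supporters)
```

In this toy input, only the chr1 loss is reported by more than one algorithm, so the chr2 and chr3 calls, each seen by a single algorithm, are filtered out. In practice, overlapping calls would usually be merged into one CNV region before association testing; that step is omitted here.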
