http://dx.doi.org/10.5808/GI.2012.10.3.194

Effect of Combining Multiple CNV Defining Algorithms on the Reliability of CNV Calls from SNP Genotyping Data  

Kim, Soon-Young (Integrated Research Center for Genome Polymorphism, The Catholic University of Korea School of Medicine)
Kim, Ji-Hong (Integrated Research Center for Genome Polymorphism, The Catholic University of Korea School of Medicine)
Chung, Yeun-Jun (Integrated Research Center for Genome Polymorphism, The Catholic University of Korea School of Medicine)
Abstract
In addition to single-nucleotide polymorphisms (SNPs), copy number variation (CNV) is a major component of human genetic diversity. Among the many whole-genome analysis platforms, SNP arrays have been commonly used for genome-wide CNV discovery. A number of algorithms for defining CNVs from SNP genotyping data have recently been developed; however, because SNP genotyping data are fundamentally limited as a measure of signal intensity, concerns remain about false discovery and low sensitivity in CNV detection. In this study, we aimed to verify the effect of combining multiple CNV calling algorithms and to set up the most reliable pipeline for CNV calling with Affymetrix Genomewide SNP 5.0 data. For this purpose, we selected the 3 most commonly used algorithms for CNV segmentation from SNP genotyping data: PennCNV, QuantiSNP, and BirdSuite. After defining the CNV loci with the 3 algorithms, we assessed how many of the calls overlapped with each other, and we validated the CNVs by genomic quantitative PCR. Based on this analysis, we propose that, for a reliable CNV-based genome-wide association study using SNP array data, CNV calling must be performed with at least 3 different algorithms and that the CNVs consistently called by more than 2 algorithms must be used for association analysis, because they are more reliable than CNVs called by a single algorithm. Our results will be helpful for setting up CNV analysis protocols for Affymetrix Genomewide SNP 5.0 genotyping data.
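The idea of keeping only CNVs that several algorithms agree on can be illustrated with a minimal Python sketch. It is not the authors' pipeline. The `CNVCall` record, the function names, and the example coordinates are hypothetical, and the overlap rule used here (any shared position on the same chromosome with the same gain/loss type) is an assumption, since the abstract does not state the overlap criterion. The agreement threshold is passed explicitly as `min_algorithms` so that it can be set to match the chosen protocol.

```python
from typing import Dict, List, NamedTuple, Tuple


class CNVCall(NamedTuple):
    """One CNV call; a hypothetical record layout for illustration only."""
    chrom: str
    start: int  # inclusive start position
    end: int    # inclusive end position
    kind: str   # "gain" or "loss"


def overlaps(a: CNVCall, b: CNVCall) -> bool:
    """Assumed rule: same chromosome, same type, at least one shared position."""
    return (a.chrom == b.chrom and a.kind == b.kind
            and a.start <= b.end and b.start <= a.end)


def supporting_algorithms(call: CNVCall,
                          calls_by_algorithm: Dict[str, List[CNVCall]]) -> List[str]:
    """Names of all algorithms that have at least one call overlapping `call`."""
    return sorted(
        name for name, calls in calls_by_algorithm.items()
        if any(overlaps(call, other) for other in calls)
    )


def consensus_calls(calls_by_algorithm: Dict[str, List[CNVCall]],
                    min_algorithms: int) -> List[Tuple[str, CNVCall, List[str]]]:
    """Keep every call supported by at least `min_algorithms` algorithms.

    Returns (source algorithm, call, supporting algorithm names) tuples.
    Overlapping calls from different algorithms are reported separately;
    merging them into a single region is left out of this sketch.
    """
    kept = []
    for name, calls in calls_by_algorithm.items():
        for call in calls:
            support = supporting_algorithms(call, calls_by_algorithm)
            if len(support) >= min_algorithms:
                kept.append((name, call, support))
    return kept


# Made-up example calls for one sample, keyed by algorithm name.
example = {
    "PennCNV": [CNVCall("chr1", 100, 200, "loss"), CNVCall("chr2", 500, 900, "gain")],
    "QuantiSNP": [CNVCall("chr1", 150, 250, "loss")],
    "BirdSuite": [CNVCall("chr1", 180, 300, "loss"), CNVCall("chr3", 10, 50, "gain")],
}

for source, call, support in consensus_calls(example, min_algorithms=3):
    print(source, call, support)
```

In practice a stricter criterion, such as a minimum reciprocal overlap fraction, could replace `overlaps` without changing the rest of the sketch.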
Keywords
CNV defining algorithm; DNA copy number variations; SNP array