http://dx.doi.org/10.5808/GI.2012.10.2.81

Identifying Copy Number Variants under Selection in Geographically Structured Populations Based on F-statistics  

Song, Hae-Hiang (Division of Biostatistics, Department of Medical Lifescience, The Catholic University of Korea, College of Medicine)
Hu, Hae-Jin (Department of Microbiology, Integrated Research Center for Genome Polymorphism, The Catholic University of Korea, College of Medicine)
Seok, In-Hae (Department of Statistics, Hankuk University of Foreign Studies)
Chung, Yeun-Jun (Department of Microbiology, Integrated Research Center for Genome Polymorphism, The Catholic University of Korea, College of Medicine)
Abstract
Large-scale copy number variants (CNVs) in the human genome provide the raw material for delineating population differences, as natural selection may have affected at least some of the CNVs discovered thus far. Although studies of inter-ethnic differences in CNVs across relatively large numbers of specific ethnic groups have recently begun, particular instances of natural selection have not yet been identified or explained. The traditional $F_{ST}$ measure, obtained from differences in allele frequencies between populations, has been used to identify CNV loci subject to geographically varying selection. Here, we review advances in, and the application of, multinomial-Dirichlet likelihood methods of inference that use $F_{ST}$ estimates to identify genome regions that have been subject to natural selection. The material presented is not new; however, this review clarifies how these methods can be applied to CNV data, an application that remains largely unexplored. A hierarchical Bayesian method, implemented via Markov chain Monte Carlo, estimates locus-specific $F_{ST}$ and can identify outlying CNV loci with large values of $F_{ST}$. By applying this Bayesian method to publicly available CNV data, we identified CNV loci that show signals of natural selection, which may help elucidate the genetic basis of human disease and diversity.
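As a rough illustration of how an $F_{ST}$ value is obtained from differences in allele frequencies between populations, the sketch below computes Wright's variance-based $F_{ST}$ for a single locus. It is not the multinomial-Dirichlet or hierarchical Bayesian estimator reviewed in the article. It assumes a biallelic coding of the CNV (for example, the frequency of a deletion allele), equal population weights, and no sampling-error correction; the function name `wright_fst` and the example frequencies are hypothetical.

```python
def wright_fst(freqs):
    """Variance-based F_ST for one biallelic locus.

    freqs: allele frequency of the same allele in each population.
    Assumes equal population weights and ignores sampling error.
    """
    n = len(freqs)
    p_bar = sum(freqs) / n                      # mean allele frequency
    if p_bar <= 0.0 or p_bar >= 1.0:
        return 0.0                              # locus is monomorphic overall
    var_p = sum((p - p_bar) ** 2 for p in freqs) / n  # between-population variance
    return var_p / (p_bar * (1.0 - p_bar))      # F_ST = Var(p) / (p_bar * (1 - p_bar))


# Hypothetical deletion-allele frequencies in three populations.
example_freqs = [0.1, 0.5, 0.9]
print(round(wright_fst(example_freqs), 4))
```

Under this definition a locus with identical frequencies in every population gives a value near 0, and a locus fixed for different alleles in different populations gives 1. An outlier scan of the kind described above looks for loci whose values are large relative to the rest of the genome.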
Keywords
Bayes theorem; DNA copy number variations; population structure; selection; Wright's $F_{ST}$