Sample Size and Statistical Power Calculation in Genetic Association Studies

  • Hong, Eun-Pyo (Department of Medical Genetics, Hallym University College of Medicine)
  • Park, Ji-Wan (Department of Medical Genetics, Hallym University College of Medicine)
  • Received : 2012.04.13
  • Accepted : 2012.05.17
  • Published : 2012.06.30

Abstract

An adequately powered sample size is critical to the success of genetic association studies that aim to detect causal genes of complex human diseases. Genome-wide association studies require much larger sample sizes to achieve adequate statistical power. We estimated statistical power as the number of markers analyzed increased and compared the sample sizes required in case-control and case-parent studies. We computed the effective sample size and statistical power using the Genetic Power Calculator. Analyzing a larger number of markers requires a larger sample size. Testing a single-nucleotide polymorphism (SNP) marker requires 248 cases, whereas testing 500,000 SNPs and 1 million markers requires 1,206 and 1,255 cases, respectively, under the assumptions of an odds ratio of 2, 5% disease prevalence, 5% minor allele frequency, complete linkage disequilibrium (LD), a 1:1 case/control ratio, and a 5% error rate in an allelic test. Under a dominant model, a smaller sample size is required to achieve 80% power than under other genetic models. We found that a much smaller sample size was required with a larger effect size, a more common SNP, and stronger LD. In addition, studying a common disease in a case-control design with a 1:4 case-to-control ratio is one way to achieve higher statistical power. We also found that case-parent studies require more samples than case-control studies. Although we have not covered all plausible scenarios in study design, the estimates of sample size and statistical power computed under various assumptions in this study may be useful for determining the sample size when designing a population-based genetic association study.
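
To illustrate why the required number of cases grows with the number of markers tested, the following is a minimal sketch of a textbook normal-approximation sample size formula for comparing allele frequencies between cases and controls. It is not the Genetic Power Calculator's method, so its output should not be expected to reproduce the figures in the abstract. Several simplifications are assumptions made here rather than details taken from the study: the control allele frequency is set equal to the minor allele frequency (which ignores the 5% prevalence), the odds ratio is treated as a per-allele effect, the 5% error rate is read as a family-wise type I error rate with a Bonferroni correction across markers, LD is complete, and power is fixed at 80% with a 1:1 case/control ratio. The function names `case_allele_freq` and `cases_required` are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist


def case_allele_freq(control_freq: float, odds_ratio: float) -> float:
    """Risk-allele frequency in cases implied by a per-allele odds ratio."""
    odds = odds_ratio * control_freq / (1.0 - control_freq)
    return odds / (1.0 + odds)


def cases_required(n_markers: int, maf: float = 0.05, odds_ratio: float = 2.0,
                   alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate cases needed (1:1 design) for a two-sided allelic test.

    Assumptions (not from the source): control allele frequency equals the
    MAF, and alpha is Bonferroni-corrected across n_markers tests.
    """
    std_normal = NormalDist()
    alpha_per_test = alpha / n_markers
    # Use the lower tail and negate, which stays accurate for tiny alpha.
    z_alpha = -std_normal.inv_cdf(alpha_per_test / 2.0)
    z_power = std_normal.inv_cdf(power)

    p0 = maf
    p1 = case_allele_freq(p0, odds_ratio)
    p_bar = (p0 + p1) / 2.0

    # Standard two-proportion formula, counted in alleles per group.
    numerator = (z_alpha * sqrt(2.0 * p_bar * (1.0 - p_bar))
                 + z_power * sqrt(p0 * (1.0 - p0) + p1 * (1.0 - p1))) ** 2
    alleles_per_group = numerator / (p1 - p0) ** 2

    # Each individual contributes two alleles.
    return ceil(alleles_per_group / 2.0)


if __name__ == "__main__":
    for n_markers in (1, 500_000, 1_000_000):
        print(n_markers, cases_required(n_markers))
```

Under these assumptions the stricter per-test significance threshold is the only quantity that changes with the number of markers, which is why the required sample size rises steeply from one marker to 500,000 but only slightly from 500,000 to 1 million.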
