Prediction of Quantitative Traits Using Common Genetic Variants: Application to Body Mass Index

  • Bae, Sunghwan (Interdisciplinary Program in Bioinformatics, Seoul National University)
  • Choi, Sungkyoung (Interdisciplinary Program in Bioinformatics, Seoul National University)
  • Kim, Sung Min (Bioinformatics and Biostatistics Lab, Seoul National University)
  • Park, Taesung (Interdisciplinary Program in Bioinformatics, Seoul National University)
  • Received: 2016.11.21
  • Reviewed: 2016.12.06
  • Published: 2016.12.31

Abstract

With the success of genome-wide association studies (GWASs), many candidate loci for complex human diseases have been reported in the GWAS catalog. Recently, many disease prediction models based on penalized regression or statistical learning methods have been proposed, using candidate causal variants drawn from significant single-nucleotide polymorphisms of GWASs. However, only a few systematic studies have compared the existing methods. In this study, we first constructed risk prediction models using stepwise linear regression (SLR), the least absolute shrinkage and selection operator (LASSO), and Elastic-Net (EN), with variants taken from a GWAS chip and the GWAS catalog. We then compared prediction accuracy by calculating the mean square error (MSE) for body mass index on data from the Korea Association Resource (KARE). Our results show that SLR yields a smaller MSE than the other methods, while the number of selected variables was similar across models.
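
The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: it runs on simulated genotypes rather than KARE data, uses a fixed penalty instead of a tuned one, omits SLR, and implements the penalized fits with a plain coordinate-descent loop. All names (`simulate`, `elastic_net`, `mse`), the effect sizes, and the penalty value `lam=0.05` are assumptions made for illustration only.

```python
import random


def soft_threshold(z, gamma):
    """Soft-thresholding operator used in the LASSO / Elastic-Net update."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net(X, y, lam, alpha, n_iter=100):
    """Coordinate descent for
    (1/2n)*||y - b0 - X b||^2 + lam*(alpha*|b|_1 + (1 - alpha)/2*|b|_2^2).
    alpha = 1.0 gives the LASSO; 0 < alpha < 1 gives an Elastic-Net fit."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Xc = [[row[j] - means[j] for j in range(p)] for row in X]  # centered columns
    y_bar = sum(y) / n
    resid = [yi - y_bar for yi in y]
    col_var = [sum(row[j] ** 2 for row in Xc) / n for j in range(p)]
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            if col_var[j] == 0.0:
                continue  # monomorphic variant carries no information
            # Covariance of variant j with the partial residual (variant j excluded)
            rho = sum(row[j] * r for row, r in zip(Xc, resid)) / n + col_var[j] * beta[j]
            new_b = soft_threshold(rho, lam * alpha) / (col_var[j] + lam * (1.0 - alpha))
            delta = new_b - beta[j]
            if delta != 0.0:
                resid = [r - row[j] * delta for row, r in zip(Xc, resid)]
                beta[j] = new_b
    intercept = y_bar - sum(b * m for b, m in zip(beta, means))
    return intercept, beta


def mse(X, y, intercept, beta):
    """Mean square error of the linear predictor on (X, y)."""
    preds = [intercept + sum(b * x for b, x in zip(beta, row)) for row in X]
    return sum((yi - pi) ** 2 for yi, pi in zip(y, preds)) / len(y)


def simulate(n, p, n_causal, seed=0):
    """Simulated 0/1/2 genotypes and a BMI-like trait (assumed values, not KARE)."""
    rng = random.Random(seed)
    mafs = [rng.uniform(0.2, 0.5) for _ in range(p)]
    effects = [1.0 if j < n_causal else 0.0 for j in range(p)]
    X, y = [], []
    for _ in range(n):
        row = [float((rng.random() < m) + (rng.random() < m)) for m in mafs]
        X.append(row)
        y.append(23.0 + sum(e * g for e, g in zip(effects, row)) + rng.gauss(0.0, 1.0))
    return X, y


X, y = simulate(n=300, p=20, n_causal=3)
X_train, y_train = X[:200], y[:200]
X_test, y_test = X[200:], y[200:]

results = {}
for name, alpha in [("LASSO", 1.0), ("EN", 0.5)]:
    b0, beta = elastic_net(X_train, y_train, lam=0.05, alpha=alpha)
    results[name] = {
        "mse": mse(X_test, y_test, b0, beta),
        "n_selected": sum(b != 0.0 for b in beta),
        "beta": beta,
    }
    print(name, "test MSE:", round(results[name]["mse"], 3),
          "selected variants:", results[name]["n_selected"])
```

In practice the penalty would be chosen by cross-validation rather than fixed, and the held-out MSE and number of selected variants would be reported for each method side by side, as the abstract does for SLR, LASSO, and EN.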
