Adjusting sampling bias in case-control genetic association studies

Seo, Geum Chu; Park, Taesung

doi:10.7465/jkdi.2014.25.5.1127

Journal of the Korean Data and Information Science Society

Volume 25, Issue 5, Pages 1127-1135, 2014, pISSN 1598-9402

The Korean Data and Information Science Society (한국데이터정보과학회)


Seo, Geum Chu (Department of Statistics, Seoul National University) ;
Park, Taesung (Department of Statistics and Interdisciplinary Program in Bioinformatics, Seoul National University)

Received : 2014.06.30
Accepted : 2014.08.04
Published : 2014.09.30


Abstract

Genome-wide association studies (GWAS) are designed to discover genetic variants, such as single nucleotide polymorphisms (SNPs), that are associated with complex human traits. Although there is increasing interest in applying GWAS methodologies to population-based cohorts, many published GWAS have adopted a case-control design, which raises the issue of sampling bias in both case and control samples. Because cases and controls are selected with unequal probabilities, the samples are not representative of the population they are purported to represent. Non-random sampling in a case-control study can therefore lead to inconsistent and biased estimates of SNP-trait associations. In this paper, we propose inverse-probability-of-sampling weights based on disease prevalence to eliminate case-control sampling bias in estimating and testing associations between SNPs and quantitative traits. We apply the proposed method to data from the Korea Association Resource project and show that the standard estimators applied to the weighted data yield unbiased estimates.
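
The weighting idea in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the authors' exact weight formula, so the form below is an assumption: each subject is weighted by the ratio of the population share of its disease group to that group's share of the sample, a common construction for inverse-probability-of-sampling weights. The function names `ip_sampling_weights` and `weighted_least_squares`, and the use of a simple linear model for the quantitative trait, are hypothetical choices made for illustration.

```python
import numpy as np


def ip_sampling_weights(is_case, prevalence):
    """Weights proportional to the inverse sampling probability.

    Assumed form (not taken from the paper):
      cases:    prevalence / (sample fraction of cases)
      controls: (1 - prevalence) / (sample fraction of controls)
    """
    is_case = np.asarray(is_case, dtype=bool)
    case_frac = is_case.mean()
    if not 0.0 < case_frac < 1.0:
        raise ValueError("sample must contain both cases and controls")
    w_case = prevalence / case_frac
    w_control = (1.0 - prevalence) / (1.0 - case_frac)
    return np.where(is_case, w_case, w_control)


def weighted_least_squares(genotype, trait, weights):
    """Fit trait = b0 + b1 * genotype by weighted least squares.

    Scaling each row by sqrt(weight) turns the weighted problem into an
    ordinary least-squares problem. Returns [intercept, slope].
    """
    genotype = np.asarray(genotype, dtype=float)
    trait = np.asarray(trait, dtype=float)
    sw = np.sqrt(np.asarray(weights, dtype=float))
    X = np.column_stack([np.ones_like(genotype), genotype])
    beta, *_ = np.linalg.lstsq(X * sw[:, None], trait * sw, rcond=None)
    return beta


# Toy input: a balanced case-control sample, with an assumed
# population prevalence of 10%.
is_case = np.array([1, 1, 0, 0])
weights = ip_sampling_weights(is_case, prevalence=0.10)
```

With these weights, the weighted proportion of cases in the sample equals the assumed prevalence, so standard estimators applied to the weighted data target the population rather than the case-enriched sample. Standard errors from a weighted fit generally need a robust (sandwich) estimator, which this sketch omits.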


