
Statistical Issues in Genomic Cohort Studies  

Park, So-Hee (Cancer Biostatistics Branch, Division of Cancer Registration and Epidemiology, National Cancer Center)
Publication Information
Journal of Preventive Medicine and Public Health / v.40, no.2, 2007, pp. 108-113
Abstract
Large-scale cohort studies raise numerous statistical issues across study design, data collection, data analysis, and interpretation. In genomic cohort studies these problems become more complicated and need to be dealt with carefully. Rapid technical advances in genomic studies produce enormous amounts of data, and traditional statistical methods are no longer sufficient to handle them. In this paper, we review several important statistical issues that occur frequently in large-scale genomic cohort studies, including measurement error and its correction methods, cost-efficient design strategies for main cohort and validation studies, inflated Type I error, gene-gene and gene-environment interactions, and time-varying hazard ratios. Appropriate statistical methods are essential for making the best use of valuable cohort data and for producing valid and reliable study results.
Keywords
Cohort studies; Epidemiologic methods; Validation studies; Research design; Measurement error; Gene-environment interaction; Proportional hazards models; Genomics