http://dx.doi.org/10.29220/CSAM.2018.25.5.545

A sample size calibration approach for the p-value problem in huge samples  

Park, Yousung (Department of Statistics, Korea University)
Jeon, Saebom (Department of Marketing Information Consulting, Mokwon University)
Kwon, Tae Yeon (Department of International Finance, Hankuk University of Foreign Studies)
Publication Information
Communications for Statistical Applications and Methods / v.25, no.5, 2018, pp. 545-557
Abstract
The inclusion of covariates in a model often affects not only the estimates of meaningful variables of interest but also their statistical significance. Such a gap between statistical and subject-matter significance is a critical issue in huge sample studies. A popular huge sample study, the sample cohort data from the Korean National Health Insurance Service, showed such a gap of significance in the inference for the effect of obesity on cause of mortality, requiring careful consideration. In this regard, this paper proposes a sample size calibration method based on a Monte Carlo t (or z)-test approach that does not require Monte Carlo simulation, and also proposes a test procedure for subject-matter significance using this calibration method in order to complement the deflated p-value in huge samples. Our calibration method shows no subject-matter significance of the obesity paradox regardless of race, sex, and age group, unlike traditional statistical suggestions based on p-values.
Keywords
huge sample; p-value problem; subject-matter significance; Monte Carlo; sample size calibration
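
The p-value problem named in the abstract can be illustrated with a minimal sketch. It is not the paper's calibration method, only a demonstration of why p-values deflate in huge samples: for a fixed, very small effect, the z statistic grows with the square root of the sample size. The function name `two_sided_z_pvalue` and the numbers (an effect of 0.01 standard deviations) are hypothetical and chosen purely for illustration.

```python
import math


def two_sided_z_pvalue(effect, sd, n):
    """Two-sided p-value of a one-sample z-test for H0: mean = 0.

    `effect` is the observed mean, `sd` the known standard deviation,
    and `n` the sample size (all hypothetical inputs for illustration).
    """
    z = effect / (sd / math.sqrt(n))
    # For a standard normal Z, P(|Z| > |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Hold a tiny effect fixed and let only the sample size grow.
effect, sd = 0.01, 1.0
for n in (100, 10_000, 1_000_000):
    print(f"n = {n:>9,d}  p-value = {two_sided_z_pvalue(effect, sd, n):.3g}")
```

With these assumed inputs the same effect is far from significant at n = 100 (p is about 0.92) yet overwhelmingly significant at n = 1,000,000, even though its subject-matter size has not changed.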