http://dx.doi.org/10.5351/KJAS.2014.27.4.645

Reproducibility of Hypothesis Testing and Confidence Interval  

Huh, Myung-Hoe (Department of Statistics, Korea University)
Publication Information
The Korean Journal of Applied Statistics / v.27, no.4, 2014, pp. 645-653
Abstract
The p-value is the probability, computed under the null hypothesis, of observing the current sample or other samples that depart from the null hypothesis equally or more extremely toward the postulated alternative hypothesis. When the p-value is less than a preset level $\alpha$ (= 0.05), researchers claim that the alternative hypothesis is supported empirically. Unfortunately, some findings discovered in this way are not reproducible, partly because the p-value is itself a statistic subject to random variation. Boos and Stefanski (2011) suggest calculating an upper limit of the p-value in hypothesis testing, using a bootstrap predictive distribution. To determine the sample size of a replication study, this study proposes thought experiments that simulate boosted bootstrap samples of different sizes from the given observations. The method is illustrated for two-group comparison and multiple linear regression. This study also addresses the reproducibility of the points in a given 95% confidence interval. Numerical examples show that the center point is covered by 95% confidence intervals generated from bootstrap resamples. However, the end points are covered with a 50% chance. Hence this study draws the graph of the reproducibility rate for each parameter value in the confidence interval.
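The thought experiment described above, resampling the observed data at several candidate replication sizes and checking how often the test stays significant, can be sketched roughly as follows. This is a minimal illustration and not the paper's procedure: it uses an ordinary bootstrap with a large-sample normal (z) approximation for the two-group test, and the function names, the data, and the sizes `m` are all hypothetical.

```python
import random
from statistics import NormalDist, mean, variance


def two_sample_p(x, y):
    """Two-sided p-value for a difference in means.

    Assumption: large-sample normal (z) approximation with unequal
    variances; the paper may use a different test statistic.
    """
    se = (variance(x) / len(x) + variance(y) / len(y)) ** 0.5
    if se == 0.0:
        # Degenerate resample: both groups constant.
        return 1.0 if mean(x) == mean(y) else 0.0
    z = (mean(x) - mean(y)) / se
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def bootstrap_p_values(x, y, m, n_boot=2000, seed=0):
    """p-values from n_boot bootstrap resamples of size m per group."""
    rng = random.Random(seed)
    return [
        two_sample_p(rng.choices(x, k=m), rng.choices(y, k=m))
        for _ in range(n_boot)
    ]


def reproducibility_rate(x, y, m, alpha=0.05, n_boot=2000, seed=0):
    """Fraction of bootstrap replications with p-value below alpha."""
    ps = bootstrap_p_values(x, y, m, n_boot, seed)
    return sum(p < alpha for p in ps) / len(ps)


# Hypothetical observations, generated only for illustration.
data_rng = random.Random(1)
group_a = [data_rng.gauss(0.0, 1.0) for _ in range(20)]
group_b = [data_rng.gauss(0.8, 1.0) for _ in range(20)]

# Compare candidate sample sizes for a replication study.
for m in (20, 40, 80):
    rate = reproducibility_rate(group_a, group_b, m)
    print(f"m = {m:3d}  estimated reproducibility rate = {rate:.3f}")
```

An upper quantile of the list returned by `bootstrap_p_values` could serve as a rough stand-in for an upper limit of the p-value, although the exact construction in Boos and Stefanski (2011) is not reproduced here.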
Keywords
Reproducibility; hypothesis testing; p-value; bootstrap method; confidence interval
References
1 Boos, D. D. and Stefanski, L. A. (2011). P-value precision and reproducibility, The American Statistician, 65, 213-221.
2 Efron, B. (1987). Better bootstrap confidence intervals, Journal of the American Statistical Association, 82, 171-185.
3 Goodman, S. N. (1992). A comment on replication, p-values and evidence, Statistics in Medicine, 11, 875-879.
4 Shao, J. and Chow, S.-C. (2002). Reproducibility probability in clinical trials, Statistics in Medicine, 21, 1727-1742.
5 Hoenig, J. M. and Heisey, D. M. (2001). The abuse of power: The pervasive fallacy of power calculations for data analysis, The American Statistician, 55, 19-24.