A tutorial on generalizing the default Bayesian t-test via posterior sampling and encompassing priors

  • Faulkenberry, Thomas J. (Department of Psychological Sciences, Tarleton State University)
  • Received : 2018.12.10
  • Accepted : 2019.02.06
  • Published : 2019.03.31

Abstract

With the advent of so-called "default" Bayesian hypothesis tests, scientists in applied fields have gained access to a powerful and principled method for testing hypotheses. However, such default tests usually come with a compromise, requiring the analyst to accept a one-size-fits-all approach to hypothesis testing. Further, such tests may not have the flexibility to test problems the scientist really cares about. In this tutorial, I demonstrate a flexible approach to generalizing one specific default test, the JZS t-test (Rouder et al., Psychonomic Bulletin & Review, 16, 225-237, 2009), that is becoming increasingly popular in the social and behavioral sciences. The approach uses two results, the Savage-Dickey density ratio (Dickey and Lientz, 1970) and the technique of encompassing priors (Klugkist et al., Statistica Neerlandica, 59, 57-69, 2005), in combination with MCMC sampling via an easy-to-use probabilistic modeling package for R called greta. Through a comprehensive mathematical description of the techniques as well as illustrative examples, the reader is presented with a general, flexible workflow that can be extended to solve problems relevant to his or her own work.

Keywords

References

  1. Abadi M, Agarwal A, Barham P, et al. (2015). TensorFlow: Large-Scale Machine Learning on Heterogeneous Systems, Software available from tensorflow.org.
  2. Carpenter B, Gelman A, Hoffman MD, et al. (2017). Stan: a probabilistic programming language, Journal of Statistical Software, 76.
  3. Dickey JM and Lientz BP (1970). The weighted likelihood ratio, sharp hypotheses about chances, the order of a Markov chain, The Annals of Mathematical Statistics, 41, 214-226. https://doi.org/10.1214/aoms/1177697203
  4. Faulkenberry TJ (2018). Computing Bayes factors to measure evidence from experiments: an extension of the BIC approximation, Biometrical Letters, 55, 31-43. https://doi.org/10.2478/bile-2018-0003
  5. Gabry J and Mahr T (2018). Bayesplot: Plotting for Bayesian Models, R package version 1.6.0. Available from: https://CRAN.R-project.org/package=bayesplot
  6. Gelfand AE and Smith AFM (1990). Sampling-based approaches to calculating marginal densities, Journal of the American Statistical Association, 85, 398-409. https://doi.org/10.1080/01621459.1990.10476213
  7. Gigerenzer G (2004). Mindless statistics, The Journal of Socio-Economics, 33, 587-606. https://doi.org/10.1016/j.socec.2004.09.033
  8. Gilks WR, Thomas A, and Spiegelhalter DJ (1994). A language and program for complex Bayesian modelling, The Statistician, 43, 169-177. https://doi.org/10.2307/2348941
  9. Golding N (2018). greta: Simple and Scalable Statistical Modelling in R, R package version 0.3.0.9001. Available from: https://github.com/greta-dev/greta
  10. Hoekstra R, Morey RD, Rouder JN, and Wagenmakers EJ (2014). Robust misinterpretation of confidence intervals, Psychonomic Bulletin & Review, 21, 1157-1164. https://doi.org/10.3758/s13423-013-0572-3
  11. Hoel PG (1984). Introduction to Mathematical Statistics (5th ed), John Wiley & Sons, New York.
  12. JASP Team (2018). JASP (Version 0.9)[Computer software]. Available from: https://jasp-stats.org/
  13. Jeffreys H (1961). The Theory of Probability (3rd ed), Oxford University Press, Oxford, UK.
  14. Kass RE and Raftery AE (1995). Bayes factors, Journal of the American Statistical Association, 90, 773-795. https://doi.org/10.1080/01621459.1995.10476572
  15. Killeen PR (2007). Replication statistics as a replacement for significance testing: best practices in scientific decision-making, Best Practices in Quantitative Methods, (Osborne JW ed), SAGE Publications, Inc., Thousand Oaks, CA.
  16. Klugkist I, Kato B, and Hoijtink H (2005). Bayesian model selection using encompassing priors, Statistica Neerlandica, 59, 57-69. https://doi.org/10.1111/j.1467-9574.2005.00279.x
  17. Kooperberg C (2018). polspline: Polynomial Spline Routines, R package version 1.1.13. Available from: https://CRAN.R-project.org/package=polspline
  18. Kooperberg C and Stone CJ (1992). Logspline density estimation for censored data, Journal of Computational and Graphical Statistics, 1, 301-328. https://doi.org/10.2307/1390786
  19. Lunn DJ, Thomas A, Best N, and Spiegelhalter D (2000). WinBUGS-a Bayesian modelling framework: concepts, structure, and extensibility, Statistics and Computing, 10, 325-337. https://doi.org/10.1023/A:1008929526011
  20. Masson MEJ (2011). A tutorial on a practical Bayesian alternative to null-hypothesis significance testing, Behavior Research Methods, 43, 679-690. https://doi.org/10.3758/s13428-010-0049-5
  21. Morey RD and Rouder JN (2011). Bayes factor approaches for testing interval null hypotheses, Psychological Methods, 16, 406-419. https://doi.org/10.1037/a0024377
  22. Morey RD and Rouder JN (2018). BayesFactor: Computation of Bayes Factors for Common Designs, R package version 0.9.12-4.2. Available from: https://CRAN.R-project.org/package=BayesFactor
  23. Neal R (2011). MCMC Using Hamiltonian Dynamics, (Brooks S, Gelman A, Jones G, and Meng X eds), Handbook of Markov Chain Monte Carlo, Chapman and Hall/CRC, 116-162.
  24. Oakes M (1986). Statistical Inference: A commentary for the Social and Behavioural Sciences, John Wiley & Sons, Chichester.
  25. Plummer M (2003). JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling.
  26. Raftery AE (1995). Bayesian model selection in social research, Sociological Methodology, 25, 111-163. https://doi.org/10.2307/271063
  27. R Core Team (2018). R: A Language and Environment for Statistical Computing, R Foundation for Statistical Computing. Available from: https://www.R-project.org/
  28. Richard FD, Bond CF, and Stokes-Zoota JJ (2003). One hundred years of social psychology quantitatively described, Review of General Psychology, 7, 331-363. https://doi.org/10.1037/1089-2680.7.4.331
  29. Rouder JN, Speckman PL, Sun D, Morey RD, and Iverson G (2009). Bayesian t tests for accepting and rejecting the null hypothesis, Psychonomic Bulletin & Review, 16, 225-237. https://doi.org/10.3758/PBR.16.2.225
  30. Stone CJ, Hansen MH, Kooperberg C, and Truong YK (1997). Polynomial splines and their tensor products in extended linear modeling: 1994 Wald memorial lecture, The Annals of Statistics, 25, 1371-1470. https://doi.org/10.1214/aos/1031594728
  31. Wagenmakers EJ, Wetzels R, Borsboom D, and van der Maas HLJ (2011). Why psychologists must change the way they analyze their data: the case of psi: Comment on Bem (2011), Journal of Personality and Social Psychology, 100, 426-432. https://doi.org/10.1037/a0022790
  32. Wagenmakers EJ, Lodewyckx T, Kuriyal H, and Grasman R (2010). Bayesian hypothesis testing for psychologists: a tutorial on the Savage-Dickey method, Cognitive Psychology, 60, 158-189. https://doi.org/10.1016/j.cogpsych.2009.12.001
  33. Wang M (2017). Mixtures of g-priors for analysis of variance models with a diverging number of parameters, Bayesian Analysis, 12, 511-532. https://doi.org/10.1214/16-BA1011
  34. Wetzels R, Grasman RPPP, and Wagenmakers EJ (2010). An encompassing prior generalization of the Savage-Dickey density ratio, Computational Statistics & Data Analysis, 54, 2094-2102. https://doi.org/10.1016/j.csda.2010.03.016
  35. Wetzels R, Raaijmakers JGW, Jakab E, and Wagenmakers EJ (2009). How to quantify support for and against the null hypothesis: a flexible WinBUGS implementation of a default Bayesian t test, Psychonomic Bulletin & Review, 16, 752-760. https://doi.org/10.3758/PBR.16.4.752
  36. Zellner A and Siow A (1980). Posterior odds ratios for selected regression hypotheses, Trabajos de Estadistica Y de Investigacion Operativa, 31, 585-603. https://doi.org/10.1007/BF02888369