http://dx.doi.org/10.5351/KJAS.2020.33.1.025

Bayesian ordinal probit semiparametric regression models: KNHANES 2016 data analysis of the relationship between smoking behavior and coffee intake  

Lee, Dasom (Department of Statistics, North Carolina State University)
Lee, Eunji (Department of Statistics, Korea University)
Jo, Seogil (Department of Statistics (Institute of Applied Statistics), Jeonbuk National University)
Choi, Taeryeon (Department of Statistics, Korea University)
Publication Information
The Korean Journal of Applied Statistics / v.33, no.1, 2020, pp. 25-46
Abstract
This paper presents ordinal probit semiparametric regression models based on the Bayesian spectral analysis regression (BSAR) method. Ordinal probit regression models an ordinal response, usually one with more than two categories, by relating the probability of falling into each category to a combination of available covariates through a probit link (the inverse of the standard normal cumulative distribution function). The Bayesian probit model facilitates posterior sampling by introducing a normally distributed latent variable; the response category is then determined by where the latent variable falls relative to a set of cut-off points. In this paper, we extend the latent variable approach to a semiparametric model for Bayesian ordinal probit regression, in which the nonparametric functions are modeled with BSAR, a method based on a spectral representation of Gaussian processes. The latent variable is decomposed into a parametric component and a nonparametric component, with or without a shape constraint, so that ordinal responses can be modeled and outcomes predicted more flexibly. We illustrate the proposed methods with simulation studies that compare them with existing methods, and with an analysis of data from the 2016 Korea National Health and Nutrition Examination Survey (KNHANES) that investigates the nonparametric relationship between smoking behavior and coffee intake.
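The latent variable construction described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the coefficient values, the cut-off points, and the two-term cosine series standing in for the nonparametric component are all hypothetical, and no posterior sampling is performed. It shows only how a latent mean made of a parametric part and a smooth nonparametric part, combined with cut-off points, yields category probabilities and simulated ordinal responses.

```python
import math
import random


def normal_cdf(x):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def latent_mean(w, x, beta=0.5, theta=(0.6, -0.3)):
    """Latent mean = parametric part beta*w + nonparametric part f(x), x in [0, 1].

    f is a two-term cosine series with made-up coefficients (an
    illustrative stand-in, not the specification used in the paper).
    """
    f_x = sum(t * math.sqrt(2.0) * math.cos(math.pi * (k + 1) * x)
              for k, t in enumerate(theta))
    return beta * w + f_x


def category_probs(eta, cutpoints):
    """P(y = j) = Phi(c_{j+1} - eta) - Phi(c_j - eta), with outer cut-offs at -inf and +inf."""
    bounds = [-math.inf, *cutpoints, math.inf]
    return [normal_cdf(hi - eta) - normal_cdf(lo - eta)
            for lo, hi in zip(bounds, bounds[1:])]


def draw_response(eta, cutpoints, rng):
    """Draw the latent variable z ~ N(eta, 1) and return its 0-based category."""
    z = eta + rng.gauss(0.0, 1.0)
    # The category is the number of cut-off points lying below z.
    return sum(z > c for c in cutpoints)


if __name__ == "__main__":
    rng = random.Random(0)
    cutpoints = [-1.0, 1.0]           # hypothetical cut-offs, three categories
    eta = latent_mean(w=1.0, x=0.25)  # hypothetical covariate values
    print(category_probs(eta, cutpoints))
    print([draw_response(eta, cutpoints, rng) for _ in range(10)])
```

In a full Bayesian treatment the latent variables, regression coefficients, cut-off points, and series coefficients would all be sampled by Markov chain Monte Carlo; the sketch fixes them only to make the link between the latent scale and the observed categories concrete.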
Keywords
BSAR; Gaussian process; KNHANES data; Markov chain Monte Carlo; Ordinal probit; Semiparametric regression