Bayesian inference of the cumulative logistic principal component regression models

  • Kyung, Minjung (Department of Statistics, Duksung Women's University)
  • Received : 2021.09.04
  • Accepted : 2022.02.21
  • Published : 2022.03.31

Abstract

We propose a Bayesian approach to the cumulative logistic regression model for an ordinal response. To handle multicollinearity among predictors, the model is built on orthogonal principal components obtained via singular value decomposition. The advantage of the proposed method is that it performs dimension reduction and parameter estimation simultaneously. To evaluate its performance, we conduct a simulation study with a high-dimensional and highly correlated explanatory matrix. We also fit the proposed model to real data on sprout- and scab-damaged wheat kernels and compare it with an EM-based proportional-odds logistic regression model. We argue that, compared with EM-based methods, the proposed model works better for highly correlated, high-dimensional data, providing parameter estimates as well as good predictions.
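
The two deterministic building blocks named in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation. The abstract does not give the priors, the sampler, or the sign convention, so the parameterization logit P(Y <= j | z) = alpha_j - z'gamma, the function names `pc_scores` and `cumulative_logit_probs`, and all numeric values are hypothetical. The Bayesian posterior sampling step is not shown.

```python
import numpy as np

def pc_scores(X, k):
    """Return the first k principal component scores of X via SVD."""
    Xc = X - X.mean(axis=0)                      # center each predictor column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * s[:k]                      # scores Z = U_k D_k = Xc V_k

def cumulative_logit_probs(Z, cutpoints, gamma):
    """Category probabilities assuming logit P(Y <= j | z) = alpha_j - z'gamma."""
    eta = Z @ gamma                              # linear predictor, shape (n,)
    # Cumulative probabilities P(Y <= j) for j = 1, ..., J-1
    cum = 1.0 / (1.0 + np.exp(-(cutpoints[None, :] - eta[:, None])))
    n = len(eta)
    # Pad with P(Y <= 0) = 0 and P(Y <= J) = 1, then difference
    cum = np.hstack([np.zeros((n, 1)), cum, np.ones((n, 1))])
    return np.diff(cum, axis=1)                  # shape (n, J)

# Hypothetical highly correlated predictors: 20 columns driven by 2 latent factors
rng = np.random.default_rng(0)
n, p, k = 50, 20, 3
latent = rng.normal(size=(n, 2))
X = latent @ rng.normal(size=(2, p)) + 0.05 * rng.normal(size=(n, p))

Z = pc_scores(X, k)
cutpoints = np.array([-1.0, 0.0, 1.5])           # hypothetical increasing alpha_j
gamma = np.array([0.8, -0.5, 0.2])               # hypothetical PC coefficients
probs = cumulative_logit_probs(Z, cutpoints, gamma)
```

Because the score columns in `Z` are mutually orthogonal, the regression on `Z` avoids the multicollinearity present in `X`, which is the motivation the abstract gives for working with principal components.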

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government(MSIT) (No. 2021R1F1A1049834).
