A computational note on maximum likelihood estimation in random effects panel probit model

  • Received : 2019.03.09
  • Accepted : 2019.04.12
  • Published : 2019.05.31

Abstract

Panel data sets have recently been developed in various areas, and many recent studies have analyzed panel, or longitudinal, data sets. A dichotomous dependent variable often occurs in survival analysis and in biomedical and epidemiological studies, where it is analyzed with a generalized linear mixed effects model (GLMM). The most common estimation method for binary panel data may be maximum likelihood (ML). Many statistical packages provide ML estimates; however, the estimates are computed from a numerically approximated likelihood function. For instance, the R package pglm (Croissant, 2017) approximates the likelihood function by Gauss-Hermite quadrature, while Rchoice (Sarrias, 2016) uses Monte Carlo integration for the approximation. As a result, different packages can give different results because they rely on different numerical methods. In this note, we discuss the pros and cons of numerical methods compared with the exact computation method.
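To illustrate why the two approximations need not agree, the following is a minimal sketch, not the pglm or Rchoice implementation. It assumes the standard random effects probit likelihood contribution of one individual, the integral over u of prod_t Phi((2*y_t - 1)*(x_t'beta + sigma*u)) against the standard normal density, and approximates it once by Gauss-Hermite quadrature and once by plain Monte Carlo. The function names, the toy data, the number of nodes, and the number of draws are all hypothetical choices made for illustration.

```python
import math
import numpy as np

def norm_cdf(x):
    """Standard normal CDF computed from the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def cond_lik(y, xb, sigma, u):
    """Likelihood of one individual's responses given the random effect u.

    y  : binary responses (0/1) over time
    xb : linear predictors x_t'beta over time (assumed already computed)
    """
    return math.prod(
        norm_cdf((2 * yt - 1) * (xbt + sigma * u)) for yt, xbt in zip(y, xb)
    )

def lik_gauss_hermite(y, xb, sigma, n_nodes=20):
    """Approximate the integral over u ~ N(0, 1) by Gauss-Hermite quadrature."""
    z, w = np.polynomial.hermite.hermgauss(n_nodes)
    # Change of variable u = sqrt(2) * z maps the weight exp(-z^2) to the
    # standard normal density, leaving a factor 1 / sqrt(pi).
    total = sum(
        wk * cond_lik(y, xb, sigma, math.sqrt(2.0) * zk) for zk, wk in zip(z, w)
    )
    return total / math.sqrt(math.pi)

def lik_monte_carlo(y, xb, sigma, n_draws=20000, seed=0):
    """Approximate the same integral by averaging over pseudo-random draws."""
    rng = np.random.default_rng(seed)
    draws = rng.standard_normal(n_draws)
    return sum(cond_lik(y, xb, sigma, ur) for ur in draws) / n_draws

# Hypothetical toy data for a single individual observed over four periods.
y = [1, 0, 1, 1]
xb = [0.3, -0.2, 0.5, 0.1]
sigma = 1.0

gh = lik_gauss_hermite(y, xb, sigma)
mc = lik_monte_carlo(y, xb, sigma)
print("Gauss-Hermite:", gh)
print("Monte Carlo:  ", mc)
```

The two values are close but not identical, and the Monte Carlo value changes with the seed and the number of draws. Small discrepancies of this kind, accumulated over all individuals and passed through an optimizer, are one plausible route to the package-to-package differences described above.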

References

  1. Arismendi JC (2013). Multivariate truncated moments, Journal of Multivariate Analysis, 117, 41-75. https://doi.org/10.1016/j.jmva.2013.01.007
  2. Bates D, Mächler M, Bolker BM, and Walker SC (2015). Fitting linear mixed-effects models using lme4, Journal of Statistical Software, 67, 1-48.
  3. Byrd RH, Lu P, Nocedal J, and Zhu C (1995). A limited memory algorithm for bound constrained optimization, SIAM Journal on Scientific Computing, 16, 1190-1208. https://doi.org/10.1137/0916069
  4. Celeux G and Diebolt J (1985). The SEM algorithm: a probabilistic teacher algorithm derived from the EM algorithm for the mixture problem, Computational Statistics Quarterly, 2, 73-82.
  5. Celeux G, Chauveau D, and Diebolt J (1996). Stochastic versions of the EM algorithm: an experimental study in the mixture case, Journal of Statistical Computation and Simulation, 55, 287-314. https://doi.org/10.1080/00949659608811772
  6. Chan JSK and Kuk AYC (1997). Maximum likelihood estimation for probit-linear mixed models with correlated random effects, Biometrics, 53, 86-97. https://doi.org/10.2307/2533099
  7. Croissant Y (2017). Package pglm: Panel generalized linear models, R package version 0.2-1. Available from: https://cran.r-project.org/web/packages/pglm/pglm.pdf
  8. Eddelbuettel D, Francois R, Allaire J, Ushey K, Kou Q, Russell N, Bates D, and Chambers J (2018). Seamless R and C++ Integration. Available from: http://dirk.eddelbuettel.com/code/rcpp.html
  9. Eddelbuettel D and Sanderson C (2014). RcppArmadillo: accelerating R with high-performance C++ linear algebra, Computational Statistics and Data Analysis, 71, 1054-1063. https://doi.org/10.1016/j.csda.2013.02.005
  10. Halton JH (1964). Radical-inverse quasi-random point sequence, Communications of the ACM, 7, 701-702. https://doi.org/10.1145/355588.365104
  11. Harris MN, Macquarie LM, and Siouclis AJ (2000). A comparison of alternative estimators for binary panel probit models, Melbourne Institute Working Paper, No 3. ISSN 1328-4991.
  12. Hotelling H (1936). Relations between two sets of variates, Biometrika, 28, 321-377. https://doi.org/10.1093/biomet/28.3-4.321
  13. Kan R and Robotti C (2017). On Moments of Folded and Truncated Multivariate Normal Distributions, Unpublished manuscript. Available from: https://sites.google.com/site/cesarerobotti/kr_JCGS.pdf
  14. Kennedy JWJ and Gentle JE (1980). Statistical Computing, Marcel Dekker, Inc.
  15. Lancaster T (2000). The incidental parameter problem since 1948, Journal of Econometrics, 95, 391-413. https://doi.org/10.1016/S0304-4076(99)00044-5
  16. Marquardt DW (1963). An algorithm for least squares estimation of nonlinear parameters, Journal of the Society for Industrial and Applied Mathematics, 11, 431-441. https://doi.org/10.1137/0111030
  17. Manjunath BG and Wilhelm S (2009). Moments calculation for the double truncated multivariate normal density (Working Paper). Available from: http://ssrn.com/abstract=1472153
  18. McCulloch CE (1994). Maximum likelihood variance components estimation for binary data, Journal of the American Statistical Association, 89, 330-335. https://doi.org/10.1080/01621459.1994.10476474
  19. McCulloch CE (1996). Fixed and random effects and best prediction. In Proceedings of the Kansas State Conference on Applied Statistics in Agriculture.
  20. McFadden D and Ruud PA (1994). Estimation by simulation, The Review of Economics and Statistics, 76, 591-608. https://doi.org/10.2307/2109765
  21. Pinheiro J, Bates D, DebRoy S, and Sarkar D (2015). nlme: Linear and nonlinear mixed effects models. R package version 3.1-122. Available from: http://CRAN.R-project.org/package=nlme
  22. Sarrias M (2016). Discrete choice models with random parameters in R: The Rchoice Package, Journal of Statistical Software, 74, 1-31. https://doi.org/10.18637/jss.v074.i10
  23. Searle SR, Casella G, and McCulloch CE (2006). Variance Components, John Wiley & Sons, New York.
  24. Wei GCG and Tanner MA (1990). A Monte Carlo implementation of the EM algorithm and the poor man's data augmentation algorithms, Journal of the American Statistical Association, 85, 699-704. https://doi.org/10.1080/01621459.1990.10474930
  25. Wilhelm S (2015). Package tmvtnorm: Truncated Multivariate Normal and Student t Distribution. Available from: https://cran.r-project.org/web/packages/tmvtnorm/tmvtnorm.pdf
  26. Wolfinger R and O'Connell M (1993). Generalized linear mixed models: a pseudo-likelihood approach, Journal of Statistical Computation and Simulation, 48, 233-243. https://doi.org/10.1080/00949659308811554
  27. Zhang H, Lu N, Feng C, Thurston SW, Xia Y, Zhu L, and Tu XM (2011). On fitting generalized linear mixed-effects models for binary responses using different statistical packages, Statistics in Medicine, 30, 2562-2572. https://doi.org/10.1002/sim.4265