The skew-t censored regression model: parameter estimation via an EM-type algorithm

  • Lachos, Victor H. (Department of Statistics, University of Connecticut) ;
  • Bazan, Jorge L. (Department of Applied Mathematics and Statistics, Universidade de Sao Paulo) ;
  • Castro, Luis M. (Department of Statistics, Pontificia Universidad Catolica de Chile) ;
  • Park, Jiwon (Department of Statistics, University of Connecticut)
  • Received : 2021.10.19
  • Accepted : 2022.04.19
  • Published : 2022.05.31

Abstract

The skew-t distribution is an attractive family of asymmetric, heavy-tailed densities that includes the normal, skew-normal and Student's t distributions as special cases. In this work, we propose an EM-type algorithm for computing the maximum likelihood estimates for skew-t linear regression models with censored responses. In contrast with previous proposals, this algorithm uses analytical expressions at the E-step rather than Monte Carlo simulations. These expressions rely on formulas for the mean and variance of a truncated skew-t distribution and can be computed using the R package MomTrunc. The standard errors, the predictions of unobserved values of the response and the log-likelihood function are obtained as by-products. The proposed methodology is illustrated through the analysis of simulated data and a real-data application to the Letter-Name Fluency test in Peruvian students.
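To make the idea of an analytical E-step concrete, the following is a minimal sketch for the normal special case only, not the skew-t algorithm proposed in the paper. It assumes a simple linear regression with one covariate and left-censored responses, and it replaces each censored response by the closed-form mean and variance of a truncated normal distribution instead of Monte Carlo draws. All function names (`upper_truncated_moments`, `em_censored_normal`, `simulate`) are hypothetical and chosen for illustration; in the skew-t setting the corresponding truncated moments would come from formulas such as those implemented in MomTrunc.

```python
import math
import random


def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard normal distribution function (erfc is more accurate in the lower tail)."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def upper_truncated_moments(mu, sigma, v):
    """Mean and variance of Y ~ N(mu, sigma^2) conditional on Y <= v."""
    a = (v - mu) / sigma
    # Guard against division by zero when v lies far below mu.
    lam = norm_pdf(a) / max(norm_cdf(a), 1e-300)
    mean = mu - sigma * lam
    var = sigma * sigma * (1.0 - a * lam - lam * lam)
    return mean, max(var, 0.0)


def em_censored_normal(x, v, cens, n_iter=200, tol=1e-8):
    """EM for y_i = b0 + b1 * x_i + e_i with e_i ~ N(0, s2).

    v[i] is the observed response, or the censoring limit when cens[i] is True
    (left censoring: the true y_i is only known to satisfy y_i <= v[i]).
    """
    n = len(x)
    b0, b1, s2 = 0.0, 0.0, 1.0  # crude starting values (an assumption of this sketch)
    xbar = sum(x) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    for _ in range(n_iter):
        sigma = math.sqrt(s2)
        # E-step: closed-form conditional mean and variance of each response.
        yhat, yvar = [], []
        for xi, vi, ci in zip(x, v, cens):
            if ci:
                m, w = upper_truncated_moments(b0 + b1 * xi, sigma, vi)
            else:
                m, w = vi, 0.0  # fully observed response
            yhat.append(m)
            yvar.append(w)
        # M-step: least squares on the imputed responses, then update the variance.
        ybar = sum(yhat) / n
        new_b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, yhat)) / sxx
        new_b0 = ybar - new_b1 * xbar
        new_s2 = sum(
            (yi - new_b0 - new_b1 * xi) ** 2 + wi
            for xi, yi, wi in zip(x, yhat, yvar)
        ) / n
        change = abs(new_b0 - b0) + abs(new_b1 - b1) + abs(new_s2 - s2)
        b0, b1, s2 = new_b0, new_b1, new_s2
        if change < tol:
            break
    return b0, b1, s2


def simulate(n, b0, b1, sigma, limit, seed=0):
    """Simulate left-censored data: responses at or below `limit` are recorded as `limit`."""
    rng = random.Random(seed)
    x, v, cens = [], [], []
    for _ in range(n):
        xi = rng.uniform(-1.0, 1.0)
        yi = b0 + b1 * xi + rng.gauss(0.0, sigma)
        x.append(xi)
        cens.append(yi <= limit)
        v.append(max(yi, limit))
    return x, v, cens
```

A call such as `em_censored_normal(*simulate(2000, 1.0, 2.0, 1.0, 0.5))` returns estimates of the intercept, slope and error variance. The variance update adds the conditional variance of each censored response to the squared residual, which is the term a naive imputation of the conditional mean alone would omit.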

Acknowledgement

We thank the associate editor and two anonymous referees for their important comments and suggestions, which led to an improvement of this paper. Jorge L. Bazan acknowledges support from FAPESP-Brazil (Grant 2021/11720-0). L. M. Castro acknowledges support from Grant FONDECYT 1220799 from the Chilean government.
