An efficient algorithm for the non-convex penalized multinomial logistic regression

  • Kwon, Sunghoon (Department of Applied Statistics, Konkuk University);
  • Kim, Dongshin (Graziadio School of Business and Management, Pepperdine University);
  • Lee, Sangin (Department of Information and Statistics, Chungnam National University)
  • Received: 2019.10.17
  • Accepted: 2019.11.26
  • Published: 2020.01.31

Abstract

In this paper, we introduce an efficient algorithm for non-convex penalized multinomial logistic regression that can be applied uniformly to a class of non-convex penalties. The class includes most non-convex penalties, such as the smoothly clipped absolute deviation (SCAD), minimax concave, and bridge penalties. The algorithm is based on the concave-convex procedure and the modified local quadratic approximation algorithm. However, the usual quadratic approximation can slow down computation because the dimension of the Hessian matrix grows with the number of categories of the output variable. To address this issue, we replace the Hessian in the quadratic approximation with a uniform bound. The algorithm is available in the R package ncpen, developed by the authors. Numerical studies on simulated and real data sets are provided for illustration.
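
The uniform bound the abstract alludes to is the classical result of Böhning (1992; reference 1 below): for a multinomial logit model with $K+1$ response categories ($K$ non-reference categories), the Hessian of the negative log-likelihood is dominated, uniformly in the coefficient vector $\beta$, by a fixed matrix,

$$
-\nabla^2 \ell(\beta) \preceq B = \frac{1}{2}\left(I_K - \frac{\mathbf{1}_K \mathbf{1}_K^{\top}}{K+1}\right) \otimes X^{\top}X,
$$

where $X$ is the design matrix, $\mathbf{1}_K$ is the $K$-vector of ones, and $\otimes$ denotes the Kronecker product. Using $B$ in place of the exact Hessian turns each quadratic approximation step into a penalized least-squares problem whose matrix can be factorized once and reused across iterations. This display is a sketch of that standard device, not an excerpt from the paper.

For readers who want to try the method, below is a minimal usage sketch of the ncpen package. The data are simulated purely for illustration, and the call assumes the package's documented interface (arguments y.vec, x.mat, family, and penalty).

```r
# Minimal sketch assuming the documented ncpen() interface;
# the simulated data are not from the paper's numerical studies.
library(ncpen)

set.seed(1)
n <- 200; p <- 10; k <- 3                    # k response categories
x.mat <- matrix(rnorm(n * p), n, p)          # design matrix
beta <- matrix(0, p, k); beta[1:3, ] <- 1    # three active covariates
prob <- exp(x.mat %*% beta)
prob <- prob / rowSums(prob)                 # multinomial class probabilities
y.vec <- apply(prob, 1, function(pr) sample(1:k, 1, prob = pr))

# SCAD-penalized multinomial logistic regression over a lambda path;
# penalty = "mcp" or "mbridge" selects other non-convex penalties.
fit <- ncpen(y.vec = y.vec, x.mat = x.mat,
             family = "multinomial", penalty = "scad")
head(coef(fit))                              # coefficients along the path
```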

References

  1. Böhning D (1992). Multinomial logistic regression algorithm, Annals of the Institute of Statistical Mathematics, 44, 197-200. https://doi.org/10.1007/BF00048682
  2. Bondell HD, Krishna A, and Ghosh SK (2010). Joint variable selection for fixed and random effects in linear mixed-effects models, Biometrics, 66, 1069-1077. https://doi.org/10.1111/j.1541-0420.2010.01391.x
  3. Cawley GC, Talbot NL, and Girolami M (2007). Sparse multinomial logistic regression via Bayesian L1 regularisation. In Advances in Neural Information Processing Systems, 209-216.
  4. Chen L, Yang J, Li J, and Wang X (2014). Multinomial regression with elastic net penalty and its grouping effect in gene selection, Abstract and Applied Analysis, 2014, Hindawi.
  5. Fan J and Li R (2001). Variable selection via nonconcave penalized likelihood and its oracle properties, Journal of the American Statistical Association, 96, 1348-1360. https://doi.org/10.1198/016214501753382273
  6. Fan J and Lv J (2011). Nonconcave penalized likelihood with NP-dimensionality, IEEE Transactions on Information Theory, 57, 5467-5484. https://doi.org/10.1109/TIT.2011.2158486
  7. Fan J and Peng H (2004). Nonconcave penalized likelihood with a diverging number of parameters, The Annals of Statistics, 32, 928-961. https://doi.org/10.1214/009053604000000256
  8. Friedman J, Hastie T, and Tibshirani R (2010). Regularization paths for generalized linear models via coordinate descent, Journal of Statistical Software, 33, 1-22.
  9. Hoerl AE and Kennard RW (1970). Ridge regression: Biased estimation for nonorthogonal problems, Technometrics, 12, 55-67. https://doi.org/10.1080/00401706.1970.10488634
  10. Huang J, Breheny P, Lee S, Ma S, and Zhang CH (2016). The Mnet method for variable selection, Statistica Sinica, 26, 903-923.
  11. Huang J, Horowitz JL, and Ma S (2008). Asymptotic properties of bridge estimators in sparse high-dimensional regression models, The Annals of Statistics, 36, 587-613. https://doi.org/10.1214/009053607000000875
  12. Huang J, Horowitz JL, and Wei F (2010). Variable selection in nonparametric additive models, The Annals of Statistics, 38, 2282-2313. https://doi.org/10.1214/09-AOS781
  13. Huttunen H, Yancheshmeh FS, and Chen K (2016). Car type recognition with deep neural networks. In 2016 IEEE Intelligent Vehicles Symposium (IV), 1115-1120, IEEE.
  14. Kim J, Kim Y, and Kim Y (2008). A gradient-based optimization algorithm for lasso, Journal of Computational and Graphical Statistics, 17, 994-1009. https://doi.org/10.1198/106186008X386210
  15. Kim Y, Kwon S, and Song SH (2006). Multiclass sparse logistic regression for classification of multiple cancer types using gene expression data, Computational Statistics & Data Analysis, 51, 1643-1655. https://doi.org/10.1016/j.csda.2006.06.007
  16. Krishnapuram B, Carin L, Figueiredo MA, and Hartemink AJ (2005). Sparse multinomial logistic regression: Fast algorithms and generalization bounds, IEEE Transactions on Pattern Analysis and Machine Intelligence, 27, 957-968. https://doi.org/10.1109/TPAMI.2005.127
  17. Kwon S and Kim Y (2012). Large sample properties of the SCAD-penalized maximum likelihood estimation on high dimensions, Statistica Sinica, 22, 629-653.
  18. Kwon S, Kim Y, and Choi H (2013). Sparse bridge estimation with a diverging number of parameters, Statistics and Its Interface, 6, 231-242. https://doi.org/10.4310/SII.2013.v6.n2.a7
  19. Kwon S, Lee S, and Kim Y (2015). Moderately clipped lasso, Computational Statistics & Data Analysis, 92, 53-67. https://doi.org/10.1016/j.csda.2015.07.001
  20. Kwon S, Oh S, and Lee Y (2016). The use of random-effect models for high-dimensional variable selection problems, Computational Statistics & Data Analysis, 103, 401-412. https://doi.org/10.1016/j.csda.2016.05.016
  21. Lee S, Kwon S, and Kim Y (2016). A modified local quadratic approximation algorithm for penalized optimization problems, Computational Statistics & Data Analysis, 94, 275-286. https://doi.org/10.1016/j.csda.2015.08.019
  22. Lee Y and Oh HS (2014). A new sparse variable selection via random-effect model, Journal of Multivariate Analysis, 125, 89-99. https://doi.org/10.1016/j.jmva.2013.11.016
  23. Shen X, Pan W, and Zhu Y (2012). Likelihood-based selection and sharp parameter estimation, Journal of the American Statistical Association, 107, 223-232. https://doi.org/10.1080/01621459.2011.645783
  24. Simon N, Friedman J, and Hastie T (2013). A blockwise descent algorithm for group-penalized multiresponse and multinomial regression. arXiv preprint arXiv:1311.6529.
  25. Tibshirani R (1996). Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society: Series B (Methodological), 58, 267-288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
  26. Tutz G (2011). Regression for categorical data (Vol. 34), Cambridge University Press.
  27. Tutz G, Pössnecker W, and Uhlmann L (2015). Variable selection in general multinomial logit models, Computational Statistics & Data Analysis, 82, 207-222. https://doi.org/10.1016/j.csda.2014.09.009
  28. Um S, Kim D, Lee S, and Kwon S (2019). On the strong oracle property of concave penalized estimators with infinite penalty derivative at the origin, The Korean Journal of Statistics, under review.
  29. Xie H and Huang J (2009). SCAD-penalized regression in high-dimensional partially linear models, The Annals of Statistics, 37, 673-696. https://doi.org/10.1214/07-AOS580
  30. Yuan M and Lin Y (2006). Model selection and estimation in regression with grouped variables, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68, 49-67. https://doi.org/10.1111/j.1467-9868.2005.00532.x
  31. Yuille AL and Rangarajan A (2002). The concave-convex procedure (CCCP). In Advances in Neural Information Processing Systems, 1033-1040.
  32. Zhang CH (2010). Nearly unbiased variable selection under minimax concave penalty, The Annals of Statistics, 38, 894-942. https://doi.org/10.1214/09-AOS729
  33. Zhang CH and Zhang T (2012). A general theory of concave regularization for high-dimensional sparse estimation problems, Statistical Science, 27, 576-593. https://doi.org/10.1214/12-STS399
  34. Zhao P and Yu B (2006). On model selection consistency of lasso, Journal of Machine Learning Research, 7, 2541-2563.
  35. Zhu J and Hastie T (2004). Classification of gene microarrays by penalized logistic regression, Biostatistics, 5, 427-443. https://doi.org/10.1093/biostatistics/kxg046
  36. Zou H and Hastie T (2005). Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67, 301-320. https://doi.org/10.1111/j.1467-9868.2005.00503.x
  37. Zou H and Li R (2008). One-step sparse estimates in nonconcave penalized likelihood models, The Annals of Statistics, 36, 1509-1533. https://doi.org/10.1214/009053607000000802