http://dx.doi.org/10.29220/CSAM.2020.27.1.129

An efficient algorithm for the non-convex penalized multinomial logistic regression  

Kwon, Sunghoon (Department of Applied Statistics, Konkuk University)
Kim, Dongshin (Graziadio School of Business and Management, Pepperdine University)
Lee, Sangin (Department of Information and Statistics, Chungnam National University)
Publication Information
Communications for Statistical Applications and Methods, v.27, no.1, 2020, pp. 129-140
Abstract
In this paper, we introduce an efficient algorithm for non-convex penalized multinomial logistic regression that applies uniformly to a class of non-convex penalties. The class includes most common non-convex penalties, such as the smoothly clipped absolute deviation (SCAD), minimax concave, and bridge penalties. The algorithm combines the concave-convex procedure with the modified local quadratic approximation algorithm. However, the usual quadratic approximation can be slow because the dimension of the Hessian matrix grows with the number of categories of the output variable. To address this issue, we replace the Hessian in the quadratic approximation with a uniform upper bound. The algorithm is available in the R package ncpen developed by the authors. Numerical studies on simulated and real data sets are provided for illustration.
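The key device here is a curvature bound that does not change across iterations. For a multinomial logit with K+1 categories (K free linear predictors after fixing a reference category), a standard uniform bound on the Hessian of the negative log-likelihood is due to Böhning (1992); whether the paper uses exactly this constant should be checked against the article itself, so the display below is an illustrative sketch rather than the authors' formula:

\[
\nabla^2 \ell(\beta) \;\preceq\; B \;=\; \frac{1}{2}\left(I_K - \frac{1}{K+1}\,\mathbf{1}_K\mathbf{1}_K^\top\right) \otimes X^\top X,
\]

where \(\ell\) is the negative log-likelihood, \(\preceq\) denotes the Loewner (positive semi-definite) order, and \(X\) is the \(n \times p\) design matrix. Because \(B\) does not depend on \(\beta\), it can be factorized once and reused in every quadratic-approximation step, avoiding repeated factorizations of a \(Kp \times Kp\) Hessian.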
Keywords
concave-convex procedure; modified local quadratic approximation algorithm; multinomial logistic regression; non-convex penalty
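Since the abstract points to the R package ncpen, a minimal usage sketch follows. The function and argument names (ncpen, y.vec, x.mat, family, penalty) follow the package's CRAN documentation as we recall it and should be treated as assumptions; consult ?ncpen for the authoritative interface.

    # Minimal sketch: SCAD-penalized multinomial logistic regression
    # with the ncpen package. Argument names are assumptions; see ?ncpen.
    # install.packages("ncpen")
    library(ncpen)

    set.seed(1)
    n <- 200; p <- 10; K <- 3                 # samples, predictors, classes
    x.mat <- matrix(rnorm(n * p), n, p)       # design matrix
    y.vec <- sample(1:K, n, replace = TRUE)   # class labels in {1, ..., K}

    # Fit over a lambda path with the SCAD penalty.
    fit <- ncpen(y.vec = y.vec, x.mat = x.mat,
                 family = "multinomial", penalty = "scad")

    # Coefficient estimates along the regularization path.
    coef.mat <- coef(fit)

Other non-convex penalties covered by the paper (e.g., minimax concave, bridge) should correspond to different values of the penalty argument, per the package documentation.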