http://dx.doi.org/10.29220/CSAM.2020.27.1.129

An efficient algorithm for the non-convex penalized multinomial logistic regression  

Kwon, Sunghoon (Department of Applied Statistics, Konkuk University)
Kim, Dongshin (Graziadio School of Business and Management, Pepperdine University)
Lee, Sangin (Department of Information and Statistics, Chungnam National University)
Publication Information
Communications for Statistical Applications and Methods, v.27, no.1, 2020, pp. 129-140
Abstract
In this paper, we introduce an efficient algorithm for non-convex penalized multinomial logistic regression that applies uniformly to a class of non-convex penalties. The class includes most common non-convex penalties, such as the smoothly clipped absolute deviation (SCAD), minimax concave, and bridge penalties. The algorithm combines the concave-convex procedure with the modified local quadratic approximation algorithm. However, the usual quadratic approximation can be slow because the dimension of the Hessian matrix grows with the number of categories of the output variable. To address this issue, we replace the Hessian in the quadratic approximation with a uniform upper bound. The algorithm is available in the R package ncpen developed by the authors. Numerical studies on simulated and real data sets are provided for illustration.
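The key device here is a curvature bound that does not change across iterations. For a multinomial logit with K+1 categories (K free linear predictors after fixing a reference category), a standard uniform bound on the Hessian of the negative log-likelihood is due to Böhning (1992); whether the paper uses exactly this constant should be checked against the article itself, so the display below is an illustrative sketch rather than the authors' formula:

\[
\nabla^2 \ell(\beta) \;\preceq\; B \;=\; \frac{1}{2}\left(I_K - \frac{1}{K+1}\,\mathbf{1}_K\mathbf{1}_K^\top\right) \otimes X^\top X,
\]

where \(\ell\) is the negative log-likelihood, \(\preceq\) denotes the Loewner (positive semi-definite) order, and \(X\) is the \(n \times p\) design matrix. Because \(B\) does not depend on \(\beta\), it can be factorized once and reused in every quadratic-approximation step, avoiding repeated factorizations of a \(Kp \times Kp\) Hessian.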
Keywords
concave-convex procedure; modified local quadratic approximation algorithm; multinomial logistic regression; non-convex penalty
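Since the abstract points to the R package ncpen, a minimal usage sketch follows. The function and argument names (ncpen, y.vec, x.mat, family, penalty) follow the package's CRAN documentation as we recall it and should be treated as assumptions; consult ?ncpen for the authoritative interface.

    # Minimal sketch: SCAD-penalized multinomial logistic regression
    # with the ncpen package. Argument names are assumptions; see ?ncpen.
    # install.packages("ncpen")
    library(ncpen)

    set.seed(1)
    n <- 200; p <- 10; K <- 3                 # samples, predictors, classes
    x.mat <- matrix(rnorm(n * p), n, p)       # design matrix
    y.vec <- sample(1:K, n, replace = TRUE)   # class labels in {1, ..., K}

    # Fit over a lambda path with the SCAD penalty.
    fit <- ncpen(y.vec = y.vec, x.mat = x.mat,
                 family = "multinomial", penalty = "scad")

    # Coefficient estimates along the regularization path.
    coef.mat <- coef(fit)

Other non-convex penalties covered by the paper (e.g., minimax concave, bridge) should correspond to different values of the penalty argument, per the package documentation.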