Moderately clipped LASSO for the high-dimensional generalized linear model

  • Lee, Sangin (Department of Information and Statistics, Chungnam National University);
  • Ku, Boncho (Korea Institute of Oriental Medicine);
  • Kwon, Sunghoon (Department of Applied Statistics, Konkuk University)
  • Received : 2020.03.23
  • Accepted : 2020.04.22
  • Published : 2020.07.31

Abstract

The least absolute shrinkage and selection operator (LASSO) is a popular method for high-dimensional regression models. The LASSO achieves high prediction accuracy, but it tends to select many irrelevant variables. In this paper, we consider the moderately clipped LASSO (MCL) for the high-dimensional generalized linear model, a hybrid of the LASSO and the minimax concave penalty (MCP). The MCL preserves the advantages of both the LASSO and the MCP: it attains high prediction accuracy while successfully selecting the relevant variables. We prove that the MCL achieves the oracle property under some regularity conditions, even when the number of parameters exceeds the sample size. An efficient algorithm is also provided. Various numerical studies confirm that the MCL is a strong alternative to its competitors.
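For readers unfamiliar with the penalty, the MCL of Kwon et al. (2015) (reference 12 below) can be described, up to notational differences, through its derivative: for tuning parameters λ ≥ γ ≥ 0 and a concavity parameter a > 1,

    \nabla J_{\lambda,\gamma}(t) = \max\{\gamma,\; \lambda - t/a\}, \qquad t > 0,

so the penalty behaves like the MCP with level λ near zero and like the LASSO with the smaller level γ on large coefficients. Integrating gives

    J_{\lambda,\gamma}(t) =
    \begin{cases}
      \lambda t - t^2/(2a), & 0 \le t \le a(\lambda - \gamma), \\
      \gamma t + a(\lambda - \gamma)^2/2, & t > a(\lambda - \gamma).
    \end{cases}

With this form, the univariate problem min_t (1/2)(t - z)^2 + J_{λ,γ}(|t|) for a quadratic loss with unit curvature (e.g., standardized covariates) has a closed-form solution, which is the building block of coordinate-descent or local quadratic approximation algorithms for GLMs (see Lee et al., 2016, reference 13). The Python sketch below illustrates this operator; the function name mcl_threshold is ours, and the code is an illustration under the stated assumptions, not the authors' implementation.

    import numpy as np

    def mcl_threshold(z, lam, gam, a=3.0):
        """Minimize 0.5*(t - z)**2 + J_{lam,gam}(|t|) in closed form,
        assuming lam >= gam >= 0 and a > 1 (MCL penalty as above)."""
        az, s = abs(z), np.sign(z)
        cut = a * lam - (a - 1.0) * gam  # boundary between MCP and LASSO regimes
        if az <= lam:        # dead zone: coefficient is set exactly to zero
            return 0.0
        if az <= cut:        # MCP regime: firm thresholding with reduced bias
            return s * a * (az - lam) / (a - 1.0)
        return s * (az - gam)  # LASSO regime: soft thresholding at level gam

Setting γ = 0 recovers the MCP (firm) thresholding rule, and setting γ = λ recovers the LASSO soft-thresholding rule, which makes the hybrid behavior of the MCL explicit.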

References

  1. Breiman L (1996). Heuristics of instability and stabilization in model selection, The Annals of Statistics, 24, 2350-2383. https://doi.org/10.1214/aos/1032181158
  2. Efron B, Hastie T, Johnstone I, and Tibshirani R (2004). Least angle regression, The Annals of Statistics, 32, 407-499. https://doi.org/10.1214/009053604000000067
  3. Fan J and Li R (2001). Variable selection via nonconcave penalized likelihood and its oracle properties, Journal of the American Statistical Association, 96, 1348-1360. https://doi.org/10.1198/016214501753382273
  4. Fan J and Peng H (2004). Nonconcave penalized likelihood with a diverging number of parameters, The Annals of Statistics, 32, 928-961. https://doi.org/10.1214/009053604000000256
  5. Friedman J, Hastie T, Hofling H, and Tibshirani R (2007). Pathwise coordinate optimization, The Annals of Applied Statistics, 1, 302-332. https://doi.org/10.1214/07-AOAS131
  6. Fu WJ (1998). Penalized regressions: the bridge versus the lasso, Journal of Computational and Graphical Statistics, 7, 397-416. https://doi.org/10.2307/1390712
  7. Huang J, Breheny P, Lee S, Ma S, and Zhang C (2016). The Mnet method for variable selection, Statistica Sinica, 26, 903-923.
  8. Kim Y, Choi H, and Oh HS (2008). Smoothly clipped absolute deviation on high dimensions, Journal of the American Statistical Association, 103, 1665-1673. https://doi.org/10.1198/016214508000001066
  9. Kim Y and Kwon S (2012). Global optimality of nonconvex penalized estimators, Biometrika, 99, 315-325. https://doi.org/10.1093/biomet/asr084
  10. Kwon S and Kim Y (2012). Large sample properties of the SCAD-penalized maximum likelihood estimation on high dimensions, Statistica Sinica, 22, 629-653.
  11. Kwon S, Kim Y, and Choi H (2013). Sparse bridge estimation with a diverging number of parameters, Statistics and Its Interface, 6, 231-242. https://doi.org/10.4310/SII.2013.v6.n2.a7
  12. Kwon S, Lee S, and Kim Y (2015). Moderately clipped lasso, Computational Statistics & Data Analysis, 92, 53-67. https://doi.org/10.1016/j.csda.2015.07.001
  13. Lee S, Kwon S, and Kim Y (2016). A modified local quadratic approximation algorithm for penalized optimization problems, Computational Statistics & Data Analysis, 94, 275-286. https://doi.org/10.1016/j.csda.2015.08.019
  14. Tibshirani R (1996). Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society: Series B (Methodological), 58, 267-288. https://doi.org/10.1111/j.2517-6161.1996.tb02080.x
  15. Yuille AL and Rangarajan A (2003). The concave-convex procedure (CCCP), Neural Computation, 15, 915-936. https://doi.org/10.1162/08997660360581958
  16. Zhang CH (2010). Nearly unbiased variable selection under minimax concave penalty, The Annals of Statistics, 38, 894-942. https://doi.org/10.1214/09-AOS729
  17. Zhang CH and Huang J (2008). The sparsity and bias of the lasso selection in high-dimensional linear regression, The Annals of Statistics, 36, 1567-1594. https://doi.org/10.1214/07-AOS520
  18. Zhang CH and Zhang T (2012). A general theory of concave regularization for high-dimensional sparse estimation problems, Statistical Science, 27, 576-593. https://doi.org/10.1214/12-STS399
  19. Zhao P and Yu B (2006). On model selection consistency of lasso, The Journal of Machine Learning Research, 7, 2541-2563.
  20. Zou H and Hastie T (2005). Regularization and variable selection via the elastic net, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 67, 301-320. https://doi.org/10.1111/j.1467-9868.2005.00503.x