http://dx.doi.org/10.29220/CSAM.2021.28.1.021

High-dimensional linear discriminant analysis with moderately clipped LASSO  

Chang, Jaeho (Department of Applied Statistics, Konkuk University)
Moon, Haeseong (Department of Applied Statistics, Konkuk University)
Kwon, Sunghoon (Department of Applied Statistics, Konkuk University)
Publication Information
Communications for Statistical Applications and Methods, v.28, no.1, 2021, pp. 21-37
Abstract
There is a direct connection between linear discriminant analysis (LDA) and linear regression: the direction vector of the LDA can be obtained by least squares estimation. This connection motivates penalized LDA for high-dimensional models, where the number of predictor variables exceeds the sample size. In this paper, we study penalized LDA for a class of penalties called the moderately clipped LASSO (MCL), which interpolates between the least absolute shrinkage and selection operator (LASSO) and the minimax concave penalty (MCP). We prove that the MCL-penalized LDA correctly identifies the sparsity of the Bayes direction vector with probability tending to one, and numerical studies show that it achieves better finite-sample performance than the LASSO.
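To make the abstract concrete, the display below sketches the two ingredients it alludes to: the penalized least squares formulation that yields the LDA direction, and the MCL penalty itself. This is a sketch based on our reading of Kwon et al. (2015), with $a > 1$ as the concavity parameter and $\lambda \ge \gamma \ge 0$ as tuning parameters; the paper's own notation and normalization may differ.

$$\hat{\beta} = \operatorname*{arg\,min}_{\beta_0,\,\beta}\ \frac{1}{2n}\sum_{i=1}^{n}\left(y_i - \beta_0 - x_i^\top \beta\right)^2 + \sum_{j=1}^{p} J_{\lambda,\gamma}\!\left(|\beta_j|\right),
\qquad
J_{\lambda,\gamma}(t) = \begin{cases} \lambda t - \dfrac{t^2}{2a}, & 0 \le t \le a(\lambda-\gamma),\\[4pt] \gamma t + \dfrac{a(\lambda-\gamma)^2}{2}, & t > a(\lambda-\gamma). \end{cases}$$

Setting $\gamma = \lambda$ makes the second branch apply for all $t > 0$ and recovers the LASSO penalty $\lambda t$; setting $\gamma = 0$ recovers the MCP. Intermediate values of $\gamma$ shrink large coefficients at the LASSO rate $\gamma$ while treating small coefficients like the MCP, which is the sense in which the MCL interpolates between the two.

The regression connection itself can be checked numerically. The following minimal Python sketch (not code from the paper; all names are ours) verifies the classical fact that regressing coded class labels on the predictors by least squares gives a coefficient vector exactly proportional to the LDA direction $\hat{\Sigma}^{-1}(\hat{\mu}_2 - \hat{\mu}_1)$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two-class Gaussian data; low-dimensional (p < n) for illustration.
n, p = 200, 5
Sigma = 0.5 ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))  # AR(1) covariance
L = np.linalg.cholesky(Sigma)
X = np.vstack([rng.standard_normal((n // 2, p)) @ L.T,          # class 1, mean 0
               rng.standard_normal((n // 2, p)) @ L.T + 0.5])   # class 2, mean 0.5
y = np.repeat([-1.0, 1.0], n // 2)                              # coded class labels

# LDA direction: pooled within-class covariance inverse times mean difference.
X1, X2 = X[y < 0], X[y > 0]
S = ((X1 - X1.mean(0)).T @ (X1 - X1.mean(0)) +
     (X2 - X2.mean(0)).T @ (X2 - X2.mean(0))) / (n - 2)
d_lda = np.linalg.solve(S, X2.mean(0) - X1.mean(0))

# Least-squares direction: regress the coded labels on X (with an intercept).
Z = np.hstack([np.ones((n, 1)), X])
d_ls = np.linalg.lstsq(Z, y, rcond=None)[0][1:]

# Proportionality check: the elementwise ratio is constant (up to rounding).
print(d_lda / d_ls)
```

In the high-dimensional case ($p > n$) the pooled covariance is singular and the plain least squares fit is not unique, which is exactly where the penalized formulation above takes over.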
Keywords
high-dimensional LDA; LASSO; MCP; moderately clipped LASSO
References
1 Bickel PJ and Levina E (2004). Some theory for Fisher's linear discriminant function, 'naive Bayes', and some alternatives when there are many more variables than observations, Bernoulli, 10, 989-1010.
2 Burczynski ME, Peterson RL, Twine NC, et al. (2006). Molecular classification of Crohn's disease and ulcerative colitis patients using transcriptional profiles in peripheral blood mononuclear cells, The Journal of Molecular Diagnostics, 8, 51-61.
3 Cai T and Liu W (2011). A direct estimation approach to sparse linear discriminant analysis, Journal of the American Statistical Association, 106, 1566-1577.
4 Casella G (1985). An introduction to empirical Bayes data analysis, The American Statistician, 39, 83-87.
5 Chin K, DeVries S, Fridlyand J, et al. (2006). Genomic and transcriptional aberrations linked to breast cancer pathophysiologies, Cancer Cell, 10, 529-541.
6 Chowdary D, Lathrop J, Skelton J, et al. (2006). Prognostic gene expression signatures can be measured in tissues collected in RNAlater preservative, The Journal of Molecular Diagnostics, 8, 31-39.
7 Clemmensen L, Hastie T, Witten D, and Ersboll B (2011). Sparse discriminant analysis, Technometrics, 53, 406-413.
8 Efron B and Morris C (1975). Data analysis using Stein's estimator and its generalizations, Journal of the American Statistical Association, 70, 311-319.
9 Fan J and Fan Y (2008). High dimensional classification using features annealed independence rules, The Annals of Statistics, 36, 2605-2637.
10 Fan J and Li R (2001). Variable selection via nonconcave penalized likelihood and its oracle properties, Journal of the American Statistical Association, 96, 1348-1360.
11 Fan J and Song R (2010). Sure independence screening in generalized linear models with NP-dimensionality, The Annals of Statistics, 38, 3567-3604.
12 Fan J, Xue L, and Zou H (2014). Strong oracle optimality of folded concave penalized estimation, The Annals of Statistics, 42, 819-849.
13 Fisher RA (1936). The use of multiple measurements in taxonomic problems, Annals of Eugenics, 7, 179-188.
14 Gordon GJ, Jensen RV, Hsiao LL, et al. (2002). Translation of microarray data into clinically relevant cancer diagnostic tests using gene expression ratios in lung cancer and mesothelioma, Cancer Research, 62, 4963-4967.
15 Guo Y, Hastie T, and Tibshirani R (2006). Regularized linear discriminant analysis and its application in microarrays, Biostatistics, 8, 86-100.
16 Hastie T, Tibshirani R, and Friedman J (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer Science & Business Media.
17 Ramey JA (2016). datamicroarray: Collection of Data Sets for Classification. Available from: https://github.com/ramhiser/datamicroarray.
18 Kim D, Lee S, and Kwon S (2020). A unified algorithm for the non-convex penalized estimation: The ncpen package, The R Journal, Accepted.
19 Kim Y, Choi H, and Oh HS (2008). Smoothly clipped absolute deviation on high dimensions, Journal of the American Statistical Association, 103, 1665-1673.
20 Kim Y, Jeon JJ, and Han S (2016). A necessary condition for the strong oracle property, Scandinavian Journal of Statistics, 43, 610-624.
21 Kim Y and Kwon S (2012). Global optimality of nonconvex penalized estimators, Biometrika, 99, 315-325.
22 Krzanowski W, Jonathan P, McCarthy W, and Thomas M (1995). Discriminant analysis with singular covariance matrices: methods and applications to spectroscopic data, Journal of the Royal Statistical Society: Series C (Applied Statistics), 44, 101-115.
23 Kwon S, Lee S, and Kim Y (2015). Moderately clipped LASSO, Computational Statistics & Data Analysis, 92, 53-67.
24 Mai Q, Zou H, and Yuan M (2012). A direct approach to sparse discriminant analysis in ultra-high dimensions, Biometrika, 99, 29-42.
25 Mazumder R, Friedman JH, and Hastie T (2011). SparseNet: Coordinate descent with nonconvex penalties, Journal of the American Statistical Association, 106, 1125-1138.
26 Shao J, Wang Y, Deng X, and Wang S (2011). Sparse linear discriminant analysis by thresholding for high dimensional data, The Annals of Statistics, 39, 1241-1265.
27 Tibshirani R (1996). Regression shrinkage and selection via the lasso, Journal of the Royal Statistical Society: Series B (Methodological), 58, 267-288.
28 Witten DM and Tibshirani R (2011). Penalized classification using Fisher's linear discriminant, Journal of the Royal Statistical Society: Series B (Statistical Methodology), 73, 753-772.
29 Wu MC, Zhang L, Wang Z, Christiani DC, and Lin X (2009). Sparse linear discriminant analysis for simultaneous testing for the significance of a gene set/pathway and gene selection, Bioinformatics, 25, 1145-1151.
30 Yuille AL and Rangarajan A (2002). The concave-convex procedure (CCCP). In Advances in Neural Information Processing Systems, 1033-1040.
31 Zhang CH (2010). Nearly unbiased variable selection under minimax concave penalty, The Annals of Statistics, 38, 894-942.
32 Zhang CH and Huang J (2008). The sparsity and bias of the lasso selection in high-dimensional linear regression, The Annals of Statistics, 36, 1567-1594.
33 Zhao P and Yu B (2006). On model selection consistency of lasso, Journal of Machine Learning Research, 7, 2541-2563.
34 Zou H (2006). The adaptive lasso and its oracle properties, Journal of the American Statistical Association, 101, 1418-1429.