http://dx.doi.org/10.29220/CSAM.2019.26.3.273

Least absolute deviation estimator based consistent model selection in regression

Shende, K.S. (Department of Statistics, Shivaji University)
Kashid, D.N. (Department of Statistics, Shivaji University)

Publication Information

Communications for Statistical Applications and Methods / v.26, no.3, 2019, pp. 273-293

Abstract

We consider the problem of model selection in multiple linear regression in the presence of outliers and non-normal error distributions. We propose a robust model selection criterion based on the least absolute deviation (LAD) estimator and show that it is consistent. We also suggest algorithms based on the proposed criterion that are suitable when the model contains a large number of predictors. For large sample sizes, these algorithms select only the relevant predictor variables with probability one. An extensive simulation study shows that the criterion performs well and that the algorithms remain effective in the presence of outliers, non-normal errors, and multicollinearity. The criterion is also applied to a real data set to examine its applicability.
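To make the idea of LAD-based model selection concrete, here is a minimal sketch in Python. It is not the criterion or the algorithms proposed in the article, which the abstract does not state. As assumptions for illustration only, it approximates the LAD fit by iteratively reweighted least squares, scores each candidate subset with a generic BIC-type penalty, 2n log(SAD/n) + k log(n), where SAD is the sum of absolute residuals and k the number of fitted coefficients, and searches all subsets of three simulated predictors. The function names, the simulated data, and the penalty are all hypothetical.

```python
import itertools
import math
import random


def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def lad_fit(X, y, iters=50, eps=1e-6):
    """Approximate LAD fit via iteratively reweighted least squares (an assumed solver).

    Returns the coefficient vector and the sum of absolute residuals (SAD).
    """
    n, p = len(X), len(X[0])
    w = [1.0] * n
    for _ in range(iters):
        A = [[sum(w[i] * X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
             for a in range(p)]
        c = [sum(w[i] * X[i][a] * y[i] for i in range(n)) for a in range(p)]
        beta = solve(A, c)
        res = [y[i] - sum(X[i][j] * beta[j] for j in range(p)) for i in range(n)]
        # Weights 1/|residual| turn weighted least squares into an L1 objective.
        w = [1.0 / max(abs(r), eps) for r in res]
    return beta, sum(abs(r) for r in res)


def criterion(sad, n, k):
    """Generic BIC-type score based on SAD (illustrative, not the article's criterion)."""
    return 2 * n * math.log(sad / n) + k * math.log(n)


# Simulated data (hypothetical): only the first predictor is relevant,
# and every tenth response is shifted to act as an outlier.
random.seed(0)
n = 100
preds = [[random.gauss(0, 1) for _ in range(3)] for _ in range(n)]
y = [1.0 + 2.0 * preds[i][0] + random.gauss(0, 1) + (25.0 if i % 10 == 0 else 0.0)
     for i in range(n)]

# Exhaustive search over all subsets of the three candidate predictors.
best_score, best_subset, best_beta = float("inf"), None, None
for size in range(4):
    for subset in itertools.combinations(range(3), size):
        X = [[1.0] + [preds[i][j] for j in subset] for i in range(n)]
        beta, sad = lad_fit(X, y)
        score = criterion(sad, n, len(subset) + 1)
        if score < best_score:
            best_score, best_subset, best_beta = score, subset, beta

print("selected predictors:", best_subset)
print("LAD coefficients (intercept first):", best_beta)
```

An exhaustive search like this is only feasible for a handful of predictors, since the number of subsets doubles with each added variable; this is the setting in which the sequential algorithms mentioned in the abstract would be preferred.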

Keywords

linear regression; model selection; consistency; robustness; sequential algorithm