http://dx.doi.org/10.5351/KJAS.2020.33.1.001

A procedure for simultaneous variable selection, variable transformation and outlier identification in linear regression  

Seo, Han Son (Department of Applied Statistics, Konkuk University)
Yoon, Min (Department of Applied Mathematics, Pukyong National University)
Publication Information
The Korean Journal of Applied Statistics / v.33, no.1, 2020, pp. 1-10
Abstract
We propose a unified approach to variable selection, variable transformation, and outlier identification in the linear regression model. The procedure combines a sequential method for outlier detection with a least trimmed squares estimator for variable transformation, and it uses all-possible-subsets regression for model selection. Real data analyses and simulation results are provided to show the efficiency of the method in terms of correct variable selection and the fit of the estimated model.
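As a rough illustration of the all-possible-subsets step only, the sketch below fits ordinary least squares on every non-empty subset of predictors and keeps the subset with the lowest BIC. It is not the authors' procedure: the abstract does not state the selection criterion, so BIC is an assumption here, and the outlier detection and least trimmed squares transformation steps are omitted. The helper names (`solve`, `rss`, `best_subset`) and the toy data are hypothetical.

```python
import itertools
import math


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rss(columns, y):
    """Residual sum of squares of the OLS fit of y on an intercept plus the given columns."""
    n = len(y)
    design = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(design[0])
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(row[j] * row[k] for row in design) for k in range(p)] for j in range(p)]
    xty = [sum(row[j] * yi for row, yi in zip(design, y)) for j in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in design]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def best_subset(x_columns, y):
    """Fit every non-empty subset of predictors; return (bic, subset) with the lowest BIC.

    BIC is an assumed criterion, not one taken from the paper.
    """
    n = len(y)
    best = None
    for k in range(1, len(x_columns) + 1):
        for subset in itertools.combinations(range(len(x_columns)), k):
            r = max(rss([x_columns[j] for j in subset], y), 1e-12)  # guard against log(0)
            bic = n * math.log(r / n) + (k + 1) * math.log(n)
            if best is None or bic < best[0]:
                best = (bic, subset)
    return best


# Toy data (hypothetical): y depends on predictors 0 and 2, not on predictor 1.
idx = range(30)
x0 = [float(i) for i in idx]
x1 = [float((i * 13) % 5) for i in idx]
x2 = [float((i * i) % 7) for i in idx]
noise = [0.1 * ((i * 37) % 11 - 5) for i in idx]
y = [1.0 + 2.0 * a + 3.0 * c + e for a, c, e in zip(x0, x2, noise)]

bic, subset = best_subset([x0, x1, x2], y)
print("selected predictor indices:", subset)
```

The search visits 2^p - 1 models for p candidate predictors, so an exhaustive sketch like this is only practical for a small number of predictors.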
Keywords
linear regression; outliers; response transformation; variable selection
Citations & Related Records
Times Cited By KSCI: 4
1 Atkinson, A. C. (1986). Diagnostic tests for transformation, Technometrics, 28, 29-37.
2 Atkinson, A. C. and Riani, M. (2000). Robust Diagnostic Regression Analysis, Springer, New York.
3 Box, G. E. P. and Cox, D. R. (1964). An analysis of transformations (with discussion), Journal of the Royal Statistical Society, Series B, 26, 211-246.
4 Brownlee, K. A. (1965). Statistical Theory and Methodology in Science and Engineering (2nd ed), Wiley, New York.
5 Carroll, R. J. and Ruppert, D. (1988). Transformation and Weighting in Regression (2nd ed), Wiley, New York.
6 Cheng, T. C. (2005). Robust regression diagnostics with data transformations, Computational Statistics and Data Analysis, 49, 875-891.
7 Daniel, C. and Wood, F. S. (1980). Fitting Equations to Data: Computer Analysis of Multifactor Data, John Wiley & Sons, New York.
8 Dupuis, D. J. and Victoria-Feser, M. P. (2011). Fast robust model selection in large datasets, Journal of the American Statistical Association, 106, 203-212.
9 Dupuis, D. J. and Victoria-Feser, M. P. (2013). Robust VIF regression with application to variable selection in large data sets, Annals of Applied Statistics, 7, 319-341.
10 Gottardo, R. and Raftery, A. (2009). Bayesian robust transformation and variable selection: a unified approach, Canadian Journal of Statistics, 37, 361-380.
11 Hadi, A. S. and Luceno, A. (1997). Maximum trimmed likelihood estimators: a unified approach, examples, and algorithms, Computational Statistics and Data Analysis, 25, 251-272.
12 Hadi, A. S. and Simonoff, J. S. (1993). Procedures for the identification of multiple outliers in linear models, Journal of the American Statistical Association, 88, 1264-1272.
13 Parker, I. (1988). Transformations and influential observations in minimum sum of absolute errors regression, Technometrics, 30, 215-220.
14 Hoeting, J., Raftery, A. E., and Madigan, D. (1996). A method for simultaneous variable selection and outlier identification in linear regression, Computational Statistics and Data Analysis, 22, 251-270.
15 Kim, S., Park, S. H., and Krzanowski, W. J. (2008). Simultaneous variable selection and outlier identification in linear regression using the mean-shift outlier model, Journal of Applied Statistics, 35, 283-291.
16 McCann, L. and Welsch, R. E. (2007). Robust variable selection using least angle regression and elemental set sampling, Computational Statistics and Data Analysis, 52, 249-257.
17 Ryan, T. A., Joiner, B. L., and Ryan, B. F. (1976). Minitab Student Handbook, Duxbury Press, Mass.
18 Sakia, R. M. (1992). The Box-Cox transformation technique: a review, The Statistician, 41, 169-178.
19 Seo, H. S. (2018). Fast robust variable selection using VIF regression in large datasets, The Korean Journal of Applied Statistics, 31, 463-473.
20 Seo, H. S. (2019). Unified methods for variable selection and outlier detection in linear regression, Communications for Statistical Applications and Methods, 26, 575-582.
21 Atkinson, A. C. (1985). Plots, Transformations and Regression: An Introduction to Graphical Methods of Diagnostic Regression Analysis, Oxford University Press, Oxford.
22 Seo, H. S., Lee, G. Y., and Yoon, M. (2012). Robust response transformation using outlier detection in regression model, The Korean Journal of Applied Statistics, 25, 205-213.
23 Wisnowski, J. W., Simpson, J. R., Montgomery, D. C., and Runger, G. C. (2003). Resampling methods for variable selection in robust regression, Computational Statistics and Data Analysis, 43, 341-355.
24 Zhou, J., Foster, D. P., and Ungar, L. H. (2006). Streamwise feature selection, Journal of Machine Learning Research, 7, 1861-1885.
25 Yeo, I. (2005). Variable selection and transformation in linear regression models, Statistics and Probability Letters, 72, 219-226.