http://dx.doi.org/10.5351/KJAS.2019.32.6.889

Robust multiple imputation method for missings with boundary and outliers  

Park, Yousung (Department of Statistics, Korea University)
Oh, Do Young (Department of Statistics, Korea University)
Kwon, Tae Yeon (Department of International Finance, Hankuk University of Foreign Studies)
Publication Information
The Korean Journal of Applied Statistics / v.32, no.6, 2019, pp. 889-898
Abstract
Imputing missing values for survey variables subject to item nonresponse becomes complicated when outliers and logical boundary conditions imposed by other survey items cannot be ignored. If a variable with missing values also has outliers and boundaries, values imputed by existing regression-based methods are likely to be biased and to violate the boundary conditions. In this paper, we address these difficulties by combining various robust regression models with multiple imputation methods. Through a simulation study covering various scenarios of outliers and boundaries, we identify and discuss the optimal combination of robust regression model and multiple imputation method.
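As a rough illustration of pairing a robust regression fit with boundary-respecting imputation, the Python sketch below may help. It is not the authors' procedure: it assumes a Huber-type M-estimator fitted by iteratively reweighted least squares and draws imputations from a normal distribution truncated to the bounds by rejection sampling. The function names `huber_irls` and `impute_bounded`, the tuning constant `c=1.345`, and the choice of `m=5` imputations are all assumptions made for this example. For brevity the sketch also ignores parameter uncertainty, which a proper Bayesian multiple imputation would propagate by drawing the regression parameters from their posterior.

```python
import numpy as np

def huber_irls(X, y, c=1.345, n_iter=50, tol=1e-8):
    """Huber M-estimate of regression coefficients via IRLS (illustrative)."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # ordinary least squares start
    scale = 1.0
    for _ in range(n_iter):
        resid = y - X @ beta
        # Robust residual scale: normalized median absolute deviation
        mad = np.median(np.abs(resid - np.median(resid)))
        scale = max(mad / 0.6745, 1e-12)
        u = np.abs(resid) / scale
        # Huber weights: 1 inside the threshold, c/u outside it
        w = np.where(u <= c, 1.0, c / np.maximum(u, 1e-12))
        sw = np.sqrt(w)
        new_beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta, scale

def impute_bounded(X_obs, y_obs, X_mis, lower, upper, m=5, seed=0):
    """Return an (m, n_missing) array of imputations lying in [lower, upper]."""
    rng = np.random.default_rng(seed)
    beta, scale = huber_irls(X_obs, y_obs)
    mean = X_mis @ beta
    draws = np.empty((m, len(mean)))
    for k in range(m):
        vals = rng.normal(mean, scale)
        bad = (vals < lower) | (vals > upper)
        tries = 0
        # Rejection sampling: redraw only the values that violate the bounds
        while bad.any() and tries < 1000:
            vals[bad] = rng.normal(mean[bad], scale)
            bad = (vals < lower) | (vals > upper)
            tries += 1
        # Fallback for cases where rejection sampling did not succeed
        draws[k] = np.clip(vals, lower, upper)
    return draws
```

Here `lower` and `upper` may be scalars or arrays with one entry per missing case, so a bound derived from another survey item can differ from respondent to respondent.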
Keywords
breakdown point; robust regression; Bayesian multiple imputation