http://dx.doi.org/10.5351/KJAS.2017.30.3.377

Comparison of imputation methods for item nonresponses in a panel study  

Lee, Hyejung (Department of Statistics, Korea University)
Song, Juwon (Department of Statistics, Korea University)
Publication Information
The Korean Journal of Applied Statistics / v.30, no.3, 2017, pp. 377-390
Abstract
In a survey, item nonresponse occurs when a respondent does not answer some of the items. Because analysis based only on completely observed data may yield biased results, imputation is often used so that the data can be analyzed in complete form. A panel study is a survey design that examines changes in responses over time. In panel studies, practitioners have tended to prefer imputing item nonresponses with information from responses in previous waves; however, little research has been conducted to support this preference. This study therefore compares the performance of imputation methods according to whether or not they use information from previous waves of the panel study. Among imputation methods that use previous responses, we consider ratio imputation, imputation based on the linear mixed model, and imputation based on the Bayesian linear mixed model. We compare the results of these methods with those of methods that do not use previous responses, namely mean imputation and hot deck imputation. Simulation results show that imputation based on the Bayesian linear mixed model performs best, yielding small biases and high coverage rates of the 95% confidence interval even at higher nonresponse rates.
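To make the contrast between the two families of methods concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the common textbook forms of the two simplest methods named in the abstract: mean imputation fills a missing current-wave value with the mean of the observed current-wave values, while ratio imputation scales the unit's previous-wave value by the ratio of current-wave to previous-wave totals among current-wave respondents. The function names and the toy data are hypothetical, and the previous wave is assumed to be fully observed.

```python
from statistics import mean


def mean_impute(current):
    """Fill missing (None) values with the mean of the observed current-wave values."""
    observed = [y for y in current if y is not None]
    fill = mean(observed)
    return [fill if y is None else y for y in current]


def ratio_impute(previous, current):
    """Fill missing (None) current-wave values with previous-wave value * R.

    R is the ratio of current-wave to previous-wave totals, computed over
    units that responded in the current wave (an assumed textbook form).
    """
    pairs = [(p, c) for p, c in zip(previous, current) if c is not None]
    ratio = sum(c for _, c in pairs) / sum(p for p, _ in pairs)
    return [p * ratio if c is None else c for p, c in zip(previous, current)]


# Hypothetical toy data: four units, the third is an item nonrespondent in the current wave.
previous = [100.0, 200.0, 300.0, 400.0]
current = [110.0, 220.0, None, 440.0]

mean_imputed = mean_impute(current)              # ignores the previous wave
ratio_imputed = ratio_impute(previous, current)  # uses the previous wave
```

In this toy case every respondent's value grew by 10% between waves, so ratio imputation fills the gap with roughly 330, whereas mean imputation fills it with roughly 256.7 regardless of the unit's own history.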
Keywords
imputation; panel data; linear mixed model; ratio imputation; Korean Labor and Income Panel Study