• Title/Summary/Keyword: Imputation method

A Combined Method Compensating for Wave Nonresponse

  • Park, Jinwoo
    • Journal of the Korean Statistical Society
    • /
    • v.31 no.4
    • /
    • pp.469-482
    • /
    • 2002
  • This paper proposes a new method of compensating for wave nonresponse in panel surveys that combines weighting adjustment and imputation. Deleting the less frequent nonresponse patterns simplifies the procedure. A new mean estimator under the combined method is provided, and a limited simulation study using real data is conducted.

A Study on the efficiency of the MCMC multiple imputation In LDA (선형판별분석에서 MCMC다중대체법의 효율에 관한 연구)

  • Yoo, Hee-Kyung;Kim, Myung-Cheol
    • Journal of the Korea Safety Management & Science
    • /
    • v.11 no.3
    • /
    • pp.189-198
    • /
    • 2009
  • This study examines two imputation methods for incomplete observations: the MCMC method and the EM algorithm. Their performance in linear (or quadratic) discriminant analysis is evaluated under various types of incomplete observations. Using simulated experiments, the two methods are compared in terms of the probability of misclassification and the RMSE. The cases considered differ in missing rate, sample size, and the distance between the two classification groups. The results show that both the probability of misclassification and the RMSE are lower for the EM algorithm than for the MCMC method, so imputation using the EM algorithm is the more efficient of the two. However, when the sample size is small and the missing rate is extremely high, omitting all observation vectors with missing values from the analysis gives a lower probability of misclassification than either the EM algorithm or the MCMC method.

Weighted Hot-Deck Imputation in Farm and Fishery Household Economy Surveys (농어가경제조사에서 가중핫덱 무응답 대체법의 활용)

  • Kim Kyu-Seong;Lee Kee-Jae;Kim Jin
    • The Korean Journal of Applied Statistics
    • /
    • v.18 no.2
    • /
    • pp.311-328
    • /
    • 2005
  • This paper deals with the treatment of nonresponse in the farm and fishery household economy surveys in Korea. Since the samples in both surveys were selected by stratified multi-stage sampling and weighted sample means have been used to estimate the population means, we choose weighted hot-deck imputation as an appropriate method for the two surveys. We investigate the weighted hot-deck procedure as well as an adjusted jackknife method for variance estimation. An empirical study shows that the method works very well for both mean and variance estimation in the two surveys. In addition, we present a procedure for forming imputation classes, form four imputation classes for each survey, and compare them through analysis. As a result, we present the two most efficient imputation classes for the two surveys.
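
As an illustration of the idea in this abstract, the following is a minimal sketch of one common form of weighted hot-deck imputation, in which each missing value is filled from a donor in the same imputation class, drawn with probability proportional to the donor's sampling weight. The function names, the `None` encoding of nonresponse, and the donor-selection rule are assumptions for illustration; the abstract does not specify the paper's exact procedure, and the adjusted jackknife variance estimator is not shown.

```python
import random


def weighted_hot_deck(values, weights, classes, seed=0):
    """Fill None entries with a donor value from the same imputation class.

    Assumption: donors are drawn with probability proportional to their
    sampling weight. Inputs are parallel lists, one entry per sample unit.
    """
    rng = random.Random(seed)
    imputed = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        # Respondents in the same imputation class are the candidate donors.
        donors = [j for j, u in enumerate(values)
                  if u is not None and classes[j] == classes[i]]
        if not donors:
            continue  # no respondent in this class: leave the value missing
        donor_weights = [weights[j] for j in donors]
        pick = rng.choices(donors, weights=donor_weights, k=1)[0]
        imputed[i] = values[pick]
    return imputed


def weighted_mean(values, weights):
    """Weighted sample mean over the non-missing entries."""
    pairs = [(v, w) for v, w in zip(values, weights) if v is not None]
    return sum(v * w for v, w in pairs) / sum(w for _, w in pairs)


if __name__ == "__main__":
    income = [10.0, None, 12.0, 30.0, None]
    weight = [1, 2, 1, 3, 1]
    stratum = ["a", "a", "a", "b", "b"]
    filled = weighted_hot_deck(income, weight, stratum, seed=1)
    print(filled, weighted_mean(filled, weight))
```

Restricting donors to the recipient's imputation class is what makes the choice of classes matter, which is why the paper compares alternative class definitions.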

On the Use of Weighted k-Nearest Neighbors for Missing Value Imputation (Weighted k-Nearest Neighbors를 이용한 결측치 대치)

  • Lim, Chanhui;Kim, Dongjae
    • The Korean Journal of Applied Statistics
    • /
    • v.28 no.1
    • /
    • pp.23-31
    • /
    • 2015
  • The k-Nearest Neighbor (KNN) method is commonly used as a simple imputation method for missing values in statistical analysis. When one of the k nearest neighbors is an extreme value or outlier, the KNN method can introduce bias. In this paper, we propose a Weighted k-Nearest Neighbors (WKNN) imputation method that compensates for this weakness of KNN. A Monte Carlo simulation study using a real data set is also conducted to compare the WKNN and KNN methods.
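
To make the contrast between KNN and weighted KNN concrete, here is a minimal sketch that imputes a missing entry as a distance-weighted average of the k nearest complete rows. The inverse-distance weights, the function name `wknn_impute`, and the `None` encoding of missing values are assumptions chosen for illustration; the abstract does not state which weighting scheme the authors use.

```python
import math


def wknn_impute(rows, target, k=3, eps=1e-9):
    """Impute the None entries of rows[target] from its k nearest complete rows.

    Assumption: neighbors are weighted by inverse Euclidean distance, computed
    on the columns that are observed in the target row.
    """
    query = rows[target]
    observed = [c for c, x in enumerate(query) if x is not None]
    complete = [r for i, r in enumerate(rows) if i != target and None not in r]

    def dist(r):
        return math.sqrt(sum((r[c] - query[c]) ** 2 for c in observed))

    nearest = sorted(complete, key=dist)[:k]
    filled = list(query)
    if not nearest:
        return filled  # no complete row to borrow from
    w = [1.0 / (dist(r) + eps) for r in nearest]
    for c, x in enumerate(query):
        if x is None:
            # Distant neighbors (for example outliers) get small weights.
            filled[c] = sum(wi * r[c] for wi, r in zip(w, nearest)) / sum(w)
    return filled


if __name__ == "__main__":
    data = [[1.0, 10.0], [2.0, 20.0], [9.0, 90.0], [1.5, None]]
    print(wknn_impute(data, target=3, k=3))
```

With equal weights the three neighbors above would average to 40 in the second column; the inverse-distance weights pull the imputed value toward the two close neighbors instead.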

Comparing Accuracy of Imputation Methods for Incomplete Categorical Data

  • Shin, Hyung-Won;Sohn, So-Young
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2003.05a
    • /
    • pp.237-242
    • /
    • 2003
  • Various estimation methods have been developed for imputing missing categorical data, including the modal category method, logistic regression, and association rules. In this study, we propose two imputation methods (neural network fusion and voting fusion) that combine the results of the individual imputation methods. A Monte Carlo simulation is used to compare the performance of these methods. Five factors are used to simulate the missing data: (1) the true model for the data, (2) data size, (3) noise size, (4) percentage of missing data, and (5) missing pattern. Overall, neural network fusion performed best, while voting fusion was better than the individual imputation methods but inferior to neural network fusion. An additional analysis of real data confirms the simulation result.
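
The two simplest ingredients named in this abstract can be sketched in a few lines: the modal category method fills a missing value with the most frequent observed category, and voting fusion takes a majority vote over the categories proposed by several individual methods. The function names and the tie-breaking rule (prefer the first-listed method) are assumptions for illustration; the abstract does not describe how the authors resolve ties, and neural network fusion is not shown.

```python
from collections import Counter


def modal_category(observed):
    """Return the most frequent category among the observed values."""
    return Counter(observed).most_common(1)[0][0]


def voting_fusion(predictions):
    """Majority vote over the categories proposed by individual imputers.

    Assumption: ties are broken in favor of the earliest method in the list.
    """
    counts = Counter(predictions)
    top = max(counts.values())
    for p in predictions:
        if counts[p] == top:
            return p


if __name__ == "__main__":
    observed = ["low", "high", "low", "mid", "low"]
    # Hypothetical proposals from three individual imputation methods.
    proposals = [modal_category(observed), "high", "high"]
    print(voting_fusion(proposals))
```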

A Study on Automatic Missing Value Imputation Replacement Method for Data Processing in Digital Data (디지털 데이터에서 데이터 전처리를 위한 자동화된 결측 구간 대치 방법에 관한 연구)

  • Kim, Jong-Chan;Sim, Chun-Bo;Jung, Se-Hoon
    • Journal of Korea Multimedia Society
    • /
    • v.24 no.2
    • /
    • pp.245-254
    • /
    • 2021
  • We propose an analysis and prediction model that identifies outliers or abnormalities in the data and then imputes missing values effectively and rapidly. The model is expected to analyze problems in the data efficiently, based on the calibrated raw data. A system that can make adequate use of the data was constructed using the introduced KNN + MLE algorithm. This algorithm effectively addresses problems in some existing KNN-based missing data imputation algorithms, such as ignoring the missing values in some data sections or discarding normal observations. To check the effectiveness and efficiency of the proposed algorithm, a comparative evaluation was performed against existing imputation approaches (K-means, KNN, MEI, and MI) under the missing-data mechanisms MCAR, MAR, and NI, and its superiority in all aspects was confirmed.

Comparison of GEE Estimators Using Imputation Methods (대체방법별 GEE추정량 비교)

  • 김동욱;노영화
    • The Korean Journal of Applied Statistics
    • /
    • v.16 no.2
    • /
    • pp.407-426
    • /
    • 2003
  • We consider the missing covariates problem in the generalized estimating equations (GEE) model. If a covariate is partially missing, the GEE cannot be calculated. In this paper, we study the performance of seven imputation methods for handling missing covariates in GEE models, and we investigate the properties of the GEE estimators after the missing covariates are imputed for ordinal repeated-measures data. The seven imputation methods are i) Naive Deletion, ii) Sample Average Imputation, iii) Row Average Imputation, iv) Cross-wave Regression Imputation, v) Carry-over Imputation, vi) Bayesian Bootstrap, and vii) Approximate Bayesian Bootstrap. A Monte Carlo simulation is used to compare the performance of these methods. For the mechanism generating the missing data, we assume ignorable nonresponse. Furthermore, we generate missing covariates with and without considering wave nonresponse patterns.
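
Three of the simpler methods listed above can be sketched on a small panel in which each row is a subject and each column is a wave. The readings used here are assumptions based on the method names only: sample average imputation fills with the mean of the observed values at the same wave, row average imputation fills with the subject's own mean across waves, and carry-over imputation repeats the subject's last observed value. The abstract does not give the paper's definitions, and the function names are hypothetical.

```python
def _mean(xs):
    """Mean of the non-missing entries, or None if all are missing."""
    xs = [x for x in xs if x is not None]
    return sum(xs) / len(xs) if xs else None


def sample_average(panel):
    """Fill each gap with the mean of that wave across subjects."""
    wave_means = [_mean(col) for col in zip(*panel)]
    return [[wave_means[t] if x is None else x for t, x in enumerate(row)]
            for row in panel]


def row_average(panel):
    """Fill each gap with the subject's own mean across waves."""
    return [[_mean(row) if x is None else x for x in row] for row in panel]


def carry_over(panel):
    """Fill each gap with the subject's last observed value, if any."""
    out = []
    for row in panel:
        last, filled = None, []
        for x in row:
            if x is not None:
                last = x
            filled.append(last)  # stays None until a value has been seen
        out.append(filled)
    return out


if __name__ == "__main__":
    panel = [[1.0, None, 3.0],
             [2.0, 4.0, None]]
    print(sample_average(panel))
    print(row_average(panel))
    print(carry_over(panel))
```

The completed covariate matrix would then be passed to the GEE fit in place of the incomplete one; that step is outside the scope of this sketch.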

Robust multiple imputation method for missings with boundary and outliers (한계와 이상치가 있는 결측치의 로버스트 다중대체 방법)

  • Park, Yousung;Oh, Do Young;Kwon, Tae Yeon
    • The Korean Journal of Applied Statistics
    • /
    • v.32 no.6
    • /
    • pp.889-898
    • /
    • 2019
  • Imputing missing values for survey variables with item nonresponse becomes complicated when outliers and logical boundary conditions between survey items cannot be ignored. If a variable with missing values has outliers and boundaries, values imputed by existing regression-based imputation methods are likely to be biased and to violate the boundary conditions. In this paper, we address these difficulties by combining various robust regression models with multiple imputation methods. Through a simulation study covering various scenarios of outliers and boundaries, we identify and discuss the optimal combination of robust regression and multiple imputation method.

Imputation Method using the Space-Time Model in Sample Survey (공간-시계열 모형을 이용한 결측대체 방법에 대한 연구)

  • Lee, Jin-Hee;Shin, Key-Il
    • The Korean Journal of Applied Statistics
    • /
    • v.20 no.3
    • /
    • pp.499-514
    • /
    • 2007
  • It is common practice to use auxiliary variables to impute missing values arising from item nonresponse in surveys. Sometimes few auxiliary variables are available for missing value imputation, but if spatial and time autocorrelations exist, these correlations should be used for better results. Recently, Lee et al. (2006) showed that spatial autocorrelation, when present, could be used efficiently for missing value imputation, using farm household economy data from Gangwon-do, 2002. In this paper, we present an evaluation of spatial and space-time nonresponse imputation methods when both spatial and time autocorrelations exist, using monthly data for 2000-2002 from the same source used by Lee et al. (2006). Numerical simulations show that the space-time imputation method is more efficient than the other.

Imputation method for missing data based on measure of property (특성도를 이용한 결측치 대체방법)

  • Kim, Hyungju;Kim, Dongjae
    • The Korean Journal of Applied Statistics
    • /
    • v.30 no.3
    • /
    • pp.463-473
    • /
    • 2017
  • How to handle missing data is a main issue in clinical trials. Under the intention-to-treat rule, missing data are imputed on the supposition that they follow a particular missing-data mechanism. However, because this supposition is uncertain, using the right imputation method for missing data is very important. We suggest a new imputation method for missing data using agreement and maintenance, introduced by Kang and Kim (1997). We give an example and conduct a Monte Carlo simulation to compare the performance of the established method and the suggested method.