• Title/Summary/Keyword: ratio imputation

Jackknife Variance Estimation under Imputation for Nonrandom Nonresponse with Follow-ups

  • Park, Jinwoo
    • Journal of the Korean Statistical Society / v.29 no.4 / pp.385-394 / 2000
  • This paper provides jackknife variance estimation based on adjusted imputed values for the case where nonresponse is nonrandom and follow-up data are available for a subsample of nonrespondents. Both hot-deck and ratio imputation are considered. The performance of the proposed variance estimator under a nonrandom response mechanism is investigated through numerical simulation.

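To make the idea of jackknife variance estimation under ratio imputation concrete, here is a minimal Python sketch. It is not the estimator from the paper above: it assumes simple random sampling, ignores the follow-up subsample and the nonrandom response mechanism, and simply re-estimates the imputation ratio in every delete-one replicate. The function names `ratio_imputed_mean` and `jackknife_variance` and the toy data are hypothetical.

```python
import numpy as np

def ratio_imputed_mean(x, y, responded):
    """Mean of y after ratio imputation: a missing y_i is replaced by r_hat * x_i."""
    r_hat = y[responded].sum() / x[responded].sum()
    y_completed = np.where(responded, y, r_hat * x)
    return y_completed.mean()

def jackknife_variance(x, y, responded):
    """Delete-one jackknife; the ratio is re-estimated inside each replicate."""
    n = len(y)
    full = ratio_imputed_mean(x, y, responded)
    replicates = np.array([
        ratio_imputed_mean(np.delete(x, i), np.delete(y, i), np.delete(responded, i))
        for i in range(n)
    ])
    return (n - 1) / n * np.sum((replicates - full) ** 2)

# Toy data (assumed): x is fully observed, the third y value is missing.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, np.nan, 8.0])
responded = ~np.isnan(y)

estimate = ratio_imputed_mean(x, y, responded)
variance = jackknife_variance(x, y, responded)
print(estimate, variance)
```

Re-imputing within each replicate is one simple way to let the variance estimate reflect the uncertainty in the imputation ratio; treating the imputed values as if they were observed would tend to understate it.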

Doubly Robust Imputation Using Auxiliary Information (보조 정보에 의한 이중적 로버스트 대체법)

  • Park, Hyeon-Ah;Jeon, Jong-Woo;Na, Seong-Ryong
    • Communications for Statistical Applications and Methods / v.18 no.1 / pp.47-55 / 2011
  • Ratio and regression imputation depend on a model for the survey variable and on the relation between the survey variable and the auxiliary variables. If the model does not hold, unbiasedness of the estimator based on ratio or regression imputation cannot be guaranteed. In this paper, we develop a doubly robust imputation under which the estimator is approximately unbiased whether or not the model assumption is valid. The proposed imputation increases the efficiency of estimation by using population information on the auxiliary variables. A simulation study supports the theoretical results of this paper.
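As a rough illustration of the doubly robust idea, the Python sketch below combines a regression prediction with an inverse-probability-weighted residual correction. This is a generic textbook-style form, not the imputation proposed in the paper, and it does not use population information on the auxiliary variable. The function name `doubly_robust_mean`, the linear outcome model, and the supplied response probabilities are all assumptions made for the example.

```python
import numpy as np

def doubly_robust_mean(x, y, responded, propensity):
    """Regression prediction plus inverse-probability-weighted residual correction."""
    # Outcome model (assumed linear): least-squares fit on respondents only.
    design = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(design[responded], y[responded], rcond=None)
    prediction = design @ beta
    # Residuals are used only for respondents; nonrespondents contribute zero.
    residual = np.where(responded, y - prediction, 0.0)
    return np.mean(prediction + residual / propensity)

# Toy data (assumed): y = 1 + 2x exactly, with the third value missing.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, np.nan, 9.0, 11.0])
responded = ~np.isnan(y)

# Deliberately arbitrary response probabilities, standing in for a wrong response model.
wrong_propensity = np.array([0.9, 0.5, 0.7, 0.6, 0.8])

dr_estimate = doubly_robust_mean(x, y, responded, wrong_propensity)
print(dr_estimate)
```

In this toy case the outcome model is exactly right, so the residual term vanishes and the arbitrary response probabilities do no harm. When the outcome model is wrong but the response probabilities are right, the weighted residual term is what corrects the bias.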

Missing Value Imputation based on Locally Linear Reconstruction for Improving Classification Performance (분류 성능 향상을 위한 지역적 선형 재구축 기반 결측치 대치)

  • Kang, Pilsung
    • Journal of Korean Institute of Industrial Engineers / v.38 no.4 / pp.276-284 / 2012
  • Classification algorithms generally assume that the data are complete. However, missing values are common in real data sets for various reasons. In this paper, we propose using locally linear reconstruction (LLR) for missing value imputation to improve classification performance when missing values exist. We first investigate how much missing values degrade classification performance at various missing ratios. Then, we compare the proposed missing value imputation (LLR) with three well-known single imputation methods over three different classifiers using eight data sets. The experimental results showed that (1) every imputation method, even the very simple ones, helped to improve classification accuracy; (2) among the imputation methods, the proposed LLR imputation was the most effective over all missing ratios; and (3) when the missing ratio was relatively high, LLR was outstanding, and its classification accuracy was as high as that obtained from the complete data set.

Estimation Using Response Probability Under Callbacks

  • Park, Hyeon-Ah
    • Proceedings of the Korean Association for Survey Research Conference / 2007.11a / pp.213-230 / 2007
  • Although the response model has frequently been applied to nonresponse weighting adjustment and imputation, estimation under callbacks has been relatively underdeveloped within the response-model framework. An estimation method using the response probability is developed under callbacks. A replication method for estimating the variance of the proposed estimator is also developed. Since the true response probability is usually unknown, we study the estimation of the response probability. Finally, we propose an estimator under callbacks that uses ratio imputation as well as the response probability. A simulation study illustrates our techniques.


Comparisons of Imputation Methods for Wave Nonresponse in Panel Surveys (패널조사 웨이브 무응답의 대체방법 비교)

  • Kim, Kyu-Seong;Park, In-Ho
    • Survey Research / v.11 no.1 / pp.1-18 / 2010
  • We compare various imputation methods for compensating for wave nonresponse that are commonly adopted in panel surveys. Unlike a cross-sectional survey, a panel survey involves a time effect in nonresponse, in the sense that nonresponse may occur in some but not all waves. Thus, responses in neighboring waves can be used as powerful predictors for imputing wave nonresponse, as in the longitudinal regression, carry-over, nearest neighbor regression and row-column imputation methods. For comparison, we carry out a simulation study on income data from the Korean Welfare Panel Study based on two performance criteria: predictive accuracy and estimation accuracy. Our simulation shows that the ratio and row-column imputation methods are much more effective in terms of both criteria. The regression, longitudinal regression and carry-over imputation methods performed better in predictive accuracy but worse in estimation accuracy. On the other hand, nearest neighbor, nearest neighbor regression and hot-deck imputation show higher performance in estimation accuracy but lower predictive accuracy. Finally, mean imputation shows much lower performance on both criteria.


Fully Efficient Fractional Imputation for Incomplete Contingency Tables

  • Kang, Shin-Soo
    • Journal of the Korean Data and Information Science Society / v.15 no.4 / pp.993-1002 / 2004
  • Imputation procedures such as fully efficient fractional imputation (FEFI) or multiple imputation (MI) can be used to construct complete contingency tables from samples with partially classified responses. Variances of FEFI estimators of population proportions are derived. Simulation results, when data are missing completely at random, reveal that FEFI provides more efficient estimates of population proportions than either MI based on data augmentation or complete-case (CC) analysis, but neither FEFI nor MI provides an improvement over CC analysis with respect to the accuracy of estimation of some parameters for association between two variables, such as $\theta_{i+}\theta_{+j}-\theta_{ij}$ and the log odds ratio.


Estimation of Log-Odds Ratios for Incomplete $2{\times}2$ Tables with Covariates using FEFI

  • Kang, Shin-Soo;Bae, Je-Min
    • Journal of the Korean Data and Information Science Society / v.18 no.1 / pp.185-194 / 2007
  • Covariate information can be used in fully efficient fractional imputation (FEFI). A new method, FEFI with logistic regression, is proposed to construct complete contingency tables. The jackknife method is used to obtain standard errors of the log-odds ratio from the table completed by the new method. Simulation results, when the covariates carry more information about the categorical variables, reveal that the new method provides more efficient estimates of the log-odds ratio than either multiple imputation (MI) based on data augmentation or complete-case analysis.


Comparison of imputation methods for item nonresponses in a panel study (패널자료에서의 항목무응답 대체 방법 비교)

  • Lee, Hyejung;Song, Juwon
    • The Korean Journal of Applied Statistics / v.30 no.3 / pp.377-390 / 2017
  • When conducting a survey, item nonresponse occurs if the respondent does not respond to some items. Since analysis based only on completely observed data may cause biased results, imputation is often conducted to analyze data in its complete form. The panel study is a survey method that examines changes of responses over time. In panel studies, there has been a preference for using information from response values of previous waves when the imputation of item nonresponses is performed; however, limited research has been conducted to support this preference. Therefore, this study compares the performance of imputation methods according to whether or not information from previous waves is utilized in the panel study. Among imputation methods that utilize information from previous responses, we consider ratio imputation, imputation based on the linear mixed model, and imputation based on the Bayesian linear mixed model approach. We compare the results from these methods against the results of methods that do not use information from previous responses, such as mean imputation and hot deck imputation. Simulation results show that imputation based on the Bayesian linear mixed model performs best and yields small biases and high coverage rates of the 95% confidence interval even at higher nonresponse rates.
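The contrast between imputation that uses the previous wave and imputation that does not can be sketched in a few lines of Python. This is only an illustration on made-up numbers, not a reproduction of the study's simulation. The function name `impute_wave` and the data are hypothetical, and the previous wave is assumed to be fully observed.

```python
import numpy as np

def impute_wave(previous, current):
    """Return (ratio-imputed, mean-imputed) copies of the current wave."""
    missing = np.isnan(current)
    # Ratio imputation: scale each unit's previous-wave value by the
    # current/previous ratio observed among current-wave respondents.
    ratio = current[~missing].sum() / previous[~missing].sum()
    ratio_imputed = np.where(missing, ratio * previous, current)
    # Mean imputation: ignore the previous wave and fill in the respondent mean.
    mean_imputed = np.where(missing, current[~missing].mean(), current)
    return ratio_imputed, mean_imputed

# Toy panel (assumed): values grow by 10% between waves; the third unit is missing.
previous = np.array([100.0, 200.0, 300.0, 400.0])
current = np.array([110.0, 220.0, np.nan, 440.0])

ratio_imputed, mean_imputed = impute_wave(previous, current)
print(ratio_imputed[2], mean_imputed[2])
```

Ratio imputation keeps the unit's own level from the previous wave, whereas mean imputation pulls every missing unit toward the respondent average regardless of its history.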

A Comparison of BLS Non-Response Adjustment and Cross-Wave Regression Imputation Methods (BLS 무응답 보정법을 이용한 대체법과 이월대체법에 관한 연구)

  • Lee, Sang-Eun;Shin, Key-Il
    • The Korean Journal of Applied Statistics / v.23 no.5 / pp.909-921 / 2010
  • Cross-wave regression imputation and carry-over imputation are generally used in the analysis of panel data with missing values. The BLS non-response adjustment method has recently been found to have good statistical properties. In this paper we show that the BLS method can be regarded as an imputation method whose formula is similar to that of a ratio estimator. In addition, we show that carry-over imputation and BLS imputation are approximately the same under the assumption that the data follow a non-stationary process with drift. Small simulation studies and a real data analysis are performed. For the real data analysis, a monthly labor statistic (2007) is used.

Usage of auxiliary variable and neural network in doubly robust estimation

  • Park, Hyeonah;Park, Wonjun
    • Journal of the Korean Data and Information Science Society / v.24 no.3 / pp.659-667 / 2013
  • If either the regression model or the propensity model is correct, the unbiasedness of the estimator using doubly robust imputation can be guaranteed. When a neural network is used instead of a logistic regression model for the propensity model, the estimators using doubly robust imputation are approximately unbiased even when both assumed models fail. We also propose a doubly robust estimator of ratio form that uses population information on an auxiliary variable. We examine some properties of the proposed theory through limited simulations.