• Title/Summary/Keyword: non-ignorable missing data

Search Result 7, Processing Time 0.018 seconds

On statistical Computing via EM Algorithm in Logistic Linear Models Involving Non-ignorable Missing data

  • Jun, Yu-Na;Qian, Guoqi;Park, Jeong-Soo
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2005.11a
    • /
    • pp.181-186
    • /
    • 2005
  • Many data sets obtained from surveys or medical trials often include missing observations. When these data sets are analyzed, it is general to use only complete cases. However, it is possible to have big biases or involve inefficiency. In this paper, we consider a method for estimating parameters in logistic linear models involving non-ignorable missing data mechanism. A binomial response and normal exploratory model for the missing data are used. We fit the model using the EM algorithm. The E-step is derived by Metropolis-hastings algorithm to generate a sample for missing data and Monte-carlo technique, and the M-step is by Newton-Raphson to maximize likelihood function. Asymptotic variances of the MLE's are derived and the standard error and estimates of parameters are compared.

  • PDF

Methods for Handling Incomplete Repeated Measures Data (불완전한 반복측정 자료의 보정방법)

  • Woo, Hae-Bong;Yoon, In-Jin
    • Survey Research
    • /
    • v.9 no.2
    • /
    • pp.1-27
    • /
    • 2008
  • Problems of incomplete data are pervasive in statistical analysis. In particular, incomplete data have been an important challenge in repeated measures studies. The objective of this study is to give a brief introduction to missing data mechanisms and conventional/recent missing data methods and to assess the performance of various missing data methods under ignorable and non-ignorable missingness mechanisms. Given the inadequate attention to longitudinal studies with missing data, this study applied recent advances in missing data methods to repeated measures models and investigated the performance of various missing data methods, such as FIML (Full Information Maximum Likelihood Estimation) and MICE(Multivariate Imputation by Chained Equations), under MCAR, MAR, and MNAR mechanisms. Overall, the results showed that listwise deletion and mean imputation performed poorly compared to other recommended missing data procedures. The better performance of EM, FIML, and MICE was more noticeable under MAR compared to MCAR. With the non-ignorable missing data, this study showed that missing data methods did not perform well. In particular, this problem was noticeable in slope-related estimates. Therefore, this study suggests that if missing data are suspected to be non-ignorable, developmental research may underestimate true rates of change over the life course. This study also suggests that bias from non-ignorable missing data can be substantially reduced by considering rich information from variables related to missingness.

  • PDF

EXTENSION OF FACTORING LIKELIHOOD APPROACH TO NON-MONOTONE MISSING DATA

  • Kim, Jae-Kwang
    • Journal of the Korean Statistical Society
    • /
    • v.33 no.4
    • /
    • pp.401-410
    • /
    • 2004
  • We address the problem of parameter estimation in multivariate distributions under ignorable non-monotone missing data. The factoring likelihood method for monotone missing data, termed by Rubin (1974), is extended to a more general case of non-monotone missing data. The proposed method is algebraically equivalent to the Newton-Raphson method for the observed likelihood, but avoids the burden of computing the first and the second partial derivatives of the observed likelihood. Instead, the maximum likelihood estimates and their information matrices for each partition of the data set are computed separately and combined naturally using the generalized least squares method.

Partitioning likelihood method in the analysis of non-monotone missing data

  • Kim Jae-Kwang
    • Proceedings of the Korean Statistical Society Conference
    • /
    • 2004.11a
    • /
    • pp.1-8
    • /
    • 2004
  • We address the problem of parameter estimation in multivariate distributions under ignorable non-monotone missing data. The factoring likelihood method for monotone missing data, termed by Robin (1974), is extended to a more general case of non-monotone missing data. The proposed method is algebraically equivalent to the Newton-Raphson method for the observed likelihood, but avoids the burden of computing the first and the second partial derivatives of the observed likelihood Instead, the maximum likelihood estimates and their information matrices for each partition of the data set are computed separately and combined naturally using the generalized least squares method. A numerical example is also presented to illustrate the method.

  • PDF

Model selection method for categorical data with non-response (무응답을 가지고 있는 범주형 자료에 대한 모형 선택 방법)

  • Yoon, Yong-Hwa;Choi, Bo-Seung
    • Journal of the Korean Data and Information Science Society
    • /
    • v.23 no.4
    • /
    • pp.627-641
    • /
    • 2012
  • We consider a model estimation and model selection methods for the multi-way contingency table data with non-response or missing values. We also consider hierarchical Bayesian model in order to handle a boundary solution problem that can happen in the maximum likelihood estimation under non-ignorable non-response model and we deal with a model selection method to find the best model for the data. We utilized Bayes factors to handle model selection problem under Bayesian approach. We applied proposed method to the pre-election survey for the 2004 Korean National Assembly race. As a result, we got the non-ignorable non-response model was favored and the variable of voting intention was most suitable.

A joint modeling of longitudinal zero-inflated count data and time to event data (경시적 영과잉 가산자료와 생존자료의 결합모형)

  • Kim, Donguk;Chun, Jihun
    • The Korean Journal of Applied Statistics
    • /
    • v.29 no.7
    • /
    • pp.1459-1473
    • /
    • 2016
  • Both longitudinal data and survival data are collected simultaneously in longitudinal data which are observed throughout the passage of time. In this case, the effect of the independent variable becomes biased (provided that sole use of longitudinal data analysis does not consider the relation between both data used) if the missing that occurred in the longitudinal data is non-ignorable because it is caused by a correlation with the survival data. A joint model of longitudinal data and survival data was studied as a solution for such problem in order to obtain an unbiased result by considering the survival model for the cause of missing. In this paper, a joint model of the longitudinal zero-inflated count data and survival data is studied by replacing the longitudinal part with zero-inflated count data. A hurdle model and proportional hazards model were used for each longitudinal zero inflated count data and survival data; in addition, both sub-models were linked based on the assumption that the random effect of sub-models follow the multivariate normal distribution. We used the EM algorithm for the maximum likelihood estimator of parameters and estimated standard errors of parameters were calculated using the profile likelihood method. In simulation, we observed a better performance of the joint model in bias and coverage probability compared to the separate model.

An estimation method for non-response model using Monte-Carlo expectation-maximization algorithm (Monte-Carlo expectation-maximaization 방법을 이용한 무응답 모형 추정방법)

  • Choi, Boseung;You, Hyeon Sang;Yoon, Yong Hwa
    • Journal of the Korean Data and Information Science Society
    • /
    • v.27 no.3
    • /
    • pp.587-598
    • /
    • 2016
  • In predicting an outcome of election using a variety of methods ahead of the election, non-response is one of the major issues. Therefore, to address the non-response issue, a variety of methods of non-response imputation may be employed, but the result of forecasting tend to vary according to methods. In this study, in order to improve electoral forecasts, we studied a model based method of non-response imputation attempting to apply the Monte Carlo Expectation Maximization (MCEM) algorithm, introduced by Wei and Tanner (1990). The MCEM algorithm using maximum likelihood estimates (MLEs) is applied to solve the boundary solution problem under the non-ignorable non-response mechanism. We performed the simulation studies to compare estimation performance among MCEM, maximum likelihood estimation, and Bayesian estimation method. The results of simulation studies showed that MCEM method can be a reasonable candidate for non-response model estimation. We also applied MCEM method to the Korean presidential election exit poll data of 2012 and investigated prediction performance using modified within precinct error (MWPE) criterion (Bautista et al., 2007).