Comparison of missing data methods in clustered survival data using Bayesian adaptive B-Spline estimation

  • Yoo, Hanna (Department of Computer Software, Busan University of Foreign Studies)
  • Lee, Jae Won (Department of Statistics, Korea University)
  • Received : 2017.09.13
  • Accepted : 2017.12.22
  • Published : 2018.03.31

Abstract

In many epidemiological studies, outcome values are missing because of censoring. Censoring is what distinguishes survival analysis from other analytical methods, and many methods exist for handling censored data. However, few studies have dealt with missing covariates in survival data, and such studies are rarer still when the data are clustered. In this paper, we conducted a simulation study to compare several missing data methods when the data have a clustered, multi-structured form with missing covariates. We modeled the unknown baseline hazard and frailty with Bayesian B-splines to obtain smoother and more accurate estimates, and we used prior information to further improve accuracy. We assumed that the missing data mechanism was missing at random (MAR). We compared the performance of five missing data techniques through simulation studies. We also present results from a multi-center study of Korean inflammatory bowel disease (IBD) patients with Crohn's disease (Lee et al., Journal of the Korean Society of Coloproctology, 28, 188-194, 2012).
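As a rough illustration of the simulation setting described above, the following Python sketch generates clustered survival data with a shared frailty and a covariate that is missing at random. It is not the authors' simulation design. The gamma frailty, the constant baseline hazard, the exponential censoring, the missingness probabilities, and all function and parameter names are assumptions made only for this example.

```python
import math
import random


def simulate_clustered_mar(n_clusters=50, cluster_size=10, beta=(0.5, -0.5),
                           frailty_var=0.5, seed=0):
    """Simulate clustered survival data with one MAR-missing covariate.

    All distributional choices here are illustrative assumptions.
    """
    rng = random.Random(seed)
    rows = []
    for cluster in range(n_clusters):
        # Shared gamma frailty with mean 1 and variance frailty_var.
        w = rng.gammavariate(1.0 / frailty_var, frailty_var)
        for _ in range(cluster_size):
            z = int(rng.random() < 0.5)   # fully observed binary covariate
            x = rng.gauss(0.0, 1.0)       # covariate subject to missingness
            # Proportional hazards with a constant baseline hazard of 1.
            rate = w * math.exp(beta[0] * x + beta[1] * z)
            event_time = rng.expovariate(rate)
            censor_time = rng.expovariate(0.3)  # independent censoring
            # MAR: the chance that x is missing depends only on observed z.
            p_miss = 0.4 if z == 1 else 0.1
            x_obs = None if rng.random() < p_miss else x
            rows.append({
                "cluster": cluster,
                "time": min(event_time, censor_time),
                "event": int(event_time <= censor_time),
                "z": z,
                "x": x_obs,
            })
    return rows


def missing_rate_by_z(rows):
    """Return the fraction of missing x values within each level of z."""
    rates = {}
    for level in (0, 1):
        group = [r for r in rows if r["z"] == level]
        rates[level] = sum(r["x"] is None for r in group) / len(group)
    return rates


rows = simulate_clustered_mar()
complete_cases = [r for r in rows if r["x"] is not None]
rates = missing_rate_by_z(rows)
```

Here `complete_cases` is the subset that a complete-case analysis would keep, and `rates` shows that missingness differs by the observed covariate `z`, which is the defining feature of MAR as opposed to missingness that is completely at random.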

References

  1. Brand JPL (1999). Development, Implementation and Evaluation of Multiple Imputation Strategies for the Statistical Analysis of Incomplete Data Sets, Erasmus University, Rotterdam.
  2. Chen HY and Little RJA (1999). Proportional hazards regression with missing covariates. Journal of the American Statistical Association, 94, 896-908. https://doi.org/10.1080/01621459.1999.10474195
  3. Clayton DG (1978). A model for association in bivariate life tables and its application in epidemiological studies of familial tendency in chronic disease incidence. Biometrika, 65, 141-152. https://doi.org/10.1093/biomet/65.1.141
  4. Harrell FE Jr (2010). Hmisc: Miscellaneous library for R statistical software. R package version 3.9-0.
  5. Heitjan DF and Little RJA (1991). Multiple imputation for the fatal accident reporting system. Journal of the Royal Statistical Society. Series C (Applied Statistics), 40, 13-29.
  6. Lee KY, Yu CS, Lee KY, Cho YB, Park KJ, Choi GS, Yoon SN, and Yoo H (2012). Risk factors for repeat abdominal surgery in Korean patients with Crohn’s disease: a multiple-center study of a Korean inflammatory bowel disease study group. Journal of the Korean Society of Coloproctology, 28, 188-194. https://doi.org/10.3393/jksc.2012.28.4.188
  7. Lipsitz SR and Ibrahim JG (1996). Using the EM algorithm for survival data with incomplete categorical covariates. Lifetime Data Analysis, 2, 5-14. https://doi.org/10.1007/BF00128467
  8. Lipsitz SR and Ibrahim JG (2000). Estimation with correlated censored survival data with missing covariates. Biostatistics, 1, 315-327. https://doi.org/10.1093/biostatistics/1.3.315
  9. Marshall A, Altman DG, Royston P, and Holder RL (2010). Comparison of techniques for handling missing covariate data within prognostic modeling studies: a simulation study. BMC Medical Research Methodology, 10.
  10. Rubin DB (1987). Multiple Imputation for Nonresponse in Surveys, John Wiley & Sons, New York.
  11. Schenker N and Taylor JMG (1996). Partially parametric techniques for multiple imputation. Computational Statistics & Data Analysis, 22, 425-446.
  12. Sharef E, Strawderman RL, Ruppert D, Cowen M, and Halasyamani L (2010). Bayesian adaptive B-spline estimation in proportional hazard frailty models. Electronic Journal of Statistics, 4, 606-642. https://doi.org/10.1214/10-EJS566
  13. Spiegelhalter DJ, Best NG, Carlin BP, and Van der Linde A (2002). Bayesian measures of model complexity and fit. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 64, 583-639. https://doi.org/10.1111/1467-9868.00353
  14. van Buuren S (2007). Multiple imputation of discrete and continuous data by fully conditional specification. Statistical Methods in Medical Research, 16, 219-242. https://doi.org/10.1177/0962280206074463
  15. van Buuren S, Boshuizen HC, and Knook DL (1999). Multiple imputation of missing blood pressure covariates in survival analysis. Statistics in Medicine, 18, 681-694. https://doi.org/10.1002/(SICI)1097-0258(19990330)18:6<681::AID-SIM71>3.0.CO;2-R
  16. Vaupel JW, Manton KG, and Stallard E (1979). The impact of heterogeneity in individual frailty on the dynamics of mortality. Demography, 16, 439-454. https://doi.org/10.2307/2061224
  17. Zhou H and Pepe MS (1995). Auxiliary covariate data in failure time regression. Biometrika, 82, 139-149.