
http://dx.doi.org/10.29220/CSAM.2018.25.4.385

Multiple imputation for competing risks survival data via pseudo-observations

Han, Seungbong (Department of Applied Statistics, Gachon University)
Andrei, Adin-Cristian (Department of Preventive Medicine, Northwestern University)
Tsui, Kam-Wah (Department of Statistics, University of Wisconsin-Madison)

Publication Information

Communications for Statistical Applications and Methods, v.25, no.4, 2018, pp. 385-396

Abstract

Competing risks are commonly encountered in biomedical research. Regression models for competing risks data can be developed from data routinely collected in hospitals or general practices. However, these data sets usually contain missing covariate values. To overcome this problem, multiple imputation is often used to fit regression models under a missing at random (MAR) assumption. Here, we introduce a multivariate imputation by chained equations algorithm to deal with competing risks survival data. Using pseudo-observations, we make use of the available outcome information while accommodating the competing risks structure. Lastly, we illustrate the practical advantages of our approach using simulations and two data examples, one from a coronary artery disease data set and one from a hepatocellular carcinoma data set.
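To make the pseudo-observation idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes the standard jackknife construction, in which the pseudo-observation for subject i at a fixed time t0 is n times the full-sample estimate of the cumulative incidence minus (n - 1) times the estimate with subject i left out, and it uses the Aalen-Johansen estimator for the cumulative incidence. The function names, the coding of status (0 for censored, positive integers for causes), and the choice of a single time point t0 are assumptions made for illustration only.

```python
import numpy as np


def cumulative_incidence(time, status, cause, t0):
    """Aalen-Johansen estimate of P(T <= t0, event of type `cause`).

    Assumed coding: status 0 means censored, 1, 2, ... are event causes.
    """
    time = np.asarray(time, dtype=float)
    status = np.asarray(status)
    surv = 1.0  # all-cause Kaplan-Meier survival just before the current time
    cif = 0.0
    # Loop over the distinct observed event times up to t0.
    for t in np.unique(time[(status != 0) & (time <= t0)]):
        at_risk = np.sum(time >= t)
        d_cause = np.sum((time == t) & (status == cause))
        d_any = np.sum((time == t) & (status != 0))
        cif += surv * d_cause / at_risk
        surv *= 1.0 - d_any / at_risk
    return cif


def pseudo_observations(time, status, cause, t0):
    """Jackknife pseudo-observations of the cumulative incidence at t0."""
    time = np.asarray(time, dtype=float)
    status = np.asarray(status)
    n = len(time)
    full = cumulative_incidence(time, status, cause, t0)
    keep = ~np.eye(n, dtype=bool)  # row i drops subject i
    leave_one_out = np.array([
        cumulative_incidence(time[keep[i]], status[keep[i]], cause, t0)
        for i in range(n)
    ])
    return n * full - (n - 1) * leave_one_out
```

Because each subject receives a pseudo-observation whether or not the event time was censored, these values can serve as a complete outcome variable, for example as a predictor in the imputation model for a covariate with missing values. Without censoring they reduce to the indicator that the subject had the event of interest by t0.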

Keywords

competing risks; missing data; multiple imputation; pseudo-observations; random forest
