• Title/Abstract/Keywords: missing covariate


A modified estimating equation for a binary time varying covariate with an interval censored changing time

  • Kim, Yang-Jin
    • Communications for Statistical Applications and Methods / Vol. 23, No. 4 / pp. 335-341 / 2016
  • Interval censored failure time data often occur in observational studies where subjects are followed periodically. Instead of the exact failure time, only two inspection times that bracket it are available. Several methods have been suggested to analyze interval censored failure time data (Sun, 2006). In this article, we are concerned with a binary time-varying covariate whose changing time is interval censored. A modified estimating equation is proposed by extending an approach suggested for the case of a missing covariate. Based on simulation results, the proposed method performs better than other simple imputation methods. The ACTG 181 dataset was analyzed as a real example.

Application of Multiple Imputation Method in Analyzing Data with Missing Continuous Covariates

  • Ghasemizadeh Tamar, S.;Ganjali, M.
    • 응용통계연구 (The Korean Journal of Applied Statistics) / Vol. 21, No. 4 / pp. 659-664 / 2008
  • Missing continuous covariates are pervasive when generalized linear models are applied to medical data. Multiple imputation is the most common and easiest method of dealing with missing covariate data. However, serious caveats apply to its use, and care is needed to make the imputed values proper. In this paper, proper imputation from the posterior predictive distribution is developed for use with arbitrary priors. We use the empirical distribution of the posterior to approximate the posterior predictive distribution and sample from it. This method is preferable to an imputation method we presented previously, which uses a full model to impute missing values with available software. The proposed methods are applied to glucocorticoid data.
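For readers unfamiliar with how results from multiply imputed datasets are usually combined, the following is a minimal sketch of Rubin's rules. It is not the posterior predictive procedure developed in the paper; the function name `pool_rubin` and the numbers are illustrative assumptions.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Combine per-imputation estimates with Rubin's rules.

    estimates: point estimates, one per imputed dataset
    variances: squared standard errors, one per imputed dataset
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    q_bar = mean(estimates)        # pooled point estimate
    u_bar = mean(variances)        # within-imputation variance
    b = variance(estimates)        # between-imputation variance (divisor m - 1)
    total = u_bar + (1 + 1 / m) * b
    return q_bar, total

# Hypothetical coefficient estimates from m = 3 imputed datasets.
pooled_est, pooled_var = pool_rubin([0.50, 0.54, 0.46], [0.010, 0.012, 0.011])
```

The between-imputation term is what separates multiple imputation from single imputation: it carries the uncertainty about the missing values into the final variance.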

대체방법별 GEE추정량 비교 (Comparison of GEE Estimators Using Imputation Methods)

  • 김동욱;노영화
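    • 응용통계연구 (The Korean Journal of Applied Statistics) / Vol. 16, No. 2 / pp. 407-426 / 2003
  • This study compares the performance of imputation methods for missing values in generalized estimating equation (GEE) models for categorical repeated-measures data. When a covariate X is partially missing, the GEE estimator cannot be computed. We consider seven imputation methods for filling in missing values in a GEE model when a time-varying covariate has missing values, and study the properties of the resulting GEE estimators using real data and simulation. To compare the methods, in a repeated-measures model with a categorical response we compare the GEE estimator from the complete data with the GEE estimators obtained after generating missing values in the complete data and filling them in with each imputation method. The methods considered are (1) listwise deletion, (2) sample mean imputation, (3) row mean imputation, (4) cross-sectional regression imputation, (5) carry-forward imputation, (6) the Bayesian bootstrap, and (7) the approximate Bayesian bootstrap. The missing-data mechanism is assumed to be ignorable nonresponse, and missing values are generated either by following the wave nonresponse pattern of the original data or by simple random sampling without regard to that pattern.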
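Three of the simpler methods in this list can be sketched in a few lines. The sketch below assumes one row per subject and one column per wave, with `None` marking a missing covariate value. Reading "sample mean" as the mean over subjects at the same wave and "row mean" as the subject's own mean over waves is an interpretation, not something the abstract specifies, and the function names are hypothetical.

```python
def carry_forward(row):
    """Carry the last observed value forward; leading gaps stay None."""
    out, last = [], None
    for v in row:
        if v is not None:
            last = v
        out.append(last)
    return out

def row_mean(row):
    """Fill gaps with the subject's own mean (needs one observed value)."""
    observed = [v for v in row if v is not None]
    m = sum(observed) / len(observed)
    return [m if v is None else v for v in row]

def sample_mean(data):
    """Fill gaps with the mean over subjects observed at the same wave."""
    means = []
    for t in range(len(data[0])):
        observed = [r[t] for r in data if r[t] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[t] if v is None else v for t, v in enumerate(r)]
            for r in data]

# Hypothetical covariate values for three subjects over three waves.
data = [[1.0, None, 3.0],
        [2.0, 4.0, None],
        [3.0, 5.0, 6.0]]
filled_by_wave = sample_mean(data)
```

Each of these produces a single completed dataset, so a GEE fitted afterwards treats imputed values as if they had been observed.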

Multiple imputation for competing risks survival data via pseudo-observations

  • Han, Seungbong;Andrei, Adin-Cristian;Tsui, Kam-Wah
    • Communications for Statistical Applications and Methods / Vol. 25, No. 4 / pp. 385-396 / 2018
  • Competing risks are commonly encountered in biomedical research. Regression models for competing risks data can be developed based on data routinely collected in hospitals or general practices. However, these data sets usually contain missing covariate values. To overcome this problem, multiple imputation is often used to fit regression models under a MAR assumption. Here, we introduce a multivariate imputation by chained equations algorithm to deal with competing risks survival data. Using pseudo-observations, we make use of the available outcome information by accommodating the competing risk structure. Lastly, we illustrate the practical advantages of our approach using simulations and two data examples: a coronary artery disease dataset and a hepatocellular carcinoma dataset.

Survival Analysis of Gastric Cancer Patients with Incomplete Data

  • Moghimbeigi, Abbas;Tapak, Lily;Roshanaei, Ghodaratolla;Mahjub, Hossein
    • Journal of Gastric Cancer / Vol. 14, No. 4 / pp. 259-265 / 2014
  • Purpose: Survival analysis of gastric cancer patients requires knowledge about factors that affect survival time. This paper attempted to analyze the survival of patients with incomplete registered data by using imputation methods. Materials and Methods: Three missing data imputation methods, namely regression, the expectation maximization algorithm, and multiple imputation (MI) using Markov chain Monte Carlo methods, were applied to the data of cancer patients referred to the cancer institute at Imam Khomeini Hospital in Tehran from 2003 to 2008. The data included demographic variables, survival times, and censoring status of 471 patients with gastric cancer. After using imputation methods to account for missing covariate data, the data were analyzed using a Cox regression model and the results were compared. Results: The mean patient survival time after diagnosis was $49.1 \pm 4.4$ months. In the complete case analysis, which used information from 100 of the 471 patients, very wide and uninformative confidence intervals were obtained for the chemotherapy and surgery hazard ratios (HRs). However, after imputation, the maximum confidence interval widths for the chemotherapy and surgery HRs were 8.470 and 0.806, respectively. The minimum width corresponded to MI. Furthermore, the minimum Bayesian and Akaike information criteria values corresponded to MI (-821.236 and -827.866, respectively). Conclusions: Missing value imputation increased the precision and accuracy of the estimates. In addition, MI yielded better results than the expectation maximization algorithm and simple regression imputation.

누락된 공변량을 가진 원인별 비례위험모형의 분석 (Analysis of the cause-specific proportional hazards model with missing covariates)

  • 이민정
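    • 응용통계연구 (The Korean Journal of Applied Statistics) / Vol. 37, No. 2 / pp. 225-237 / 2024
  • In competing risks data, some covariates may be unobserved for a subset of study subjects. In that case, excluding subjects with missing covariate values from the analysis can lead to biased estimates and a loss of efficiency. This paper studies multiple imputation and augmented inverse probability weighting methods for estimating the regression parameters of the cause-specific proportional hazards model with missing covariates. Simulation studies evaluating the estimators obtained by the two methods confirmed that both perform well. Both methods were applied to breast cancer data with missing tumor size values, provided by the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial of the U.S. National Cancer Institute, to identify factors that significantly affect the hazard of cancer death and the hazard of death from other causes. Fitting the cause-specific proportional hazards model by both methods showed that race, marital status, stage, grade, and tumor size were significant factors for the hazard of breast cancer death, with stage having the largest effect in raising that hazard. Age at diagnosis and tumor size were significant factors raising the hazard of death from other causes.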

표본조사에서 무응답 가중치 조정층 구성방법에 따른 효과 (Forming Weighting Adjustment Cells for Unit-Nonresponse in Sample Surveys)

  • 김영원;남시주
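    • Communications for Statistical Applications and Methods / Vol. 16, No. 1 / pp. 103-113 / 2009
  • In sample surveys, nonresponse is one of the major sources of non-sampling error. When unit nonresponse occurs, it is common to form unit-nonresponse adjustment cells and apply nonresponse weighting adjustments, in order to reduce nonresponse bias while improving the precision of estimates. This study reviews existing theory on forming nonresponse adjustment cells and, through an empirical simulation using Census of Fisheries data, examines how to form the cells effectively. The simulation shows that forming cells by predicted mean is more efficient than forming them by response propensity. Moreover, for robust cells that can also be applied to other variables of interest, forming cells using both response propensity and predicted mean is more effective than using predicted mean alone. We also examined whether design weights need to be applied when computing the response rates used for the nonresponse adjustment, and found that applying them or not has almost no effect on the estimates.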
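The weighting adjustment itself is a short computation once the cells are fixed. The sketch below assumes the cells have already been formed (by response propensity, predicted mean, or both) and uses the design-weighted version of the adjustment; the function name and the sample units are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict

def adjust_weights(units):
    """Nonresponse weighting adjustment within cells.

    units: list of (cell, design_weight, responded) tuples.
    Each respondent's weight is multiplied by the inverse of the
    weighted response rate in its cell. Assumes every cell contains
    at least one respondent.
    """
    total = defaultdict(float)  # design weight of all sampled units per cell
    resp = defaultdict(float)   # design weight of respondents per cell
    for cell, w, responded in units:
        total[cell] += w
        if responded:
            resp[cell] += w
    return [(cell, w * total[cell] / resp[cell])
            for cell, w, responded in units if responded]

# Hypothetical sample: two adjustment cells, A and B.
sample = [("A", 10.0, True), ("A", 10.0, False),
          ("B", 20.0, True), ("B", 20.0, True), ("B", 20.0, False)]
adjusted = adjust_weights(sample)
```

After the adjustment, the respondents' weights in each cell add up to the design weight of the full sample in that cell, so respondents stand in for the nonrespondents who share their cell.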

대기오염 노출이 첫 출산아 저체중에 미치는 영향에 관한 연구 -서울지역 1999년~2003년 출생코호트를 중심으로- (Air Pollution Exposure and Low Birth Weight of Firstborn Fetus -A Birth Cohort Study in Seoul, 1999-2003-)

  • 조용성;손지영;이종태
    • 한국환경보건학회지 / Vol. 33, No. 4 / pp. 227-234 / 2007
  • Recent epidemiologic studies show that gestational exposure to air pollution adversely affects pregnancy outcomes, including low birth weight in preterm birth. In this study, we evaluated the effect of air pollutants on low birth weight (LBW) in firstborn infants throughout the gestational period, using the 1999-2003 birth cohort in Seoul. Using birth cohort data from the National Statistics Office of Korea, we identified 288,346 firstborn births during 1999 to 2003 with complete covariate data (gender, parity, date of birth, gestational age, parental age and educational level, maternal occupation, etc.) and maternal residential history data, after excluding from a total of 316,451 births those lacking birth weight information or showing discordance between residential and certificated addresses. Our subjects were defined as births at more than 37 weeks and less than 44 weeks of completed gestation, and we identified 5,457 infants (1.89%) with low birth weight (<2.5 kg). Using logistic regression, we estimated the risk associated with mean air pollution concentrations (over the entire pregnancy and by trimester) for CO, $O_3$, $PM_{10}$, $NO_2$, and $SO_2$. In terms of trimester-specific exposure, we found that exposure to some air pollutants in each trimester increased the risk of LBW. Results also showed that the effect size of air pollutant exposure was higher during the first and third trimesters than during the second trimester. Across all trimesters, the estimated risk of LBW was 1.831 (95% CI=1.573-2.132) per unit increase for CO, 1.139 (95% CI=1.107-1.172) for 50, and 1.009 (95% CI=1.001-1.017) for $O_3$. Our results suggest that exposure during gestation to relatively low levels of some air pollutants may be associated with a reduction in birth weight in firstborn infants. These findings imply that effective risk management strategies should be applied to minimize the public health impact on pregnant women.

Prognostic Factor Analysis of Overall Survival in Gastric Cancer from Two Phase III Studies of Second-line Ramucirumab (REGARD and RAINBOW) Using Pooled Patient Data

  • Fuchs, Charles S.;Muro, Kei;Tomasek, Jiri;Van Cutsem, Eric;Cho, Jae Yong;Oh, Sang-Cheul;Safran, Howard;Bodoky, Gyorgy;Chau, Ian;Shimada, Yasuhiro;Al-Batran, Salah-Eddin;Passalacqua, Rodolfo;Ohtsu, Atsushi;Emig, Michael;Ferry, David;Chandrawansa, Kumari;Hsu, Yanzhi;Sashegyi, Andreas;Liepa, Astra M.;Wilke, Hansjochen
    • Journal of Gastric Cancer / Vol. 17, No. 2 / pp. 132-144 / 2017
  • Purpose: To identify baseline prognostic factors for survival in patients with disease progression, during or after chemotherapy for the treatment of advanced gastric or gastroesophageal junction (GEJ) cancer. Materials and Methods: We pooled data from patients randomized between 2009 and 2012 in 2 phase III, global double-blind studies of ramucirumab for the treatment of advanced gastric or GEJ adenocarcinoma following disease progression on first-line platinum- and/or fluoropyrimidine-containing therapy (REGARD and RAINBOW). Forty-one key baseline clinical and laboratory factors common in both studies were examined. Model building started with covariate screening using univariate Cox models (significance level=0.05). A stepwise multivariable Cox model identified the final prognostic factors (entry+exit significance level=0.01). Cox models were stratified by treatment and geographic region. The process was repeated to identify baseline prognostic quality of life (QoL) parameters. Results: Of 1,020 randomized patients, 953 (93%) patients without any missing covariates were included in the analysis. We identified 12 independent prognostic factors of poor survival: 1) peritoneal metastases; 2) Eastern Cooperative Oncology Group (ECOG) performance score 1; 3) the presence of a primary tumor; 4) time to progression since prior therapy <6 months; 5) poor/unknown tumor differentiation; abnormally low blood levels of 6) albumin, 7) sodium, and/or 8) lymphocytes; and abnormally high blood levels of 9) neutrophils, 10) aspartate aminotransferase (AST), 11) alkaline phosphatase (ALP), and/or 12) lactate dehydrogenase (LDH). Factors were used to devise a 4-tier prognostic index (median overall survival [OS] by risk [months]: high=3.4, moderate=6.4, medium=9.9, and low=14.5; Harrell's C-index=0.66; 95% confidence interval [CI], 0.64-0.68). Addition of QoL to the model identified patient-reported appetite loss as an independent prognostic factor. 
    Conclusions: The identified prognostic factors and the reported prognostic index may help clinical decision-making, patient stratification, and planning of future clinical studies.
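The abstract above reports Harrell's C-index as its measure of discrimination. As a rough illustration of what that statistic counts, here is a simplified sketch that ignores tied survival times. The function name and the toy data are assumptions, and this is not the software used in the study.

```python
def harrell_c(times, events, risk):
    """Simplified Harrell's C-index (tied survival times are skipped).

    times:  follow-up times
    events: 1 if death was observed, 0 if censored
    risk:   risk scores, where a higher score predicts shorter survival
    A pair (i, j) is usable when subject i had an observed event before
    time j. It is concordant when subject i also has the higher score.
    """
    concordant, usable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] and times[i] < times[j]:
                usable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied scores count as half
    return concordant / usable

# Hypothetical data for four patients (the third is censored).
c = harrell_c(times=[2, 4, 6, 8], events=[1, 1, 0, 1],
              risk=[3.0, 1.0, 2.0, 0.5])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.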