• Title/Summary/Keyword: covariates


Bayesian Variable Selection in the Proportional Hazard Model with Application to DNA Microarray Data

  • Lee, Kyeon-Eun;Mallick, Bani K.
    • Proceedings of the Korean Society for Bioinformatics Conference / 2005.09a / pp.357-360 / 2005
  • In this paper we consider the well-known semiparametric proportional hazards (PH) models for survival analysis. These models are usually applied with few covariates and many observations (subjects). In a typical gene expression study based on DNA microarrays, however, the number of covariates p exceeds the number of samples n. Given a vector of responses, which are times to event (death or censoring times), and p gene expressions (covariates), we address how to reduce the dimension by selecting the significant genes. This approach enables us to estimate the survival curve when n << p. Rather than fixing the number of selected genes, we assign a prior distribution to this number. The approach creates additional flexibility by allowing the imposition of constraints, such as bounding the dimension via a prior, which in effect works as a penalty. To implement our methodology, we use a Markov chain Monte Carlo (MCMC) method. We demonstrate the methodology on diffuse large B-cell lymphoma (DLBCL) complementary DNA (cDNA) data.


Estimation of Log-Odds Ratios for Incomplete $2{\times}2$ Tables with Covariates using FEFI

  • Kang, Shin-Soo;Bae, Je-Min
    • Journal of the Korean Data and Information Science Society / v.18 no.1 / pp.185-194 / 2007
  • Covariate information can be used to perform fully efficient fractional imputation (FEFI). A new method, FEFI with logistic regression, is proposed to construct complete contingency tables. The jackknife method is used to obtain standard errors of the log-odds ratio from the table completed by the new method. Simulation results show that, when the covariates carry more information about the categorical variables, the new method provides more efficient estimates of the log-odds ratio than either multiple imputation (MI) based on data augmentation or complete case analysis.

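
As a rough illustration of the quantities named in this abstract, the sketch below computes the log-odds ratio of an already complete 2×2 table and a delete-one jackknife standard error for it. It does not implement FEFI or the logistic-regression imputation step. The function names and the example counts are hypothetical, and every cell is assumed to hold at least two observations.

```python
import math

def log_odds_ratio(table):
    """Log-odds ratio log(a*d / (b*c)) of a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return math.log(a * d) - math.log(b * c)

def jackknife_se(table):
    """Delete-one jackknife standard error of the log-odds ratio.

    Removing any single observation from a given cell yields the same
    replicate, so only four distinct replicates are computed and each is
    weighted by the number of observations in its cell.
    """
    n = sum(sum(row) for row in table)
    replicates = []  # pairs of (replicate estimate, cell count)
    for i in range(2):
        for j in range(2):
            reduced = [list(row) for row in table]
            reduced[i][j] -= 1
            replicates.append((log_odds_ratio(reduced), table[i][j]))
    mean = sum(est * w for est, w in replicates) / n
    var = (n - 1) / n * sum(w * (est - mean) ** 2 for est, w in replicates)
    return math.sqrt(var)

# Hypothetical completed table (illustrative counts only).
table = [[30, 20], [15, 35]]
lor = log_odds_ratio(table)
se = jackknife_se(table)
print(f"log-odds ratio = {lor:.4f}, jackknife SE = {se:.4f}")
```

For moderately large cell counts the jackknife value should be close to the familiar large-sample standard error, the square root of the sum of reciprocal cell counts.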

Semiparametric support vector machine for accelerated failure time model

  • Hwang, Chang-Ha;Shim, Joo-Yong
    • Journal of the Korean Data and Information Science Society / v.21 no.4 / pp.765-775 / 2010
  • Much effort has been devoted to developing effective estimation methods for the accelerated failure time (AFT) model. The AFT model assumes a linear relationship between the logarithm of the event time and the covariates. In this paper we propose a semiparametric support vector machine for situations where the functional form of the effect of one or more covariates is unknown. The proposed estimating equation can be solved by quadratic programming and a linear equation. We study the effect of several covariates on a censored response variable with an unknown probability distribution. We also provide a generalized approximate cross-validation method for choosing the hyper-parameters, which affect the performance of the proposed approach. The proposed method is evaluated through simulations using an artificial example.
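
The AFT assumption stated above, that the log event time is linear in the covariates, can be sketched with a small simulation. This is not the semiparametric support vector machine proposed in the paper: the coefficients, noise level, and sample size are hypothetical, there is no censoring, and ordinary least squares stands in for the estimator (it would not be valid for censored data).

```python
import math
import random

random.seed(0)

# Hypothetical true parameters of log(T) = beta0 + beta1 * x + sigma * eps.
beta0, beta1, sigma = 1.0, 0.8, 0.5

x = [random.uniform(0.0, 1.0) for _ in range(2000)]
event_time = [
    math.exp(beta0 + beta1 * xi + sigma * random.gauss(0.0, 1.0)) for xi in x
]

def ols(x, y):
    """Intercept and slope of a simple least-squares line."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Regress the log event times on the covariate (uncensored data only).
intercept_hat, slope_hat = ols(x, [math.log(t) for t in event_time])
print(f"intercept = {intercept_hat:.3f}, slope = {slope_hat:.3f}")
```

Under this model a one-unit increase in x multiplies the event time by exp(beta1), which is why the covariate is said to accelerate or decelerate failure.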

Multiprocess Discount Survival Models With Survival Times

  • Shim, Joo-Yong
    • Journal of the Korean Statistical Society / v.26 no.2 / pp.277-288 / 1997
  • For the analysis of survival data including covariates whose effects vary in time, the multiprocess discount survival model is proposed. The parameter vector modeling the time-varying effects of the covariates varies between time intervals, and its evolution depends on the perturbation of the next time interval. The recursive estimate of the parameter vector can be obtained at the end of each time interval. The retrospective estimate of the survival function, and the forecast of the survival function for individuals with specific covariates, can also be obtained from the information gathered up to the end of the time interval.


A Comparison of Estimation Methods for Willingness to Pay Amount in Constructed Oceans and Fisheries Resources Market by Contingent Valuation Method (해양수산자원 가상시장의 지불의사금액 추정방법 비교)

  • Kang, Seok-Kyu
    • The Journal of Fisheries Business Administration / v.49 no.3 / pp.85-99 / 2018
  • This study compares and evaluates methods for estimating willingness to pay (WTP) in the valuation of oceans and fisheries resources, which have the characteristics of non-market goods, using the contingent valuation method (CVM). In general, when estimating the parameters of the WTP function, one should take into account the assumed probability distribution, the inclusion of covariates, the payment elicitation method, and the treatment of zero payment intentions and protest responses. This study uses survey data, with a total of 1,200 samples, that were collected to estimate the value of fisheries resource protection zones. The main results are as follows. First, the final WTP estimates are statistically significant at the 1 percent level and range from ₩6,926 under the double-bounded dichotomous model to ₩10,721 under the spike model. Second, the WTP estimates under the normal and logistic probability distributions are ₩9,429 and ₩9,370, respectively, with no significant difference. Third, the WTP estimates of the single-bounded and double-bounded dichotomous models are ₩8,951 and ₩6,926, respectively, a relatively large difference. Fourth, the WTP estimates of the model without covariates and the model with covariates are ₩9,429 and ₩8,951, respectively, so WTP is underestimated when the covariates are included. Fifth, the spike model, which accounts for zero payment intentions and protest responses, yields ₩10,405, the highest estimate in this study. Finally, the estimates under the CVM analysis guidelines proposed by the Korea Development Institute (KDI) are ₩9,749 without covariates and ₩10,405 with covariates. Compared with the other models, these final WTP estimates are not underestimated. Therefore, this study suggests the use of KDI's guidelines for government public policy projects. In view of these results, the WTP estimation model should be selected by considering the sample size, the suitability of the model, the signs of the estimated coefficients, their statistical significance, and the ratios of zero payment intentions and payment refusals. For CVM studies of government public policy projects, it is desirable to estimate WTP by the method proposed by the KDI.

Discount Survival Models

  • Shim, Joo-Y.;Sohn, Joong-K.
    • Journal of the Korean Data and Information Science Society / v.7 no.2 / pp.227-234 / 1996
  • The discount survival model is proposed for applying the Cox model to the analysis of survival data with time-varying covariate effects. Algorithms are suggested for the recursive estimation of the parameter vector and the retrospective estimation of the survival function. An algorithm is also suggested for forecasting the survival function of individuals with specific covariates in the next time interval, based on the information gathered up to the end of a given time interval.


A Study on the Conditional Survival Function with Random Censored Data

  • Lee, Won-Kee;Song, Myung-Unn
    • Journal of the Korean Data and Information Science Society / v.15 no.2 / pp.405-411 / 2004
  • In the analysis of cancer data, it is important to make inferences about the survival function and to assess the effects of covariates. Cox's proportional hazards model (PHM) and Beran's nonparametric method are generally used to estimate the survival function with covariates. We adjusted the incomplete survival times using the pseudo random variables of Buckley and James (1979) and then proposed an estimator of the conditional survival function. We also carried out simulation studies to assess the performance of the proposed method.


Exploring Factors Related to Metastasis Free Survival in Breast Cancer Patients Using Bayesian Cure Models

  • Jafari-Koshki, Tohid;Mansourian, Marjan;Mokarian, Fariborz
    • Asian Pacific Journal of Cancer Prevention / v.15 no.22 / pp.9673-9678 / 2014
  • Background: Breast cancer is a fatal disease and the most frequently diagnosed cancer in women, with an increasing pattern worldwide. The burden is mostly attributed to metastatic cancers, which occur in one-third of patients and for which the treatments are palliative. It is of great interest to determine the factors affecting the time from cancer diagnosis to secondary metastasis. Materials and Methods: Cure rate models assume a Poisson distribution for the number of unobservable metastatic-component cells, which are completely eliminated from the bodies of non-metastatic patients but may otherwise remain and result in metastasis. Time to metastasis is defined as a function of the number of these cells and the time for each cell to develop a detectable sign of metastasis. Covariates are introduced to the model via the rate of metastatic-component cells. We used non-mixture cure rate models with Weibull and log-logistic distributions in a Bayesian setting to assess the relationship between metastasis free survival and covariates. Results: The median metastasis free survival was 76.9 months. Various models showed that, among the covariates in the study, lymph node involvement ratio and being progesterone receptor positive were significant, with an adverse and a beneficial effect on metastasis free survival, respectively. The estimated fraction of patients cured from metastasis was almost 48%. The Weibull model performed slightly better than the log-logistic model. Conclusions: Cure rate models are popular in survival studies and outperform other models under certain conditions. We explored the prognostic factors of metastatic breast cancer from a different viewpoint. In this study, all metastasis sites were analyzed together. Conducting similar studies in a larger sample of cancer patients, as well as evaluating the prognostic value of covariates for metastasis to each site separately, is recommended.
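
The non-mixture cure rate structure described in the Materials and Methods can be sketched in its standard textbook form: with a Poisson rate theta for the latent cells and a distribution function F for each cell's time to a detectable sign, the population survival is exp(-theta * F(t)) and the cured fraction is exp(-theta). The code below is not the authors' Bayesian analysis. The Weibull shape and scale, the regression coefficients, and the function names are hypothetical; only the 48% cure fraction comes from the abstract.

```python
import math

def weibull_cdf(t, shape, scale):
    """Weibull distribution function F(t) for t >= 0."""
    return 1.0 - math.exp(-((t / scale) ** shape))

def population_survival(t, theta, shape, scale):
    """Non-mixture cure model survival: S_pop(t) = exp(-theta * F(t))."""
    return math.exp(-theta * weibull_cdf(t, shape, scale))

def cure_fraction(theta):
    """Limit of S_pop(t) as t grows: the probability of zero latent cells."""
    return math.exp(-theta)

def rate_from_covariates(x, beta):
    """Covariates enter through the Poisson rate: theta = exp(x . beta)."""
    return math.exp(sum(xi * bi for xi, bi in zip(x, beta)))

# Rate that reproduces the reported cure fraction of about 48%.
theta = -math.log(0.48)

# Hypothetical Weibull parameters, time measured in months.
shape, scale = 1.5, 60.0
for months in (0, 24, 60, 120):
    s = population_survival(months, theta, shape, scale)
    print(f"S_pop({months}) = {s:.3f}")
print(f"cure fraction = {cure_fraction(theta):.2f}")
```

Unlike an ordinary survival curve, S_pop levels off at the cure fraction instead of decaying to zero, which is what lets the model separate cured patients from those still at risk.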

Comparison of GEE Estimators Using Imputation Methods (대체방법별 GEE추정량 비교)

  • 김동욱;노영화
    • The Korean Journal of Applied Statistics / v.16 no.2 / pp.407-426 / 2003
  • We consider the missing covariates problem in the generalized estimating equations (GEE) model. If a covariate is partially missing, the GEE estimator cannot be calculated. In this paper, we study the performance of 7 imputation methods for handling missing covariates in GEE models, and we investigate the properties of the GEE estimators after the missing covariates are imputed, for ordinal data with repeated measurements. The 7 imputation methods are i) Naive Deletion ii) Sample Average Imputation iii) Row Average Imputation iv) Cross-wave Regression Imputation v) Carry-over Imputation vi) Bayesian Bootstrap vii) Approximate Bayesian Bootstrap. A Monte Carlo simulation is used to compare the performance of these methods. For the mechanism generating the missing data, we assume ignorable nonresponse. Furthermore, we generate missing covariates with and without considering wave nonresponse patterns.
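
Two of the listed methods can be illustrated in their common textbook forms; the paper's exact definitions may differ. In this sketch, sample average imputation fills a missing value with the mean of the observed values of that covariate across subjects, and carry-over imputation fills it with the subject's most recent observed value from an earlier wave. The function names and data are hypothetical, and `None` marks a missing value.

```python
def sample_average_impute(column):
    """Replace missing values with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]

def carry_over_impute(waves):
    """Replace missing values with the last value observed in an earlier wave.

    A value missing before any observation is left as None.
    """
    filled, last = [], None
    for v in waves:
        if v is not None:
            last = v
        filled.append(last)
    return filled

# One covariate across subjects at a single wave.
print(sample_average_impute([1.0, None, 3.0]))     # [1.0, 2.0, 3.0]

# One subject's covariate across four waves.
print(carry_over_impute([2.0, None, None, 5.0]))   # [2.0, 2.0, 2.0, 5.0]
```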

Penalized quantile regression tree (벌점화 분위수 회귀나무모형에 대한 연구)

  • Kim, Jaeoh;Cho, HyungJun;Bang, Sungwan
    • The Korean Journal of Applied Statistics / v.29 no.7 / pp.1361-1371 / 2016
  • Quantile regression provides a variety of useful statistical information for examining how covariates influence the conditional quantile functions of a response variable. However, traditional quantile regression, which assumes a linear model, is not appropriate when the relationship between the response and the covariates is nonlinear. It is also necessary to conduct variable selection for high dimensional data or strongly correlated covariates. In this paper, we propose a penalized quantile regression tree model. The split rule of the proposed method is based on residual analysis, which has negligible bias in selecting a split variable and a reasonable computational cost. A simulation study and a real data analysis are presented to demonstrate the satisfactory performance and usefulness of the proposed method.
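
For readers unfamiliar with quantile regression, the sketch below shows the check (pinball) loss that it minimizes, in the simplest case of fitting a constant: the minimizer over the sample values is a sample quantile. This is general background rather than the penalized tree or the residual-based split rule proposed in the paper. The data and function names are hypothetical.

```python
def check_loss(u, tau):
    """Check loss rho_tau(u) = u * (tau - 1[u < 0]) for a residual u."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def total_loss(y, q, tau):
    """Sum of check losses when every response is predicted by the constant q."""
    return sum(check_loss(yi - q, tau) for yi in y)

def best_constant(y, tau):
    """Sample value minimizing the total check loss, i.e. a tau-th sample quantile."""
    return min(y, key=lambda q: total_loss(y, q, tau))

y = [1.0, 2.0, 3.0, 4.0, 100.0]
print(best_constant(y, 0.5))   # 3.0, the median, unaffected by the outlier
print(best_constant(y, 0.7))   # 4.0
```

Replacing the constant q with a linear function of the covariates gives traditional linear quantile regression; a tree replaces it with a separate fit in each leaf.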