• Title/Summary/Keyword: survival data

Discount Survival Models

  • Shim, Joo-Y.;Sohn, Joong-K.
    • Journal of the Korean Data and Information Science Society / v.7 no.2 / pp.227-234 / 1996
  • The discount survival model is proposed for applying the Cox model to the analysis of survival data in which covariate effects vary over time. Algorithms are suggested for the recursive estimation of the parameter vector and the retrospective estimation of the survival function. An algorithm is also suggested for forecasting the survival function of individuals with specific covariates in the next time interval, based on the information gathered up to the end of a given time interval.

Review of statistical methods for survival analysis using genomic data

  • Lee, Seungyeoun;Lim, Heeju
    • Genomics & Informatics / v.17 no.4 / pp.41.1-41.12 / 2019
  • Survival analysis mainly deals with the time to an event, such as death, onset of disease, or bankruptcy. The common characteristic of survival data is that they contain "censored" observations, in which the time to event cannot be completely observed and the recorded time instead represents a lower bound on the time to event. For each subject, only one of the time to event or the censoring time is observed. Many traditional statistical methods have been used effectively for analyzing survival data with censored observations. However, with the development of high-throughput technologies for producing "omics" data, more advanced statistical methods, such as regularization, are required to construct predictive survival models from high-dimensional genomic data. Furthermore, machine learning approaches have been adapted for survival analysis to fit nonlinear and complex interaction effects between predictors and to achieve more accurate prediction of individual survival probability. Since most clinicians and medical researchers can now easily access statistical programs for analyzing survival data, a review article is helpful for understanding the statistical methods used in survival analysis. We review traditional survival methods and regularization methods, with various penalty functions, for the analysis of high-dimensional genomic data, and describe machine learning techniques that have been adapted to survival analysis.
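As a sketch of the censoring setup and the regularization idea mentioned in this abstract, the observed data and a lasso-penalized Cox estimator can be written as follows. The notation is assumed here rather than taken from the article: $T_i$ is the event time, $C_i$ the censoring time, $x_i$ the covariate vector, and $\lambda$ a tuning parameter.

```latex
Y_i = \min(T_i, C_i), \qquad \delta_i = \mathbf{1}\{T_i \le C_i\}, \qquad i = 1, \dots, n,

\ell(\beta) = \sum_{i=1}^{n} \delta_i \Bigl[ x_i^{\top}\beta
      - \log \sum_{j:\, Y_j \ge Y_i} \exp\bigl(x_j^{\top}\beta\bigr) \Bigr],
\qquad
\hat{\beta} = \arg\max_{\beta} \Bigl\{ \ell(\beta) - \lambda \sum_{k=1}^{p} |\beta_k| \Bigr\}.
```

The lasso is only one possible penalty; ridge or elastic-net variants replace the last term.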

A Study on the Conditional Survival Function with Random Censored Data

  • Lee, Won-Kee;Song, Myung-Unn
    • Journal of the Korean Data and Information Science Society / v.15 no.2 / pp.405-411 / 2004
  • In the analysis of cancer data, it is important to make inferences about the survival function and to assess the effects of covariates. Cox's proportional hazards model (PHM) and Beran's nonparametric method are generally used to estimate the survival function with covariates. We adjusted the incomplete survival times using the pseudo random variables of Buckley and James (1979) and then proposed an estimator for the conditional survival function. We also carried out simulation studies to evaluate the performance of the proposed method.
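For orientation, the Buckley-James adjustment is commonly written as below. This is the standard textbook form, not necessarily the exact variant used in the paper, and the symbols are assumed: $Y_i$ is the observed time, $\delta_i$ the event indicator, $T_i$ the true survival time, and $x_i$ the covariates.

```latex
Y_i^{*} = \delta_i\, Y_i + (1 - \delta_i)\, \mathrm{E}\bigl[\, T_i \mid T_i > Y_i,\ x_i \,\bigr]
```

Uncensored times are kept as they are, while each censored time is replaced by the expected survival time given survival beyond the censoring point. In practice that expectation has to be estimated from the data.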

A Study on the Survival Probability and Survival Factors of Small and Medium-sized Enterprises Using Technology Rating Data (기술평가 자료를 이용한 중소기업의 생존율 추정 및 생존요인 분석)

  • Lee, Young-Chan
    • Knowledge Management Research / v.11 no.2 / pp.95-109 / 2010
  • The objectives of this study are to identify the survival function (hazard function) of small and medium-sized enterprises by using technology rating data for the companies guaranteed by the Korea Technology Finance Corporation (KOTEC), and to determine the factors that affect their survival. To this end, the study uses Kaplan-Meier analysis as a non-parametric method and the Cox proportional hazards model as a semi-parametric one. The 17,396 guaranteed companies that were assessed from July 1, 2005 to December 31, 2009 are selected as the sample (16,504 censored data and 829 accident data). Survival time is computed with random censoring (Type III), taking July 2005 as the starting point. The results show that Kaplan-Meier analysis and the Cox proportional hazards model can readily estimate the survival and hazard functions and support comparisons among groups defined by variables such as industry and technology rating level. In particular, the Cox proportional hazards model is useful for understanding which technology rating items are meaningful to a company's survival and how much they affect it. These results should provide valuable knowledge for practitioners seeking to find and manage the items that are significant for the survival of guaranteed companies through future technology rating.
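A minimal sketch of the Kaplan-Meier (product-limit) estimator named in this abstract, in plain Python. The function name `kaplan_meier` and the toy data are hypothetical illustrations and are not taken from the study.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event occurred.

    times  -- observed follow-up times
    events -- 1 if the event (e.g. a guarantee accident) was observed,
              0 if the observation was censored
    """
    failures = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)      # everyone leaving the risk set at time t
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(exits):
        d = failures.get(t, 0)
        if d > 0:
            # Product-limit step: multiply by the fraction surviving time t.
            surv *= 1.0 - d / at_risk
            curve.append((t, surv))
        at_risk -= exits[t]     # drop failures and censored cases alike
    return curve


# Hypothetical toy data: months of follow-up, 1 = failure, 0 = censored.
months = [3, 5, 5, 8, 10, 12]
failed = [1, 0, 1, 1, 0, 0]
km_curve = kaplan_meier(months, failed)
```

Censored companies contribute to the risk set up to their censoring time but never trigger a drop in the curve, which is how the estimator handles a mostly censored sample like the one described above.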

Comparative Study on Statistical Packages Analyzing Survival Model - SAS, SPSS, STATA -

  • Cho, Mi-Soon;Kim, Soon-Kwi
    • Journal of the Korean Data and Information Science Society / v.19 no.2 / pp.487-496 / 2008
  • Survival analysis has recently become popular in a variety of fields, and a number of statistical packages have been developed for analyzing survival models. In this paper, several types of survival models are briefly introduced. In addition, three widely used packages for survival data (SAS, SPSS, and STATA) are reviewed and their characteristics are investigated.

Survival in Fry and Juvenile Stages of Masu salmon Oncorhynchus masou: Estimates of Heritabilities and Correlations

  • Choe, Mi-Kyung
    • Journal of Aquaculture / v.12 no.3 / pp.185-191 / 1999
  • A genetic analysis of survival in the fry and juvenile stages of masu salmon is described. Data from two year-classes of masu salmon were analyzed to estimate the heritability of survival during the freshwater rearing period. The overall survival for the two year-classes during 8 months of freshwater rearing was 17.8% and 11.6%, respectively. Whirling disease virus (WDV) was the main cause of death in all year-classes. Survival data obtained for the offspring of 42 sires and 60 dams of masu salmon (two year-classes of data) were analyzed. Average survival rates in the observation period ranged from 2% to 87% for 1994 and from 0% to 98% for 1995. In both year-classes, heritabilities for survival derived from the sire components of variance were low (0.13-0.18), except one. Heritabilities derived from the dam components of variance ranged from 0.14 to 0.61, including non-additive genetic and/or common environmental effects. Correlations between survival in two long-term periods were all positive and medium to high in magnitude (0.345-0.918). Correlations between survival in non-succeeding periods were, in general, low and insignificant. Correlation between long-term survival and growth rate was found in masu salmon. The corresponding correlation in masu salmon was not significantly different from zero. Correlations between sire survival and body weight, length and condition factor of slaughter were not significant, but varied.
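As a rough illustration of how heritabilities are derived from sire and dam variance components, the sketch below uses the standard nested sire/dam formulas (four times the component divided by the total phenotypic variance). The paper may have used a different model or scale, and the function name and variance values are hypothetical.

```python
def heritability_from_components(sire_var, dam_var, resid_var):
    """Heritability estimates from a nested sire/dam design.

    Returns (h2_sire, h2_dam), each computed as 4 * component / total
    phenotypic variance. The dam-based estimate also absorbs non-additive
    genetic and common environmental effects, so it tends to be larger.
    """
    total = sire_var + dam_var + resid_var
    return 4.0 * sire_var / total, 4.0 * dam_var / total


# Hypothetical variance components, not estimates from the paper.
h2_sire, h2_dam = heritability_from_components(0.04, 0.10, 0.86)
```

With these made-up components the sire-based estimate is about 0.16 and the dam-based estimate about 0.40, the same qualitative pattern as the ranges reported in the abstract.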

Analyzing Survival Data as Binary Outcomes with Logistic Regression

  • Lim, Jo-Han;Lee, Kyeong-Eun;Hahn, Kyu-S.;Park, Kun-Woo
    • Communications for Statistical Applications and Methods / v.17 no.1 / pp.117-126 / 2010
  • Clinical researchers often analyze survival data as binary outcomes using logistic regression. This paper examines the information loss that results from analyzing survival time as a binary outcome. We first demonstrate that, under the proportional hazards assumption, this binary discretization does result in a significant information loss. Second, when fitting a logistic model to survival time data, researchers inadvertently use the maximal statistic. We carry out a numerical study to examine the properties of the reference distribution for this statistic. Finally, we show that logistic regression can still be a useful tool for analyzing survival data, in particular when the proportional hazards assumption is questionable.
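A short derivation may help show what dichotomizing does under proportional hazards. The notation is assumed here, not taken from the paper: $\tau$ is the cut-off time, $\Lambda_0$ the baseline cumulative hazard, and censoring before $\tau$ is ignored for simplicity.

```latex
S(t \mid x) = \exp\bigl\{-\Lambda_0(t)\, e^{x^{\top}\beta}\bigr\},
\qquad
p(x) = \Pr(T \le \tau \mid x) = 1 - \exp\bigl\{-\Lambda_0(\tau)\, e^{x^{\top}\beta}\bigr\},

\log\bigl\{-\log\bigl(1 - p(x)\bigr)\bigr\} = \log \Lambda_0(\tau) + x^{\top}\beta .
```

Under this assumption the binary outcome follows a complementary log-log model rather than a logit model, and the result depends on the choice of $\tau$.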

A modified partial least squares regression for the analysis of gene expression data with survival information

  • Lee, So-Yoon;Huh, Myung-Hoe;Park, Mira
    • Journal of the Korean Data and Information Science Society / v.25 no.5 / pp.1151-1160 / 2014
  • In DNA microarray studies, the number of genes far exceeds the number of samples, and the gene expression measures are highly correlated. Partial least squares regression (PLSR) is a popular method for dimension reduction, and several studies have shown it to be useful for classifying microarray data. In this study, we suggest a modified version of PLSR for analyzing gene expression data with survival information. The method is designed as a new gene selection method that uses PLSR with an iterative procedure for imputing censored survival times. The mean squared error of prediction criterion is used to determine the dimension of the model. To visualize the data, plots of variables superimposed with samples are used. The method is applied to two microarray data sets, both containing survival times. The results show that the proposed method works well for interpreting gene expression microarray data.

Model-Based Survival Estimates of Female Breast Cancer Data

  • Khan, Hafiz Mohammad Rafiqullah;Saxena, Anshul;Gabbidon, Kemesha;Rana, Sagar;Ahmed, Nasar Uddin
    • Asian Pacific Journal of Cancer Prevention / v.15 no.6 / pp.2893-2900 / 2014
  • Background: Statistical methods are very important for precisely measuring breast cancer patient survival times for healthcare management. Previous studies considered basic statistics to measure survival times without incorporating statistical modeling strategies. The objective of this study was to develop a data-based statistical probability model from female breast cancer patients' survival times by using the Bayesian approach to make predictive inferences about future survival times. Materials and Methods: A random sample of 500 female patients was selected from the Surveillance Epidemiology and End Results cancer registry database. For goodness of fit, the standard model building criteria were used. The Bayesian approach was used to obtain the predictive survival times from the data-based Exponentiated Exponential Model. The Markov Chain Monte Carlo method was used to obtain the summary results for predictive inference. Results: The highest number of female breast cancer patients was found in California and the lowest in New Mexico. The majority of them were married. The mean (SD) age at diagnosis (in years) was 60.92 (14.92). The mean (SD) survival time (in months) for female patients was 90.33 (83.10). The Exponentiated Exponential Model fit the female survival times better than the Exponentiated Weibull Model. The Bayesian method was used to obtain predictive inference for future survival times. Conclusions: The findings from the proposed modeling strategy will assist healthcare researchers and providers in precisely predicting future survival estimates, as the growing challenges of analyzing healthcare data have created new demand for model-based survival estimates. Applying the Bayesian approach will produce precise estimates of future survival times.
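For readers unfamiliar with the Exponentiated Exponential Model, the sketch below codes its usual two-parameter form, with CDF F(x) = (1 - exp(-lam * x)) ** alpha. The function names and parameter values are hypothetical; they are not the estimates from the study, and the Bayesian fitting step is not shown.

```python
import math


def ee_cdf(x, alpha, lam):
    """CDF of the exponentiated exponential distribution."""
    if x <= 0:
        return 0.0
    return (1.0 - math.exp(-lam * x)) ** alpha


def ee_survival(x, alpha, lam):
    """Survival function S(x) = 1 - F(x)."""
    return 1.0 - ee_cdf(x, alpha, lam)


def ee_pdf(x, alpha, lam):
    """Density: alpha * lam * exp(-lam*x) * (1 - exp(-lam*x))**(alpha - 1)."""
    if x <= 0:
        return 0.0
    u = math.exp(-lam * x)
    return alpha * lam * u * (1.0 - u) ** (alpha - 1.0)


# Hypothetical parameters: probability of surviving beyond 60 months.
s60 = ee_survival(60.0, alpha=1.5, lam=0.01)
```

Setting `alpha = 1` recovers the ordinary exponential distribution, so the shape parameter measures how far the fitted model departs from a constant hazard.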

Comparison between Overall, Cause-specific, and Relative Survival Rates Based on Data from a Population-based Cancer Registry

  • Utada, Mai;Ohno, Yuko;Shimizu, Sachiko;Hori, Megumi;Soda, Midori
    • Asian Pacific Journal of Cancer Prevention / v.13 no.11 / pp.5681-5685 / 2012
  • Three kinds of survival rates are generally used depending on the purpose of the investigation: overall, cause-specific, and relative. The differences among these 3 survival rates follow from their respective formulas; however, reports based on actual cancer registry data are few because of the incomplete information and short follow-up duration recorded in cancer registries. The aim of this study was to numerically and visually compare these 3 survival rates on the basis of data from the Nagasaki Prefecture Cancer Registry. Subjects were patients diagnosed with cancer and registered in the registry between 1999 and 2003. We calculated the proportions of causes of death and 5-year survival rates. For lung, liver, or advanced stage cancers, the proportions of cancer-related death were high and the differences in survival rates were small. For prostate or early stage cancers, the proportions of death from other causes were high and the differences in survival rates were large. We concluded that the differences among the 3 survival rates increased when the proportion of death from other causes increased.
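The three rates can be contrasted on a toy cohort. The sketch below rests on common definitions, assumed here rather than taken from the paper: overall survival counts every death as an event, cause-specific survival treats deaths from other causes as censored, and relative survival divides observed overall survival by the survival expected in a comparable general population. All names and numbers are hypothetical.

```python
def km_survival_at(times, is_event, horizon):
    """Kaplan-Meier survival probability at `horizon`."""
    surv = 1.0
    for t in sorted(set(times)):
        if t > horizon:
            break
        at_risk = sum(1 for u in times if u >= t)
        d = sum(1 for u, e in zip(times, is_event) if u == t and e)
        surv *= 1.0 - d / at_risk
    return surv


# Hypothetical cohort: years of follow-up and status at that time
# (0 = alive at last contact, 1 = died of cancer, 2 = died of another cause).
years = [1, 2, 2, 3, 4, 5, 5, 5]
status = [1, 2, 1, 0, 2, 0, 0, 0]
expected_5y = 0.90  # assumed life-table survival of the general population

overall = km_survival_at(years, [s != 0 for s in status], 5)
cause_specific = km_survival_at(years, [s == 1 for s in status], 5)
relative = overall / expected_5y
```

Because half of the deaths in this made-up cohort are from other causes, cause-specific survival comes out well above overall survival, the pattern the abstract reports for prostate and early stage cancers.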