• Title/Summary/Keyword: survival model


A prediction of overall survival status by deep belief network using Python® package in breast cancer: a nationwide study from the Korean Breast Cancer Society

  • Ryu, Dong-Won
    • Korean Journal of Artificial Intelligence / v.6 no.2 / pp.11-15 / 2018
  • Breast cancer is one of the leading causes of cancer-related death among women, so predicting overall survival status is important for deciding on adjuvant treatment. A deep belief network is a kind of artificial intelligence (AI) model. We intended to construct a prediction model with a deep belief network using associated clinicopathologic factors. A total of 103881 cases were found in the Korean Breast Cancer Registry. After data preprocessing, 15733 cases were enrolled in this study. The median follow-up period was 82.4 months. In univariate analysis for overall survival (OS), patients with advanced AJCC stage showed a relatively high HR (HR=1.216, 95% CI: 0.011-289.331, p=0.001). Based on the results of univariate and multivariate analysis, the input variables for the learning model comprised 17 variables associated with overall survival. The output was one of two states: event or censored. The sensitivity for predicting overall survival status was 89.6% in the training set and 91.2% in the test set, and the specificity was 49.4% and 48.9%, respectively. The accuracy for predicting overall survival status was 82.78%. A prediction model based on a deep belief network appears to be effective in predicting overall survival status and, in particular, is expected to be applicable to decisions on adjuvant treatment after surgery.
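The abstract above reports sensitivity, specificity, and accuracy together. As a minimal sketch of how the three relate, the snippet below computes them from confusion-matrix counts. The counts are hypothetical and are not taken from the study, and which outcome (event or censored) the study treated as the "positive" class is an assumption left open here.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                # true positive rate
    specificity = tn / (tn + fp)                # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # share of all cases classified correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts, chosen only for illustration (not the study's data).
sens, spec, acc = classification_metrics(tp=90, fn=10, tn=25, fp=25)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Accuracy is a prevalence-weighted average of sensitivity and specificity, so it sits closer to whichever of the two belongs to the more common class.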

Estimating survival distributions for two-stage adaptive treatment strategies: A simulation study

  • Vilakati, Sifiso;Cortese, Giuliana;Dlamini, Thembelihle
    • Communications for Statistical Applications and Methods / v.28 no.5 / pp.411-424 / 2021
  • Inference following two-stage adaptive designs (also known as two-stage randomization designs) with survival endpoints usually focuses on estimating and comparing survival distributions for the different treatment strategies. The aim is to identify the treatment strategy or strategies that lead to better patient survival. The objective of this study was to assess the performance of three commonly cited methods for estimating survival distributions in two-stage randomization designs. We review three non-parametric methods for estimating survival distributions in two-stage adaptive designs and compare their performance using simulation studies. The simulation studies show that the method based on the marginal mean model is badly affected by high censoring rates and by the response rate. The other two methods, which are natural extensions of the Nelson-Aalen estimator and the Kaplan-Meier estimator, have similar performance. These two methods yield survival estimates that have less bias and are more precise than those of the marginal mean model, even with small sample sizes. The weighted versions of the Nelson-Aalen and Kaplan-Meier estimators are less affected by high censoring rates and low response rates. The bias of the method based on the marginal mean model increases rapidly as the censoring rate increases, compared with the other two methods. We apply the three methods to a leukemia clinical trial dataset and compare the results.
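For orientation, here is a minimal sketch of the standard (unweighted) Kaplan-Meier and Nelson-Aalen estimators that the abstract above refers to. It does not implement the weighted two-stage extensions studied in the paper, and the sample data are hypothetical.

```python
import math
from collections import Counter


def km_and_na(times, events):
    """Return [(t, S_KM(t), S_NA(t))] at each distinct event time.

    times  -- observed follow-up times
    events -- 1 if the event was observed, 0 if the subject was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)        # subjects leaving the risk set at each time
    at_risk = len(times)
    surv_km, cum_hazard = 1.0, 0.0
    rows = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d > 0:
            surv_km *= 1.0 - d / at_risk   # Kaplan-Meier product-limit step
            cum_hazard += d / at_risk      # Nelson-Aalen cumulative hazard step
            rows.append((t, surv_km, math.exp(-cum_hazard)))
        at_risk -= leaving[t]              # drop events and censored subjects
    return rows


# Hypothetical follow-up data: time and event indicator per subject.
times = [2, 3, 3, 5, 8, 8, 9, 12]
events = [1, 1, 0, 1, 0, 1, 1, 0]
rows = km_and_na(times, events)
for t, s_km, s_na in rows:
    print(f"t={t:>2}  KM={s_km:.3f}  NA-based={s_na:.3f}")
```

The Nelson-Aalen-based survival estimate exp(-H(t)) is never below the Kaplan-Meier estimate, and the two are close when the risk set is large relative to the number of events at each time.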

Review of statistical methods for survival analysis using genomic data

  • Lee, Seungyeoun;Lim, Heeju
    • Genomics & Informatics / v.17 no.4 / pp.41.1-41.12 / 2019
  • Survival analysis mainly deals with the time to an event, such as death, onset of disease, or bankruptcy. The common characteristic of survival data is that they contain "censored" observations, in which the time to event cannot be completely observed and the recorded time instead represents a lower bound on the time to event. For each subject, only the event time or the censoring time is observed, whichever comes first. Many traditional statistical methods have been used effectively to analyze survival data with censored observations. However, with the development of high-throughput technologies for producing "omics" data, more advanced statistical methods, such as regularization, are required to construct predictive survival models from high-dimensional genomic data. Furthermore, machine learning approaches have been adapted for survival analysis to fit nonlinear and complex interaction effects between predictors and to achieve more accurate prediction of individual survival probability. Since most clinicians and medical researchers can now easily access statistical programs for analyzing survival data, a review article is helpful for understanding the statistical methods used in survival analysis. We review traditional survival methods and regularization methods, with various penalty functions, for the analysis of high-dimensional genomic data, and describe machine learning techniques that have been adapted to survival analysis.
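The right-censoring mechanism described above can be sketched in a few lines: for each subject only the smaller of the event time and the censoring time is recorded, together with an indicator of which one occurred. The function name and the sample values below are illustrative assumptions, not part of the reviewed article.

```python
def observe(event_time, censor_time):
    """Return (observed_time, event_indicator) under right censoring."""
    observed = min(event_time, censor_time)
    event = 1 if event_time <= censor_time else 0  # 1 = event seen, 0 = censored
    return observed, event


# Hypothetical (true event time, censoring time) pairs, in months.
subjects = [(14.0, 24.0), (30.0, 24.0), (24.0, 24.0)]
data = [observe(t, c) for t, c in subjects]
print(data)
```

For the second subject the recorded time of 24 months is only a lower bound on the true event time, which is why methods that ignore the event indicator are biased.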

Black Hispanic and Black Non-Hispanic Breast Cancer Survival Data Analysis with Half-normal Model Application

  • Khan, Hafiz Mohammad Rafiqullah;Saxena, Anshul;Vera, Veronica;Abdool-Ghany, Faheema;Gabbidon, Kemesha;Perea, Nancy;Stewart, Tiffanie Shauna-Jeanne;Ramamoorthy, Venkataraghavan
    • Asian Pacific Journal of Cancer Prevention / v.15 no.21 / pp.9453-9458 / 2014
  • Background: Breast cancer is the second leading cause of cancer death for women in the United States. Differences in breast cancer survival have been noted among racial and ethnic groups, but the reasons for these disparities remain unclear. This study presents the characteristics and the survival curves of two racial and ethnic groups and evaluates the effects of race on survival times using a lifetime data-based half-normal model. Materials and Methods: The distributions among racial and ethnic groups are compared using female breast cancer patients from nine states, all taken from the National Cancer Institute's Surveillance, Epidemiology, and End Results cancer registry. The main end points observed are age at diagnosis, survival time in months, and marital status. The right-skewed half-normal probability model is used to show the differences in survival times between black Hispanic (BH) and black non-Hispanic (BNH) female breast cancer patients. The Kaplan-Meier estimator and the Cox proportional hazards model are used to estimate and compare the relative risk of death in the two minority groups, BH and BNH. Results: A probability random sampling method was used to select representative samples from BNH and BH female breast cancer patients who were diagnosed during 1973-2009 in the United States. The sample contained 1,000 BNH and 298 BH female breast cancer patients. The median age at diagnosis was 57.75 years among BNH and 54.11 years among BH. The results of the half-normal model showed that the survival times formed positively skewed distributions with higher variability in BNH than in BH. The Kaplan-Meier estimate was used to plot the survival curves for cancer patients; the distribution was positively skewed. The Kaplan-Meier and Cox proportional hazards analyses showed that BNH had a significantly longer survival time than BH, which is consistent with the results of the half-normal model. Conclusions: The findings from the proposed modeling strategy will help the healthcare field to measure future outcomes for BH and BNH, given their past history and conditions. These findings may provide an improved outlook for the diagnosis and treatment of breast cancer patients in the United States.

Prediction Model on Delivery Time in Display FAB Using Survival Analysis (생존분석을 이용한 디스플레이 FAB의 반송시간 예측모형)

  • Han, Paul;Baek, Jun Geol
    • Journal of Korean Institute of Industrial Engineers / v.40 no.3 / pp.283-290 / 2014
  • In the flat panel display industry, the scheduler and dispatching systems are the major production management systems used to meet production target quantities and deadlines; they control the order of facility production and the distribution of WIP (Work In Process). In particular, the delivery time, which determines when a lot can be supplied to a facility, is a key factor for the dispatching system. In this paper, we use survival analysis methods to identify the main factors of the delivery time and to build a delivery time forecasting model. To select important explanatory variables, the Cox proportional hazards model was used. To build the prediction model, the accelerated failure time (AFT) model was used. Performance was compared with two other models: the technical statistics model based on transfer history and a linear regression model using the same explanatory variables as the AFT model. By the mean squared error (MSE) criterion, the AFT model reduced the error by 33.8% compared with the statistics prediction model and by 5.3% compared with the linear regression model. This survival analysis approach is applicable to implementing a delivery time estimator in display manufacturing, and it can help improve the productivity and reliability of the production management system.

Identifying the Factors Affecting the First Traffic Violation Duration by Novice Drivers (초보운전자 생애 첫 교통법규 위반기간에 영향을 미치는 요인)

  • Kang, Gyungmi;Kim, Do-Gyeong
    • International Journal of Highway Engineering / v.15 no.5 / pp.203-215 / 2013
  • PURPOSES: This study deals with first traffic violations committed by novice drivers, which may be associated with traffic accidents. The objective of this study is to identify which driver characteristics influence the duration until the first traffic violation. METHODS: Survival analysis and the Cox proportional hazards model, which are usually used in the medical field, were employed. Survival analysis was conducted to investigate whether survival duration differs by each covariate, whereas the Cox proportional hazards model was used to identify significant factors that affect the survival duration until novice drivers violate traffic regulations for the first time after obtaining a driver's license. RESULTS: The results of the survival analysis indicate that female drivers, drivers younger than 21, drivers who took the written exam fewer times, and drivers not involved in crashes have a longer duration until the first violation than male drivers, drivers older than 21, drivers who took the written exam more times, and crash-involved drivers, respectively. In the Cox proportional hazards model, holding a class 1 license was found to increase the survival duration until the first traffic violation, while being male, being aged 21-24, 25-34, or 45-54, and being involved in a crash were more likely to reduce the survival duration. CONCLUSIONS: Traffic violations are closely related to traffic accidents, and all drivers should obey traffic regulations to enhance highway safety. The results of this study might provide some insights for constructing safe road environments by controlling the factors that shorten the time to the first traffic violation by novice drivers.

Estimating the Survival of Patients With Lung Cancer: What Is the Best Statistical Model?

  • Abedi, Siavosh;Janbabaei, Ghasem;Afshari, Mahdi;Moosazadeh, Mahmood;Alashti, Masoumeh Rashidi;Hedayatizadeh-Omran, Akbar;Alizadeh-Navaei, Reza;Abedini, Ehsan
    • Journal of Preventive Medicine and Public Health / v.52 no.2 / pp.140-144 / 2019
  • Objectives: Investigating the survival of patients with cancer is necessary for controlling the disease and for assessing treatment methods. This study aimed to compare various statistical models of survival and to determine the survival rate and its related factors among patients with lung cancer. Methods: In this retrospective cohort, the cumulative survival rate, median survival time, and factors associated with the survival of lung cancer patients were estimated using Cox, Weibull, exponential, and Gompertz regression models. Kaplan-Meier tables and the log-rank test were also used to analyze the survival of patients in different subgroups. Results: Of 102 patients with lung cancer, 74.5% were male. During the follow-up period, 80.4% died. The incidence rate of death among patients was estimated as 3.9 (95% confidence interval [CI], 3.1 to 4.8) per 100 person-months. The 5-year survival rate for all patients, males, females, patients with non-small cell lung carcinoma (NSCLC), and patients with small cell lung carcinoma (SCLC) was 17%, 13%, 29%, 21%, and 0%, respectively. The median survival time for all patients, males, females, those with NSCLC, and those with SCLC was 12.7 months, 12.0 months, 16.0 months, 16.0 months, and 6.0 months, respectively. Multivariate analyses indicated that the hazard ratios (95% CIs) for male sex, age, and SCLC were 0.56 (0.33 to 0.93), 1.03 (1.01 to 1.05), and 2.91 (1.71 to 4.95), respectively. Conclusions: Our results showed that the exponential model was the most precise. This model identified age, sex, and type of cancer as factors that predicted survival in patients with lung cancer.
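The incidence rate quoted above is a count of deaths divided by total follow-up time, scaled to 100 person-months. The sketch below shows that arithmetic. The number of deaths follows from the reported 102 patients and 80.4% mortality, but the total person-months is not given in the abstract, so the value used here is a hypothetical one chosen only to land near the reported rate.

```python
def incidence_rate_per_100(events, person_time):
    """Events per 100 units of person-time."""
    return 100.0 * events / person_time


n_patients = 102
deaths = round(n_patients * 0.804)   # 80.4% of 102 patients

# ASSUMPTION: total follow-up is not reported in the abstract; this value
# is hypothetical and used only to illustrate the calculation.
assumed_person_months = 2100.0

rate = incidence_rate_per_100(deaths, assumed_person_months)
print(f"{deaths} deaths, {rate:.2f} per 100 person-months")
```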

Survival Prediction of Rats with Hemorrhagic Shocks Using Support Vector Machine (지원벡터기계를 이용한 출혈을 일으킨 흰쥐에서의 생존 예측)

  • Jang, K.H.;Choi, J.L.;Yoo, T.K.;Kwon, M.K.;Kim, D.W.
    • Journal of Biomedical Engineering Research / v.33 no.1 / pp.1-7 / 2012
  • Hemorrhagic shock is a common cause of death in emergency rooms. Early diagnosis of hemorrhagic shock makes it possible for physicians to treat patients successfully. Therefore, the purpose of this study was to select an optimal survival prediction model using physiological parameters for the two analyzed periods: two and five minutes before and after the end of bleeding. We obtained heart rates, mean arterial pressures, respiration rates, and temperatures from 45 rats. These physiological parameters were used as the training and testing data sets for survival prediction models based on an artificial neural network (ANN) and a support vector machine (SVM). We applied 5-fold cross-validation to avoid over-fitting and to select the optimal survival prediction model. In conclusion, the SVM model showed slightly better accuracy than the ANN model for survival prediction over the entire analysis period.
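As an illustration of the 5-fold cross-validation mentioned above, the sketch below splits 45 sample indices into five train/test partitions using only the standard library. Splitting at the level of individual rats, the shuffling, and the seed are assumptions made for this example; the abstract does not say how the folds were formed.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Return k (train_indices, test_indices) pairs covering all samples once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)        # reproducible shuffle
    folds = [indices[i::k] for i in range(k)]   # k roughly equal folds
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        splits.append((train, test))
    return splits


splits = k_fold_indices(45, k=5)
for fold_no, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_no}: {len(train)} train, {len(test)} test")
```

Each sample appears in exactly one test fold, so every model is evaluated on data it was not trained on, and the averaged score is used to compare candidate models.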

Analysing Risk Factors of 5-Year Survival Colorectal Cancer Using the Network Model

  • Park, Won Jun;Lee, Young Ho;Kang, Un Gu
    • Journal of the Korea Society of Computer and Information / v.24 no.9 / pp.103-108 / 2019
  • The purpose of this study is to identify the factors that may affect the 5-year survival of colon cancer using a network model and to use the model as a clinical decision support system for colorectal cancer patients. This study was conducted using data from 2,540 patients who underwent colorectal cancer surgery from 1996 to 2018. Eleven factors related to survival of colorectal cancer were selected by consulting medical experts and previous studies. After excluding missing values and outliers, data from 1,839 patients were analyzed. Logistic regression analysis, used to identify factors affecting 5-year survival of colorectal cancer, showed that age, BMI, and heart disease were statistically significant. Additionally, a correlation analysis showed that age, BMI, heart disease, diabetes, and other diseases were correlated with 5-year survival of colorectal cancer. Sex was related to BMI, lung disease, and liver disease. Age was associated with heart disease, hypertension, diabetes, and other diseases, and BMI with hypertension, diabetes, and other diseases. Heart disease was associated with hypertension, diabetes, and other diseases. In addition, diabetes and kidney disease were associated. The network model was constructed from the correlation analysis, using correlation coefficients with p < 0.001 as the weights. The network model showed that the factors directly affecting survival were age, BMI, and heart disease, and the indirectly influencing factors were diabetes, high blood pressure, liver disease, and other diseases. If the network model is used as an auxiliary indicator in the treatment of colorectal cancer, it could contribute to increasing the survival rate of patients.

Muscle Radiation Attenuation in the Erector Spinae and Multifidus Muscles as a Determinant of Survival in Patients with Gastric Cancer

  • An, Soomin;Kim, Youn-Jung;Han, Ga Young;Eo, Wankyu
    • Journal of Korean Biological Nursing Science / v.24 no.1 / pp.17-25 / 2022
  • Purpose: To determine the prognostic role of muscle area and muscle radiation attenuation in the erector spinae (ES) and multifidus (MF) muscles in patients undergoing gastrectomy. Methods: Patients with stage I-III gastric cancer undergoing gastrectomy were retrospectively enrolled in this study. Clinicopathologic characteristics were collected and analyzed. Both the paraspinal muscle index of the ES/MF muscles (PMIEM) and the paraspinal muscle radiation attenuation in the same muscles (PMRAEM) were analyzed at the 3rd lumbar level using axial computed tomographic images. Cox regression analysis was applied to estimate overall survival (OS) and disease-free survival (DFS). Results: There was only a weak correlation between PMIEM and PMRAEM (r = 0.28). Multivariate Cox regression revealed that PMRAEM, but not PMIEM, was an important determinant of survival. PMRAEM, along with age, tumor-node-metastasis (TNM) stage, perineural invasion, and serum albumin level, were significant determinants of both OS and DFS and together constituted Model 1. Harrell's concordance index and the integrated area under the receiver operating characteristic curve were greater for Model 1 than for Model 2 (consisting of the same covariates as Model 1 except PMRAEM) or Model 3 (consisting of only TNM stage). Conclusion: PMRAEM, but not PMIEM, was an important determinant of survival. Because there was only a weak correlation between PMIEM and PMRAEM in this study, it was presumed that they were mutually exclusive. Model 1, consisting of age, TNM stage, perineural invasion, serum albumin level, and PMRAEM, performed better than the nested models (i.e., Model 2 or Model 3) in predicting survival outcomes.
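Harrell's concordance index, used above to compare the three models, can be sketched as follows. This is a simplified version written for illustration: it ignores tied survival times and does not reproduce the study's software or data. The inputs are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of comparable pairs in which the higher-risk subject fails first.

    A pair (i, j) is comparable when subject i had an observed event and
    times[i] < times[j]. Ties in risk score count as half-concordant.
    Tied survival times are skipped in this simplified sketch.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue                      # censored subjects cannot anchor a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical data: follow-up time, event indicator, and model risk score.
times = [5, 8, 12, 20]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.5, 0.1]
c_index = harrell_c_index(times, events, risk_scores)
print(f"C-index = {c_index:.2f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a larger index for one model than for a nested model indicates better discrimination on the same data.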