• Title/Summary/Keyword: censored regression

Mixed effects least squares support vector machine for survival data analysis (생존자료분석을 위한 혼합효과 최소제곱 서포트벡터기계)

  • Hwang, Chang-Ha;Shim, Joo-Yong
    • Journal of the Korean Data and Information Science Society
    • /
    • v.23 no.4
    • /
    • pp.739-748
    • /
    • 2012
  • In this paper we propose a mixed effects least squares support vector machine (LS-SVM) for censored data observed from different groups. We use weights that account for random right censoring in the nonlinear regression. The weights are formed from Kaplan-Meier estimates of the censoring distribution. The proposed model includes a random effects term representing inter-group variation. Furthermore, a generalized cross validation function is proposed for selecting the optimal values of the hyper-parameters. Experimental results are then presented that indicate the performance of the proposed LS-SVM by comparison with a standard LS-SVM for censored data.
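
The abstract does not give the form of the weights, so the following is only a minimal sketch of one common construction: inverse-probability-of-censoring weights w_i = δ_i / G(t_i−), where G is the Kaplan-Meier estimate of the censoring survival function. The function names `km_censoring_survival` and `ipc_weights` are hypothetical, and the sketch assumes no ties between event and censoring times.

```python
def km_censoring_survival(times, events):
    """Kaplan-Meier estimate of the censoring survival function G(t) = P(C > t).

    times  : observed times (event or censoring, whichever came first)
    events : 1 if the event was observed, 0 if the observation was censored
    Censoring plays the role of the "event" here, so indicators are flipped.
    Returns a list of (time, G(time)) pairs at the distinct observed times.
    Assumption: no ties between event times and censoring times.
    """
    data = sorted(zip(times, events))
    n = len(data)
    at_risk = n
    g = 1.0
    steps = []
    i = 0
    while i < n:
        t = data[i][0]
        n_censored = 0
        n_tied = 0
        j = i
        while j < n and data[j][0] == t:
            n_censored += 1 - data[j][1]
            n_tied += 1
            j += 1
        g *= 1.0 - n_censored / at_risk
        steps.append((t, g))
        at_risk -= n_tied
        i = j
    return steps


def ipc_weights(times, events):
    """Weights delta_i / G(t_i-): zero for censored cases, inflated for late events."""
    steps = km_censoring_survival(times, events)
    weights = []
    for t, d in zip(times, events):
        g_minus = 1.0  # value of G just before time t
        for s, g in steps:
            if s < t:
                g_minus = g
            else:
                break
        weights.append(d / g_minus if d else 0.0)
    return weights


example_weights = ipc_weights([1, 2, 3, 4], [1, 0, 1, 1])
```

In a weighted least squares fit, censored cases would then drop out and later uncensored cases would count for more, which is how such weights compensate for the observations lost to censoring.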

Statistical Inferences in the Weibull Regression Model based on Censored Data (중도절단(中途切斷)된 데이터를 이용한 와이블회귀모형(回歸模型)의 통계적(統計的) 추론(推論)에 관한 연구(硏究))

  • Cho, Kil-Ho
    • Journal of the Korean Data and Information Science Society
    • /
    • v.4
    • /
    • pp.13-30
    • /
    • 1993
  • We propose ordered least squares estimators (OLSEs) of the parameters and the p-th quantiles for the two-parameter Weibull regression model under Type II censoring. Monte Carlo simulations are performed to compare the proposed estimators with the maximum likelihood estimators (MLEs), and it is shown that the proposed estimators are slightly better than the MLEs as the censoring rate goes up.
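
For orientation, the p-th quantile of a two-parameter Weibull distribution has a closed form. The abstract does not state the paper's parametrization, so the symbols below (scale \eta, shape \beta, and a log-linear link with coefficients a and b in a covariate x) are assumptions for illustration only.

```latex
% Two-parameter Weibull distribution function and its p-th quantile
F(t) = 1 - \exp\!\left[-\left(\frac{t}{\eta}\right)^{\beta}\right],
\qquad
t_p = \eta\,\bigl[-\ln(1-p)\bigr]^{1/\beta}.

% Under the assumed log-linear link \ln \eta(x) = a + b x:
\ln t_p(x) = a + b x + \frac{1}{\beta}\,\ln\bigl[-\ln(1-p)\bigr].
```

Under this assumed link, the log-quantile is linear in x, which is what makes least squares type estimators a natural candidate.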

Asymptotics for Accelerated Life Test Models under Type II Censoring

  • Park, Byung-Gu;Yoon, Sang-Chul
    • Journal of the Korean Data and Information Science Society
    • /
    • v.7 no.2
    • /
    • pp.179-188
    • /
    • 1996
  • Accelerated life testing (ALT) of products quickly yields information on life. In this paper, we investigate the asymptotic normality of maximum likelihood (ML) estimators of the parameters of the ALT model under Type II censored data, using results of Bhattacharyya (1985). Further illustrations include the treatment of the asymptotics of the exponential and Weibull regression models.

Survival Function Estimation for the Proportional Hazards Regression Model

  • Cha, Young Joon
    • Journal of Korean Society for Quality Management
    • /
    • v.18 no.1
    • /
    • pp.9-20
    • /
    • 1990
  • The purpose of this paper is to propose modified semiparametric estimators of the survival function in Cox's regression model with randomly censored data, based on the Tsiatis and Breslow estimators, and to present estimates of their asymptotic variances. The proposed estimators are compared to the Tsiatis, Breslow, and Kaplan-Meier estimators through a small-sample Monte Carlo study. The simulation results show that the proposed estimators are preferred for small sample sizes.

A Study on the Efficiency and Its Determinants in Korea's Service Sectors Using DEA (자료포락분석(DEA)를 이용한 우리나라 서비스산업의 효율성과 결정요인 분석)

  • Bae, Se-Young
    • Journal of Digital Convergence
    • /
    • v.19 no.10
    • /
    • pp.339-348
    • /
    • 2021
  • This paper aims to analyze production efficiency in Korea's ten service sectors using DEA, and its determinants using a truncated-Tobit regression model and a censored-Tobit regression model, for 2010-2019. The paper found the following. First, the Korean service sector's production efficiency has in general been significantly low and polarized. In particular, the inefficiency in the 'sewerage waste management industry' resulted from scale inefficiency. Second, in the determinants analysis, the results show a positive effect of investment and R&D expenses on technical efficiency, while FDI and lobbying expenses show a negative impact. Moreover, it seems that the larger the industry, the higher the efficiency. Thus, the Korean government's future economic policy for the service sectors requires a mixed and integrated policy combining macroeconomic aspects, such as active investment and R&D activities, with microeconomic aspects, including a convergence of FDI and human capital.

Regression discontinuity for survival data

  • Youngjoo Cho
    • Communications for Statistical Applications and Methods
    • /
    • v.31 no.1
    • /
    • pp.155-178
    • /
    • 2024
  • Regression discontinuity (RD) design is one of the most widely used methods in causal inference for estimating a treatment effect when the treatment is determined by a cutpoint in the covariate of interest. RD design has received little attention for censored data, although it provides a very useful tool for analyzing treatment effects in that setting. In this paper, we define the causal effect for the survival function in RD design when the treatment is assigned deterministically by the covariate of interest. We propose estimators of this causal effect for survival data by using a transformation, which leads to an unbiased estimator of the survival function with local linear regression. Simulation studies show the validity of our approach. We also illustrate our proposed method using the prostate, lung, colorectal and ovarian (PLCO) dataset.

A Comparison of Analysis Methods for Work Environment Measurement Databases Including Left-censored Data (불검출 자료를 포함한 작업환경측정 자료의 분석 방법 비교)

  • Park, Ju-Hyun;Choi, Sangjun;Koh, Dong-Hee;Park, Donguk;Sung, Yeji
    • Journal of Korean Society of Occupational and Environmental Hygiene
    • /
    • v.32 no.1
    • /
    • pp.21-30
    • /
    • 2022
  • Objectives: The purpose of this study is to suggest an optimal method by comparing analysis methods for work environment measurement datasets that include left-censored data, where one or more measurements are below the limit of detection (LOD). Methods: A computer program was used to generate left-censored datasets for various combinations of censoring rate (1% to 90%) and sample size (30 to 300). For the analysis of the censored data, the simple substitution method (LOD/2), the β-substitution method, the maximum likelihood estimation (MLE) method, the Bayesian method, and regression on order statistics (ROS) were compared. Each method was used to estimate four parameters of the log-normal distribution for the censored dataset: (1) geometric mean (GM), (2) geometric standard deviation (GSD), (3) 95th percentile (X95), and (4) arithmetic mean (AM). The performance of each method was evaluated using relative bias and relative root mean squared error (rMSE). Results: For the largest sample size (n=300), when the censoring rate was less than 40%, the relative bias and rMSE were small for all five methods. When the censoring rate was large (70%, 90%), the simple substitution method was inappropriate because its relative bias was the largest, regardless of the sample size. When the sample size was small and the censoring rate was large, the Bayesian method, the β-substitution method, and the MLE method showed the smallest relative bias. Conclusions: The accuracy and precision of all methods tended to increase as the sample size increased and the censoring rate decreased. The simple substitution method was inappropriate when the censoring rate was high, whereas the β-substitution method, MLE method, and Bayesian method can be widely applied.
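
The simple substitution method named in the abstract can be sketched as follows: each non-detect is replaced by LOD/2, and the four log-normal parameters are then computed from the log-transformed values. This is a minimal illustration rather than the authors' program; representing non-detects as `None` and the function names `substitute_half_lod` and `lognormal_summary` are assumptions, and the formulas used for X95 and AM are the standard log-normal ones.

```python
import math
import statistics


def substitute_half_lod(values, lod):
    """Replace non-detects (represented here as None, an assumption) by LOD/2."""
    return [lod / 2.0 if v is None else v for v in values]


def lognormal_summary(samples):
    """Estimate GM, GSD, X95 and AM assuming the samples are log-normal."""
    logs = [math.log(x) for x in samples]
    mu = statistics.fmean(logs)      # mean of the log-values
    sigma = statistics.stdev(logs)   # sample standard deviation of the log-values
    z95 = statistics.NormalDist().inv_cdf(0.95)  # standard normal 95th percentile
    return {
        "GM": math.exp(mu),
        "GSD": math.exp(sigma),
        "X95": math.exp(mu + z95 * sigma),
        "AM": math.exp(mu + sigma ** 2 / 2.0),
    }


# One measurement below an LOD of 1.0, two detected values.
filled = substitute_half_lod([None, 2.0, 8.0], lod=1.0)
summary = lognormal_summary(filled)
```

Because every non-detect is mapped to the same value, a high censoring rate piles many observations onto a single point, which distorts the estimated spread; this is consistent with the abstract's finding that the method is inappropriate at high censoring rates.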

Prediction of the Probability of Customer Attrition by Using Cox Regression

  • Kang, Hyuncheol;Han, Sang-Tae
    • Communications for Statistical Applications and Methods
    • /
    • v.11 no.2
    • /
    • pp.227-233
    • /
    • 2004
  • This paper presents our work on constructing a model intended to predict the probability of attrition at specified points in time among customers of an insurance company. Building a data-based model is difficult because the data set may contain censored observations. To avoid this kind of problem, we performed logistic regression over specified time intervals, using explanatory variables to construct the proposed model. Then, we developed a Cox-type regression model for estimating the probability of attrition over a specified period of time, using time-dependent explanatory variables whose values may change over the course of the observations.

Mean Lifetime Estimation with Censored Observations

  • Kim, Jin-Heum;Kim, Jee-Hoon
    • Journal of the Korean Statistical Society
    • /
    • v.26 no.3
    • /
    • pp.299-308
    • /
    • 1997
  • In the simple linear regression model $Y = \alpha_0 + \beta_0 Z + \epsilon$ under the right censorship of the response variables, the estimation of the mean lifetime E(Y) is an interesting problem. In this paper we propose a method of estimating E(Y) based on the observations modified by the arguments of Buckley and James (1979). It is shown that the proposed estimator is consistent and our proposed procedure in the simple linear regression case can be naturally extended to the multiple linear regression. Finally, we perform simulation studies to compare the proposed estimator with the estimator introduced by Gill (1983).

A modified partial least squares regression for the analysis of gene expression data with survival information

  • Lee, So-Yoon;Huh, Myung-Hoe;Park, Mira
    • Journal of the Korean Data and Information Science Society
    • /
    • v.25 no.5
    • /
    • pp.1151-1160
    • /
    • 2014
  • In DNA microarray studies, the number of genes far exceeds the number of samples and the gene expression measures are highly correlated. Partial least squares regression (PLSR) is a popular method for dimension reduction, and several studies have found it useful for classifying microarray data. In this study, we suggest a modified version of partial least squares regression to analyze gene expression data with survival information. The method is designed as a new gene selection method using PLSR with an iterative procedure for imputing censored survival times. The mean square error of prediction criterion is used to determine the dimension of the model. To visualize the data, a plot of the variables superimposed with the samples is used. The method is applied to two microarray data sets, both containing survival times. The results show that the proposed method works well for interpreting gene expression microarray data.