• Title/Summary/Keyword: covariate

Comparative Study on Statistical Packages for Analyzing Logistic Regression - MINITAB, SAS, SPSS, STATA -

  • Kim, Soon-Kwi;Jeong, Dong-Bin;Park, Young-Sool
    • Journal of the Korean Data and Information Science Society / v.15 no.2 / pp.367-378 / 2004
  • Logistic regression has recently become popular in a variety of fields, and a number of statistical packages have been developed for fitting it. This paper briefly reviews the types of logistic regression models that are used for different types of data. In addition, it investigates the scope and characteristics of four statistical packages (MINITAB, SAS, SPSS, and STATA) when each is used to apply logistic regression models in practice.

Financial performance analysis of guaranteed firms using propensity scores (성향점수를 활용한 보증기업의 재무성과 분석)

  • Nam, Joo-Ha;Kim, Jung-Ryol;Noh, Maengseok
    • The Korean Journal of Applied Statistics / v.29 no.2 / pp.389-398 / 2016
  • In this paper, we examine the financial performance of credit guarantee programs. We compare the financial performance of firms guaranteed by KODIT with that of non-guaranteed firms. A covariate-adjusted propensity score method is used because a selection bias problem could arise if a t-test or regression analysis were used. The results show that a credit guarantee program enhances the financial performance of beneficiary firms.

A Flexible Modeling Approach for Current Status Survival Data via Pseudo-Observations

  • Han, Seungbong;Andrei, Adin-Cristian;Tsui, Kam-Wah
    • The Korean Journal of Applied Statistics / v.25 no.6 / pp.947-958 / 2012
  • When modeling event times in biomedical studies, the outcome might be incompletely observed. In this paper, we assume that the outcome is recorded as current status failure time data. Despite a well-developed literature, the routine practical use of many current status data modeling methods remains infrequent because of the lack of specialized statistical software, the difficulty of assessing model goodness-of-fit, and the possible loss of information caused by covariate grouping or discretization. We propose a model based on pseudo-observations that is convenient to implement and that allows for flexibility in the choice of the outcome. Parameter estimates are obtained from generalized estimating equations. Examples from studies in bile duct hyperplasia and breast cancer, in conjunction with simulated data, illustrate the practical advantages of this model.

Bezier curve smoothing of cumulative hazard function estimators

  • Cha, Yongseb;Kim, Choongrak
    • Communications for Statistical Applications and Methods / v.23 no.3 / pp.189-201 / 2016
  • In survival analysis, the Nelson-Aalen estimator and the Peterson estimator are often used to estimate a cumulative hazard function from randomly right-censored data. In this paper, we suggest smoothed versions of these cumulative hazard function estimators using a Bezier curve. Numerical studies compare them with existing estimators, including a kernel-smoothed version of the Nelson-Aalen estimator and the Peterson estimator, in terms of mean integrated squared error, and show that the proposed estimators perform better than the existing ones. Further, we apply our method to Cox regression, where covariates are used as predictors, and suggest a survival function estimator at a given covariate value.
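
For readers unfamiliar with the starting point of this abstract, the following is a minimal sketch of the unsmoothed Nelson-Aalen estimator for right-censored data, which sums d_i / n_i (failures over units at risk) across the distinct failure times. It does not implement the Bezier curve smoothing proposed in the paper; the function name `nelson_aalen` and the toy data are illustrative assumptions.

```python
def nelson_aalen(times, events):
    """Nelson-Aalen cumulative hazard for right-censored data.

    times  : observed times (failure or censoring)
    events : 1 if the time is an observed failure, 0 if censored
    Returns a list of (t, H(t)) pairs, one per distinct failure time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    data = sorted(zip(times, events))
    n_at_risk = len(data)
    cum_hazard = 0.0
    steps = []

    i = 0
    while i < len(data):
        t = data[i][0]
        failures = 0
        removed = 0
        # Group all observations tied at time t.
        while i < len(data) and data[i][0] == t:
            failures += data[i][1]
            removed += 1
            i += 1
        if failures > 0:
            # Increment is (failures at t) / (units at risk just before t).
            cum_hazard += failures / n_at_risk
            steps.append((t, cum_hazard))
        # Failures and censored units both leave the risk set after t.
        n_at_risk -= removed
    return steps


# Hypothetical toy data: five units, two of them censored.
toy_steps = nelson_aalen([1, 2, 2, 3, 4], [1, 1, 0, 0, 1])
```

On the toy data the estimate is a step function that jumps at times 1, 2, and 4, reaching approximately 0.2, 0.45, and 1.45. A smoothing method such as the one in the abstract would replace these steps with a continuous curve.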

Dynamic linear mixed models with ARMA covariance matrix

  • Han, Eun-Jeong;Lee, Keunbaik
    • Communications for Statistical Applications and Methods / v.23 no.6 / pp.575-585 / 2016
  • Longitudinal studies repeatedly measure outcomes over time. Therefore, repeated measurements from the same subject are serially correlated (within-subject variation), and there is also variation between subjects (between-subject variation). The serial correlation and the between-subject variation must be taken into account to make proper inference on covariate effects (Diggle et al., 2002). However, estimation of the covariance matrix is challenging because it has many parameters and must be positive definite. To overcome these limitations, we propose an autoregressive moving average Cholesky decomposition (ARMACD) for linear mixed models. The ARMACD allows a class of flexible, nonstationary, and heteroscedastic models that exploits the structure obtained by combining AR and MA modeling of the random effects covariance matrix. We analyze a real dataset to illustrate our proposed methods.

Regression analysis of interval censored competing risk data using a pseudo-value approach

  • Kim, Sooyeon;Kim, Yang-Jin
    • Communications for Statistical Applications and Methods / v.23 no.6 / pp.555-562 / 2016
  • Interval censored data often occur in an observational study where the subject is followed periodically. Instead of an exact failure time, only two inspection times that bracket it are available. There are several methods for analyzing interval censored failure time data (Sun, 2006). However, in the presence of competing risks, few methods have been suggested for estimating covariate effects from interval censored competing risk data. A sub-distribution hazard model is a commonly used regression model because it has a one-to-one correspondence with a cumulative incidence function. Alternatively, Klein and Andersen (2005) proposed a pseudo-value approach that directly uses the cumulative incidence function. In this paper, we consider an extension of the pseudo-value approach to interval censored data in order to estimate regression coefficients. The pseudo-values generated from the estimated cumulative incidence function then become response variables in a generalized estimating equation. Simulation studies show that the suggested method performs well in several situations, and an HIV-AIDS cohort study is analyzed as a real data example.
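
The pseudo-value idea mentioned in this abstract can be sketched generically: for an estimator computed on n subjects, the pseudo-value for subject i is n times the full-sample estimate minus (n - 1) times the estimate with subject i left out. The sketch below assumes a scalar-valued estimator and uses the sample mean as a stand-in; it is not the interval censored cumulative incidence estimator used in the paper, and the names `pseudo_values` and `sample_mean` are illustrative.

```python
def pseudo_values(data, estimator):
    """Jackknife-style pseudo-values for a scalar estimator.

    data      : list of per-subject observations
    estimator : function mapping a list of observations to a number
    Returns one pseudo-value per subject:
        n * theta_hat - (n - 1) * theta_hat_without_i
    """
    n = len(data)
    if n < 2:
        raise ValueError("at least two subjects are needed")
    full_estimate = estimator(data)
    values = []
    for i in range(n):
        # Re-estimate with subject i removed.
        leave_one_out = estimator(data[:i] + data[i + 1:])
        values.append(n * full_estimate - (n - 1) * leave_one_out)
    return values


def sample_mean(observations):
    """Stand-in estimator used only for illustration."""
    return sum(observations) / len(observations)


# For the sample mean, each pseudo-value reproduces the subject's own observation.
example_values = pseudo_values([1.0, 2.0, 6.0], sample_mean)
```

In a pseudo-value regression, these per-subject values would then serve as responses in a generalized estimating equation, as the abstract describes.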

On inference of multivariate means under ranked set sampling

  • Rochani, Haresh;Linder, Daniel F.;Samawi, Hani;Panchal, Viral
    • Communications for Statistical Applications and Methods / v.25 no.1 / pp.1-13 / 2018
  • In many studies, a researcher attempts to describe a population where units are measured for multiple outcomes, or responses. In this paper, we present an efficient procedure based on ranked set sampling to estimate and perform hypothesis testing on a multivariate mean. The method is based on ranking on an auxiliary covariate, which is assumed to be correlated with the multivariate response, in order to improve the efficiency of the estimation. We show that the proposed estimators developed under this sampling scheme are unbiased, have smaller variance in the multivariate sense, and are asymptotically Gaussian. We also demonstrate that the efficiency of the multivariate regression estimator can be improved by using ranked set sampling. A bootstrap routine is developed in the statistical software R to perform inference when the sample size is small. We use a simulation study to investigate the performance of the method under known conditions and apply the method to the biomarker data collected in the China Health and Nutrition Survey (CHNS 2009).
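
The sampling scheme described in this abstract can be sketched as follows: in each cycle, and for each rank r, draw a small set of units, order them by the auxiliary covariate, and measure the multivariate response only on the unit with rank r. This is a simplified illustration, not the authors' R routine. It assumes every set is drawn independently from the same finite population, and the names `ranked_set_sample` and `mean_vector` are hypothetical.

```python
import random


def ranked_set_sample(population, set_size, cycles, rng):
    """Draw a ranked set sample using an auxiliary covariate for ranking.

    population : list of (covariate, response) pairs, response is a tuple
    set_size   : number of units drawn per set (also the number of ranks)
    cycles     : number of times the full ranking scheme is repeated
    rng        : random.Random instance, for reproducibility
    Returns the list of measured responses (set_size * cycles of them).
    """
    measured = []
    for _ in range(cycles):
        for rank in range(set_size):
            # Draw one set and order it by the cheap auxiliary covariate.
            candidates = rng.sample(population, set_size)
            candidates.sort(key=lambda unit: unit[0])
            # Only the unit holding the current rank is actually measured.
            measured.append(candidates[rank][1])
    return measured


def mean_vector(responses):
    """Component-wise mean of a list of equal-length response tuples."""
    n = len(responses)
    dim = len(responses[0])
    return tuple(sum(r[j] for r in responses) / n for j in range(dim))


# Hypothetical population: covariate i, bivariate response (i, 2 * i).
toy_population = [(float(i), (float(i), 2.0 * i)) for i in range(100)]
toy_sample = ranked_set_sample(toy_population, set_size=3, cycles=4,
                               rng=random.Random(0))
toy_mean = mean_vector(toy_sample)
```

The stronger the correlation between the ranking covariate and the response, the more closely the measured units spread across the response distribution, which is the source of the efficiency gain the abstract refers to.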

Estimating the Mixture of Proportional Hazards Model with the Constant Baseline Hazards Function

  • Kim, Jong-woon;Eo, Seong-phil
    • Proceedings of the Korean Reliability Society Conference / 2005.06a / pp.265-269 / 2005
  • Cox's proportional hazards model (PHM) has been widely applied in the analysis of lifetime data. It is characterized by a baseline hazard function and by covariates that influence a system's lifetime, where the covariates describe operating environments (e.g., temperature, pressure, humidity). In this article, we consider a constant baseline hazard function and a covariate that is a discrete random variable. The estimation procedure is developed in a parametric framework for the case where both complete and incomplete data are present. The Expectation-Maximization (EM) algorithm is employed to handle the incomplete data problem. Simulation results are presented to illustrate the accuracy and some properties of the estimation results.

A SOFTWARE RELIABILITY ESTIMATION METHOD TO NUCLEAR SAFETY SOFTWARE

  • Park, Gee-Yong;Jang, Seung Cheol
    • Nuclear Engineering and Technology / v.46 no.1 / pp.55-62 / 2014
  • A method for estimating software reliability for nuclear safety software is proposed in this paper. This method is based on the software reliability growth model (SRGM), where the behavior of software failure is assumed to follow a non-homogeneous Poisson process. Two types of modeling schemes based on a particular underlying method are proposed in order to more precisely estimate and predict the number of software defects based on very rare software failure data. Bayesian statistical inference is employed to estimate the model parameters by incorporating software test cases as a covariate into the model. These models were found to be capable of reasonably estimating the remaining number of software defects that directly affect the reactor trip functions. The software reliability might be estimated from these modeling equations, and one approach to obtaining a software reliability value is proposed in this paper.

Nonparametric Estimation of Univariate Binary Regression Function

  • Jung, Shin Ae;Kang, Kee-Hoon
    • International Journal of Advanced Culture Technology / v.10 no.1 / pp.236-241 / 2022
  • We consider methods of estimating a binary regression function using nonparametric kernel estimation when there is only one covariate. For this, the Nadaraya-Watson estimation method with single and double bandwidths is used. To choose a proper amount of smoothing, the cross-validation and plug-in methods are compared. In the real-data case study, German credit data and heart disease data are used. We examine whether nonparametric estimation of the binary regression function is successful with the smoothing parameter chosen by these two approaches, and the performance is compared.
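
As an illustration of the estimator named in this abstract, the following is a minimal sketch of single-bandwidth Nadaraya-Watson estimation of P(Y = 1 | X = x) with one covariate, together with leave-one-out cross-validation for choosing the bandwidth. The Gaussian kernel, the squared-error criterion, and the function names are assumptions made for the example. The double-bandwidth variant and the plug-in selector from the paper are not shown.

```python
import math


def nw_binary_regression(x0, xs, ys, bandwidth):
    """Nadaraya-Watson estimate of P(Y = 1 | X = x0) with a Gaussian kernel.

    xs : list of covariate values
    ys : list of binary responses (0 or 1)
    """
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = [math.exp(-0.5 * ((x0 - x) / bandwidth) ** 2) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("kernel weights underflowed; use a larger bandwidth")
    # Weighted average of the 0/1 responses, so the result lies in [0, 1].
    return sum(w * y for w, y in zip(weights, ys)) / total


def loo_cv_score(xs, ys, bandwidth):
    """Leave-one-out mean squared prediction error for a given bandwidth."""
    n = len(xs)
    error = 0.0
    for i in range(n):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        predicted = nw_binary_regression(xs[i], rest_x, rest_y, bandwidth)
        error += (ys[i] - predicted) ** 2
    return error / n


def select_bandwidth(xs, ys, grid):
    """Pick the bandwidth on the grid with the smallest leave-one-out score."""
    return min(grid, key=lambda h: loo_cv_score(xs, ys, h))


# Hypothetical toy data: the response switches from 0 to 1 as x increases.
toy_x = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]
toy_y = [0, 0, 0, 1, 0, 1, 1, 1]
toy_h = select_bandwidth(toy_x, toy_y, [0.5, 1.0, 2.0])
toy_p = nw_binary_regression(1.75, toy_x, toy_y, toy_h)
```

A small bandwidth tracks the data closely and a large one averages over most of the sample, which is why the choice of smoothing parameter is the focus of the comparison in the abstract.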