• Title/Abstract/Keywords: covariate

Search results: 260 items (processing time: 0.023 s)

Comparative Study on Statistical Packages for Analyzing Logistic Regression - MINITAB, SAS, SPSS, STATA -

  • Kim, Soon-Kwi;Jeong, Dong-Bin;Park, Young-Sool
    • Journal of the Korean Data and Information Science Society / Vol. 15, No. 2 / pp. 367-378 / 2004
  • Logistic regression has recently become popular in a variety of fields, and a number of statistical packages have been developed for fitting it. This paper briefly considers the several types of logistic regression models that are used for different types of data. In addition, it investigates the scope and characteristics of four statistical packages (MINITAB, SAS, SPSS, and STATA) when each is used to apply logistic regression models to real-world problems.


Financial performance analysis of guaranteed firms using propensity scores

  • 남주하;김정렬;노맹석
    • The Korean Journal of Applied Statistics (응용통계연구) / Vol. 29, No. 2 / pp. 389-398 / 2016
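  • In this study, we compared firms that received guarantees from the Korea Credit Guarantee Fund with non-guaranteed firms in order to analyze the micro-level performance of the guaranteed firms. A comparison based on simple models such as the t-test or a regression model cannot reveal the actual effect of the guarantee because of selection bias. To address this problem, we proposed a regression model that corrects for selection bias and applied it to real data. The analysis confirmed the micro-level performance of guaranteed firms relative to non-guaranteed firms.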

A Flexible Modeling Approach for Current Status Survival Data via Pseudo-Observations

  • Han, Seungbong;Andrei, Adin-Cristian;Tsui, Kam-Wah
    • The Korean Journal of Applied Statistics (응용통계연구) / Vol. 25, No. 6 / pp. 947-958 / 2012
  • When modeling event times in biomedical studies, the outcome might be incompletely observed. In this paper, we assume that the outcome is recorded as current status failure time data. Despite a well-developed literature, the routine practical use of many current status data modeling methods remains infrequent because of the lack of specialized statistical software, the difficulty of assessing model goodness-of-fit, and the possible loss of information caused by covariate grouping or discretization. We propose a model based on pseudo-observations that is convenient to implement and that allows for flexibility in the choice of the outcome. Parameter estimates are obtained from generalized estimating equations. Examples from studies in bile duct hyperplasia and breast cancer, together with simulated data, illustrate the practical advantages of this model.

Bezier curve smoothing of cumulative hazard function estimators

  • Cha, Yongseb;Kim, Choongrak
    • Communications for Statistical Applications and Methods / Vol. 23, No. 3 / pp. 189-201 / 2016
  • In survival analysis, the Nelson-Aalen estimator and the Peterson estimator are often used to estimate the cumulative hazard function from randomly right-censored data. In this paper, we propose smoothed versions of these cumulative hazard function estimators using a Bezier curve. Numerical studies comparing them with existing estimators, including a kernel-smoothed version of the Nelson-Aalen estimator and the Peterson estimator, show that the proposed estimators perform better in terms of mean integrated squared error. Further, we apply our method to Cox regression, where covariates are used as predictors, and suggest a survival function estimate at a given covariate value.
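
The unsmoothed Nelson-Aalen estimator mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it computes only the raw step estimator (the sum of events divided by the number at risk at each event time) and does not attempt the Bezier smoothing the paper proposes. The function name and the toy data are assumptions made for the example.

```python
def nelson_aalen(times, events):
    """Return [(t, H(t))] at each distinct observed event time.

    times  : observed times (event or right-censoring time)
    events : 1 if the event was observed, 0 if the time is censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    event_times = sorted({t for t, d in zip(times, events) if d == 1})
    cumulative = 0.0
    steps = []
    for t in event_times:
        # Subjects still under observation just before time t.
        at_risk = sum(1 for s in times if s >= t)
        # Events that occur exactly at time t.
        deaths = sum(1 for s, d in zip(times, events) if s == t and d == 1)
        cumulative += deaths / at_risk
        steps.append((t, cumulative))
    return steps


# Hypothetical toy data: five subjects, two of them censored.
times = [1, 2, 2, 3, 4]
events = [1, 1, 0, 1, 0]
steps = nelson_aalen(times, events)
```

With these toy data the increments are 1/5, 1/4, and 1/2 at times 1, 2, and 3, so the estimate is a step function that a smoother would then replace with a continuous curve.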

Dynamic linear mixed models with ARMA covariance matrix

  • Han, Eun-Jeong;Lee, Keunbaik
    • Communications for Statistical Applications and Methods / Vol. 23, No. 6 / pp. 575-585 / 2016
  • Longitudinal studies repeatedly measure outcomes over time. Therefore, repeated measurements from the same subject are serially correlated (within-subject variation), and there is also variation between subjects (between-subject variation). The serial correlation and the between-subject variation must be taken into account to make proper inference on covariate effects (Diggle et al., 2002). However, estimation of the covariance matrix is challenging because it has many parameters and must be positive definite. To overcome these limitations, we propose an autoregressive moving average Cholesky decomposition (ARMACD) for linear mixed models. The ARMACD allows a class of flexible, nonstationary, and heteroscedastic models that exploits the structure obtained by combining AR and MA modeling of the random effects covariance matrix. We analyze a real dataset to illustrate the proposed methods.

Regression analysis of interval censored competing risk data using a pseudo-value approach

  • Kim, Sooyeon;Kim, Yang-Jin
    • Communications for Statistical Applications and Methods / Vol. 23, No. 6 / pp. 555-562 / 2016
  • Interval censored data often occur in observational studies where subjects are followed periodically. Instead of an exact failure time, only the two inspection times that bracket it are available. There are several methods for analyzing interval censored failure time data (Sun, 2006). However, in the presence of competing risks, few methods have been suggested for estimating covariate effects from interval censored competing risk data. A sub-distribution hazard model is a commonly used regression model because it has a one-to-one correspondence with the cumulative incidence function. Alternatively, Klein and Andersen (2005) proposed a pseudo-value approach that directly uses the cumulative incidence function. In this paper, we consider an extension of the pseudo-value approach to interval censored data in order to estimate regression coefficients. The pseudo-values generated from the estimated cumulative incidence function then become response variables in a generalized estimating equation. Simulation studies show that the suggested method performs well in several situations, and an HIV-AIDS cohort study is analyzed as a real data example.
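
The pseudo-value construction that this abstract relies on can be sketched with the usual leave-one-out (jackknife) formula, n·θ̂ − (n−1)·θ̂(−i). The sketch below is only an illustration under simplifying assumptions: the estimator passed in is a plain sample mean standing in for the cumulative incidence estimator used in the paper, and the names `pseudo_values` and `mean` are hypothetical.

```python
def pseudo_values(data, estimator):
    """Leave-one-out pseudo-values: n * theta_hat - (n - 1) * theta_hat_without_i."""
    n = len(data)
    if n < 2:
        raise ValueError("at least two observations are required")
    full = estimator(data)
    values = []
    for i in range(n):
        # Re-estimate with observation i removed.
        reduced = data[:i] + data[i + 1:]
        values.append(n * full - (n - 1) * estimator(reduced))
    return values


def mean(values):
    """Stand-in estimator; a real analysis would plug in a censored-data estimator."""
    return sum(values) / len(values)


data = [2.0, 4.0, 6.0, 8.0]
pv = pseudo_values(data, mean)
```

For the sample mean the pseudo-values reproduce the observations themselves, which is why they can serve as per-subject responses in a regression step; with a censored-data estimator they play the same role for subjects whose outcome is not fully observed.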

On inference of multivariate means under ranked set sampling

  • Rochani, Haresh;Linder, Daniel F.;Samawi, Hani;Panchal, Viral
    • Communications for Statistical Applications and Methods / Vol. 25, No. 1 / pp. 1-13 / 2018
  • In many studies, a researcher attempts to describe a population in which units are measured for multiple outcomes, or responses. In this paper, we present an efficient procedure based on ranked set sampling to estimate a multivariate mean and perform hypothesis tests on it. The method ranks units on an auxiliary covariate, which is assumed to be correlated with the multivariate response, in order to improve the efficiency of the estimation. We show that the proposed estimators developed under this sampling scheme are unbiased, have smaller variance in the multivariate sense, and are asymptotically Gaussian. We also demonstrate that the efficiency of the multivariate regression estimator can be improved by using ranked set sampling. A bootstrap routine is developed in the statistical software R to perform inference when the sample size is small. We use a simulation study to investigate the performance of the method under known conditions and apply the method to biomarker data collected in the China Health and Nutrition Survey (CHNS 2009).
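
The sampling scheme described in this abstract, ranking each small set on an auxiliary covariate and measuring only one unit per set, can be sketched as below. This is a hedged illustration of balanced ranked set sampling in general, not the authors' R routine; the synthetic population, the set size, and the function names are assumptions.

```python
import random


def ranked_set_sample(population, set_size, cycles, rng):
    """Balanced ranked set sample of responses.

    population : list of (auxiliary, response) pairs, response being a tuple
    set_size   : number of units drawn and ranked in each set
    cycles     : number of times the full ranking scheme is repeated
    """
    sample = []
    for _ in range(cycles):
        for rank in range(set_size):
            # Draw a fresh set, rank it on the cheap auxiliary covariate only.
            candidates = rng.sample(population, set_size)
            candidates.sort(key=lambda unit: unit[0])
            # Measure the response of the unit holding the current rank.
            sample.append(candidates[rank][1])
    return sample


def mean_vector(responses):
    """Componentwise mean of equally long response tuples."""
    n = len(responses)
    return tuple(sum(r[k] for r in responses) / n for k in range(len(responses[0])))


rng = random.Random(0)
population = []
for _ in range(500):
    x = rng.gauss(0.0, 1.0)  # auxiliary covariate
    response = (x + rng.gauss(0.0, 0.5), 2.0 - x + rng.gauss(0.0, 0.5))
    population.append((x, response))

sample = ranked_set_sample(population, set_size=3, cycles=4, rng=rng)
estimate = mean_vector(sample)
```

Only `set_size * cycles` responses are measured, although `set_size` times as many units are ranked; the efficiency gain claimed in the abstract comes from that ranking step when the auxiliary covariate is correlated with the response.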

Estimating the Mixture of Proportional Hazards Model with the Constant Baseline Hazards Function

  • Kim Jong-woon;Eo Seong-phil
    • Korean Reliability Society: Conference Proceedings (한국신뢰성학회:학술대회논문집) / Proceedings of the Korean Reliability Society 2005 Conference / pp. 265-269 / 2005
  • Cox's proportional hazards model (PHM) has been widely applied in the analysis of lifetime data. It is characterized by a baseline hazard function and by covariates that influence a system's lifetime, where the covariates describe operating environments (e.g., temperature, pressure, humidity). In this article, we consider a constant baseline hazard function and a covariate that is a discrete random variable. The estimation procedure is developed in a parametric framework for the case where both complete and incomplete data are available. The Expectation-Maximization (EM) algorithm is employed to handle the incomplete data problem. Simulation results are presented to illustrate the accuracy and some properties of the estimates.
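
Under the setting this abstract describes, a constant baseline hazard with a discrete covariate, the marginal survival function is a finite mixture of exponentials. The sketch below evaluates that mixture only; it does not implement the EM estimation, and the parameter names (`base_rate`, `beta`, `support`, `probs`) and numeric values are assumptions chosen for illustration.

```python
import math


def mixture_survival(t, base_rate, beta, support, probs):
    """S(t) = sum_k p_k * exp(-base_rate * exp(beta * z_k) * t).

    base_rate : constant baseline hazard
    beta      : regression coefficient of the covariate
    support   : possible covariate values z_k
    probs     : probabilities p_k of those values (must sum to 1)
    """
    if len(support) != len(probs):
        raise ValueError("support and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must sum to 1")
    return sum(
        p * math.exp(-base_rate * math.exp(beta * z) * t)
        for z, p in zip(support, probs)
    )


# Hypothetical two-level operating environment (z = 0 or z = 1).
s_at_zero = mixture_survival(0.0, 0.1, 0.5, [0, 1], [0.6, 0.4])
s_single = mixture_survival(2.0, 0.1, 0.5, [1], [1.0])
s_later = mixture_survival(5.0, 0.1, 0.5, [0, 1], [0.6, 0.4])
```

When the covariate is unobserved for some units, each such unit contributes this mixture rather than a single exponential term to the likelihood, which is the kind of incomplete data problem an EM algorithm is typically used for.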


A SOFTWARE RELIABILITY ESTIMATION METHOD TO NUCLEAR SAFETY SOFTWARE

  • Park, Gee-Yong;Jang, Seung Cheol
    • Nuclear Engineering and Technology / Vol. 46, No. 1 / pp. 55-62 / 2014
  • A method for estimating software reliability for nuclear safety software is proposed in this paper. This method is based on the software reliability growth model (SRGM), where the behavior of software failure is assumed to follow a non-homogeneous Poisson process. Two types of modeling schemes based on a particular underlying method are proposed in order to more precisely estimate and predict the number of software defects from very rare software failure data. Bayesian statistical inference is employed to estimate the model parameters by incorporating software test cases as a covariate into the model. These models were found to be capable of reasonably estimating the remaining number of software defects that directly affect the reactor trip functions. The software reliability might be estimated from these modeling equations, and one approach to obtaining a software reliability value is proposed in this paper.

Nonparametric Estimation of Univariate Binary Regression Function

  • Jung, Shin Ae;Kang, Kee-Hoon
    • International Journal of Advanced Culture Technology / Vol. 10, No. 1 / pp. 236-241 / 2022
  • We consider methods of estimating a binary regression function by nonparametric kernel estimation when there is only one covariate. For this, the Nadaraya-Watson estimation method with single and double bandwidths is used. To choose a proper amount of smoothing, the cross-validation and plug-in methods are compared. In the real data analysis, German credit data and heart disease data are used as case studies. We examine whether nonparametric estimation of the binary regression function is successful when the smoothing parameter is chosen by each of these two approaches, and we compare their performance.
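
The single-bandwidth Nadaraya-Watson estimator named in this abstract can be sketched as a kernel-weighted average of the binary responses. This is a minimal illustration assuming a Gaussian kernel and a hand-picked bandwidth; the double-bandwidth variant and the cross-validation and plug-in selectors discussed in the paper are not implemented, and the toy data are invented.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nw_binary_regression(x0, xs, ys, bandwidth):
    """Nadaraya-Watson estimate of P(Y = 1 | X = x0) with one bandwidth."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("no kernel mass at x0; try a larger bandwidth")
    # Weighted proportion of ones among nearby observations.
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical toy data: the response switches from 0 to 1 as x grows.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 0, 1, 1, 1]
p_low = nw_binary_regression(0.5, xs, ys, bandwidth=1.0)
p_mid = nw_binary_regression(2.5, xs, ys, bandwidth=1.0)
p_high = nw_binary_regression(4.5, xs, ys, bandwidth=1.0)
```

Because the weights are nonnegative and the responses are 0 or 1, the estimate always lies in [0, 1], which is one reason this estimator is a natural fit for a binary regression function.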