• Title/Summary/Keyword: Binary logistic regression


A Study on the Power Comparison between Logistic Regression and Offset Poisson Regression for Binary Data

  • Kim, Dae-Youb;Park, Heung-Sun
    • Communications for Statistical Applications and Methods / v.19 no.4 / pp.537-546 / 2012
  • In this paper, Poisson regression with offset and logistic regression are compared via simulation with respect to power for analyzing binary data. The Poisson distribution can be used as an approximation of the binomial distribution when n is large and p is small; we investigate whether the same conditions carry over to the power of significance tests in logistic regression and offset Poisson regression. The result is that, when the offset size is large, offset Poisson regression has power similar to that of logistic regression for rare events, and it retains acceptable power even at a moderate prevalence rate. However, with a small offset size (< 10), offset Poisson regression should be used with caution whether events are rare or common. These results can serve as guidelines for users who want to apply offset Poisson regression models to binary data.
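
The approximation this abstract starts from can be checked numerically. The sketch below is not the authors' simulation; it only compares the Binomial(n, p) and Poisson(np) probability mass functions for two hypothetical settings with the same mean, one with small n and large p and one with large n and small p. The function names and the chosen values of n and p are illustrative assumptions.

```python
import math


def binomial_pmf(k, n, p):
    """Binomial(n, p) pmf at k, computed in log space to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def poisson_pmf(k, lam):
    """Poisson(lam) pmf at k, computed in log space to avoid overflow."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def max_abs_gap(n, p):
    """Largest pointwise gap between Binomial(n, p) and Poisson(n * p)."""
    lam = n * p
    return max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
               for k in range(n + 1))


# Hypothetical settings, both with mean n * p = 3.
gap_small_n = max_abs_gap(10, 0.3)       # small n, large p
gap_large_n = max_abs_gap(1000, 0.003)   # large n, small p
print(f"n=10,   p=0.3  : max gap = {gap_small_n:.5f}")
print(f"n=1000, p=0.003: max gap = {gap_large_n:.5f}")
```

The second gap comes out much smaller than the first, which is the "n large, p small" condition at work. Whether the same condition governs the power of the tests is the question the paper studies, and this sketch does not address it.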

Analyzing Survival Data as Binary Outcomes with Logistic Regression

  • Lim, Jo-Han;Lee, Kyeong-Eun;Hahn, Kyu-S.;Park, Kun-Woo
    • Communications for Statistical Applications and Methods / v.17 no.1 / pp.117-126 / 2010
  • Clinical researchers often analyze survival data as binary outcomes using the logistic regression method. This paper examines the information loss resulting from analyzing survival time as binary outcomes. We first demonstrate that, under the proportional hazards assumption, this binary discretization does result in a significant information loss. Second, when fitting a logistic model to survival time data, researchers inadvertently use the maximal statistic. We implement a numerical study to examine the properties of the reference distribution for this statistic. Finally, we show that the logistic regression method can still be a useful tool for analyzing survival data, in particular when the proportional hazards assumption is questionable.

Binary Forecast of Heavy Snow Using Statistical Models

  • Sohn, Keon-Tae
    • Communications for Statistical Applications and Methods / v.13 no.2 / pp.369-378 / 2006
  • This study focuses on the binary forecast of the occurrence of heavy snow in the Honam area based on the MOS (model output statistics) method. The study uses the daily amount of snow cover at 17 stations during the cold season (November to March) from 2001 to 2005 and the corresponding 45 RDAPS outputs. A logistic regression model and neural networks are applied to predict the probability of occurrence of heavy snow. Based on the distribution of estimated probabilities, optimal thresholds are determined via the true skill score. According to the results of the comparison, the logistic regression model is recommended.
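
The threshold step in this abstract can be illustrated with a small sketch. It assumes the score is the usual true skill statistic (hit rate minus false alarm rate); the abstract does not spell out the formula. The probabilities, outcomes, and candidate thresholds below are made-up toy values, not the paper's data.

```python
def true_skill_score(probs, outcomes, threshold):
    """Hit rate minus false alarm rate for the forecast 'prob >= threshold'."""
    tp = fn = fp = tn = 0
    for prob, occurred in zip(probs, outcomes):
        forecast = prob >= threshold
        if occurred and forecast:
            tp += 1
        elif occurred:
            fn += 1
        elif forecast:
            fp += 1
        else:
            tn += 1
    hit_rate = tp / (tp + fn) if (tp + fn) else 0.0
    false_alarm_rate = fp / (fp + tn) if (fp + tn) else 0.0
    return hit_rate - false_alarm_rate


def best_threshold(probs, outcomes, candidates):
    """Candidate threshold with the highest true skill score."""
    return max(candidates, key=lambda t: true_skill_score(probs, outcomes, t))


# Hypothetical estimated probabilities and observed events (1 = heavy snow).
PROBS = [0.05, 0.12, 0.18, 0.22, 0.28, 0.36, 0.52, 0.62, 0.81]
OUTCOMES = [0, 0, 0, 0, 0, 1, 0, 1, 1]
CANDIDATES = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]

for t in CANDIDATES:
    print(f"threshold {t:.1f}: TSS = {true_skill_score(PROBS, OUTCOMES, t):.3f}")
print("selected threshold:", best_threshold(PROBS, OUTCOMES, CANDIDATES))
```

With rare events, the threshold chosen this way is often well below 0.5, because the score weighs missed events and false alarms by their own base rates.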

An educational tool for binary logistic regression model using Excel VBA (엑셀 VBA를 이용한 이분형 로지스틱 회귀모형 교육도구 개발)

  • Park, Cheolyong;Choi, Hyun Seok
    • Journal of the Korean Data and Information Science Society / v.25 no.2 / pp.403-410 / 2014
  • Binary logistic regression analysis is a statistical technique that explains a binary response variable by quantitative or qualitative explanatory variables. In the binary logistic regression model, the probability that the response variable equals one of the two binary values, say 1, is explained as a transformation of a linear combination of explanatory variables. This is one of the big barriers that non-statisticians have to overcome in order to understand the model. In this study, an educational tool is developed using Excel VBA that explains the need for binary logistic regression analysis. More precisely, the tool explains the problems that arise when the probability of the response variable being equal to 1 is modeled as a linear combination of explanatory variables, and then shows how these problems can be solved through transformations of the linear combination.
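
The problem this abstract describes can be seen in a few lines. The sketch below is in Python rather than the paper's Excel VBA, and the coefficients are hypothetical. It shows that a probability modeled directly as a linear combination can leave the interval [0, 1], while the logistic transformation of the same linear combination cannot.

```python
import math


def linear_probability(x, b0, b1):
    """Probability modeled directly as a linear combination (can leave [0, 1])."""
    return b0 + b1 * x


def logistic_probability(x, b0, b1):
    """Probability obtained by applying the logistic function to b0 + b1 * x."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))


def logit(p):
    """Log-odds; the inverse of the logistic function."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients chosen only for illustration.
B0, B1 = 0.5, 0.2
for x in (-5, 0, 5):
    print(x,
          round(linear_probability(x, B0, B1), 3),
          round(logistic_probability(x, B0, B1), 3))
```

Applying `logit` to the logistic probability recovers the linear combination `b0 + b1 * x`, which is why the model is linear on the log-odds scale rather than on the probability scale.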

A Logistic Regression Analysis of Two-Way Binary Attribute Data (이원 이항 계수치 자료의 로지스틱 회귀 분석)

  • Ahn, Hae-Il
    • Journal of Korean Society of Industrial and Systems Engineering / v.35 no.3 / pp.118-128 / 2012
  • This paper addresses the problem of analyzing two-way binary attribute data with the logistic regression model, in search of a sound statistical methodology. It is demonstrated that analysis of variance (ANOVA) may not be adequate, especially when the proportion is very low or very high. The logistic transformation of proportion data can help but is not sound in the statistical sense. Meanwhile, adopting the generalized least squares (GLS) method requires considerable effort to estimate the variance-covariance matrix. On the other hand, the logistic regression methodology provides sound statistical means for estimating the related confidence intervals and testing the significance of model parameters. Based on simulated data, the efficiency of the estimates is confirmed in order to demonstrate the usefulness of the methodology.

Supervised Learning-Based Collaborative Filtering Using Market Basket Data for the Cold-Start Problem

  • Hwang, Wook-Yeon;Jun, Chi-Hyuck
    • Industrial Engineering and Management Systems / v.13 no.4 / pp.421-431 / 2014
  • The market basket data in the form of a binary user-item matrix or a binary item-user matrix can be modelled as a binary classification problem. The binary logistic regression approach tackles the binary classification problem, where principal components are predictor variables. If users or items are sparse in the training data, the binary classification problem can be considered as a cold-start problem. The binary logistic regression approach may not function appropriately if the principal components are inefficient for the cold-start problem. Assuming that the market basket data can also be considered as a special regression problem whose response is either 0 or 1, we propose three supervised learning approaches: random forest regression, random forest classification, and elastic net to tackle the cold-start problem, comparing the performance in a variety of experimental settings. The experimental results show that the proposed supervised learning approaches outperform the conventional approaches.

Empirical Analysis on the Relationship between R&D Inputs and Performance Using Successive Binary Logistic Regression Models (연속적 이항 로지스틱 회귀모형을 이용한 R&D 투입 및 성과 관계에 대한 실증분석)

  • Park, Sungmin
    • Journal of Korean Institute of Industrial Engineers / v.40 no.3 / pp.342-357 / 2014
  • The present study analyzes the relationship between research and development (R&D) inputs and the performance of a national technology innovation R&D program using successive binary logistic regression models based on a typical R&D logic model. In particular, this study focuses on answering the following three main questions: (1) "To what extent do the R&D inputs have an effect on performance creation?"; (2) "Is an obvious relationship verified between the immediate predecessor and its successor performance?"; and (3) "Is there a difference in performance creation between R&D government subsidy recipient types and between R&D collaboration types?" Methodologically, binary logistic regression models are established successively, considering the "Success-Failure" binary nature of the performance creation data. An empirical analysis is presented on a sample of n = 2,178 completed R&D projects. This study's major findings are as follows. First, the R&D inputs have a statistically significant relationship only with the short-term, technical output, "Patent Registration." Second, strong dependencies are identified between the immediate predecessor and its successor performance. Third, the success probability of performance creation differs statistically significantly between the aforementioned R&D types. Specifically, compared with "Large Company", "Small and Medium-Sized Enterprise (SMS)" shows a greater success probability of "Sales" and "New Employment." Meanwhile, "R&D Collaboration" achieves a larger success probability of "Patent Registration" and "Sales."

Collapsibility and Suppression for Cumulative Logistic Model

  • Hong, Chong-Sun;Kim, Kil-Tae
    • Communications for Statistical Applications and Methods / v.12 no.2 / pp.313-322 / 2005
  • In this paper, we discuss suppression for the logistic regression model. Suppression for the linear regression model was defined as a relationship among sums of squares for regression as well as correlation coefficients of variables. Since it is not common to obtain a simple correlation coefficient for the binary response variable of a logistic model, we consider cumulative logistic models with multinomial and ordinal response variables rather than the usual logistic model. As the categories of the response variable in the cumulative logistic model are collapsed into a binary response, the suppression results for these logistic models are found to change. These suppression results for cumulative logistic models are discussed and compared with those of the linear model.
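
The abstract does not state the form of its cumulative logistic model. As a rough illustration of what collapsing categories into a binary response means, the sketch below assumes the standard proportional-odds form P(Y <= j | x) = sigmoid(theta_j - beta * x); the thresholds and slope are hypothetical. It does not reproduce the paper's suppression analysis.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def category_probs(x, thresholds, beta):
    """Category probabilities under an assumed proportional-odds model."""
    cumulative = [sigmoid(t - beta * x) for t in thresholds] + [1.0]
    probs = [cumulative[0]]
    for j in range(1, len(cumulative)):
        probs.append(cumulative[j] - cumulative[j - 1])
    return probs


def collapsed_log_odds(x, thresholds, beta, cut):
    """Log-odds of Y <= cut after collapsing the ordinal response to binary."""
    p_low = sum(category_probs(x, thresholds, beta)[:cut])
    return math.log(p_low / (1.0 - p_low))


# Hypothetical four-category model: three thresholds and one slope.
THRESHOLDS = [-1.0, 0.5, 2.0]
BETA = 0.8

for cut in (1, 2, 3):
    change = (collapsed_log_odds(1.0, THRESHOLDS, BETA, cut)
              - collapsed_log_odds(0.0, THRESHOLDS, BETA, cut))
    print(f"collapse at category {cut}: log-odds change per unit x = {change:.3f}")
```

Under this assumed model every binary collapse is itself a logistic model with the same slope, so only the intercept depends on where the categories are cut.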

Blur Detection through Multinomial Logistic Regression based Adaptive Threshold

  • Mahmood, Muhammad Tariq;Siddiqui, Shahbaz Ahmed;Choi, Young Kyu
    • Journal of the Semiconductor & Display Technology / v.18 no.4 / pp.110-115 / 2019
  • Blur detection and segmentation play a vital role in many computer vision applications. Among various methods, local binary pattern based methods provide reasonable blur detection results. However, in conventional local binary pattern based methods, the blur map is computed using a fixed threshold irrespective of the type and level of blur. This may not be suitable for images with variations in imaging conditions and blur. In this paper we propose an effective method based on local binary patterns with an adaptive threshold for blur detection. The adaptive threshold is computed from a model learned through multinomial logistic regression. The performance of the proposed method is evaluated using different datasets. The comparative analysis not only demonstrates the effectiveness of the proposed method but also exhibits its superiority over the existing methods.

A Bayesian Method for Narrowing the Scope of Variable Selection in Binary Response Logistic Regression

  • Kim, Hea-Jung;Lee, Ae-Kyung
    • Journal of Korean Society for Quality Management / v.26 no.1 / pp.143-160 / 1998
  • This article is concerned with the selection of subsets of predictor variables to be included in building the binary response logistic regression model. It is based on a Bayesian approach, intended to propose and develop a procedure that uses probabilistic considerations for selecting promising subsets. This procedure reformulates the logistic regression setup as a hierarchical normal mixture model by introducing a set of hyperparameters that are used to identify subset choices. It does so by using the fact that the CDF of the logistic distribution is approximately equivalent to that of the $t_{(8)}$/.634 distribution. The appropriate posterior probability of each subset of predictor variables is obtained by the Gibbs sampler, which samples indirectly from the multinomial posterior distribution on the set of possible subset choices. Thus, in this procedure, the most promising subset of predictors can be identified as the one with the highest posterior probability. To highlight the merit of this procedure, a couple of illustrative numerical examples are given.
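
The distributional fact quoted in this abstract can be checked numerically. The sketch below compares the standard logistic CDF with the CDF of a $t_{(8)}$ variable divided by 0.634, that is, P(T <= 0.634 x). The Student-t CDF is computed by Simpson integration of the density, which is an implementation choice made here to stay within the standard library; the grid and step count are arbitrary assumptions. The hierarchical model and Gibbs sampler of the paper are not reproduced.

```python
import math


def logistic_cdf(x):
    return 1.0 / (1.0 + math.exp(-x))


def t_pdf(t, df):
    """Student-t density with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2.0 * math.log1p(t * t / df))


def t_cdf(t, df, steps=2000):
    """Student-t CDF via composite Simpson integration from 0 to |t|."""
    upper = abs(t)
    if upper == 0.0:
        return 0.5
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3.0
    return 0.5 + area if t > 0 else 0.5 - area


def scaled_t_cdf(x, df=8, scale=0.634):
    """CDF of T / scale evaluated at x, i.e. P(T <= scale * x)."""
    return t_cdf(scale * x, df)


GRID = [i / 10.0 for i in range(-60, 61)]
MAX_GAP = max(abs(logistic_cdf(x) - scaled_t_cdf(x)) for x in GRID)
print(f"largest CDF gap on [-6, 6]: {MAX_GAP:.4f}")
```

The two CDFs stay within a small fraction of a percentage point of each other over this grid. Because a t variable can be written as a scale mixture of normals, this closeness is what allows a logistic model to be recast as a normal mixture.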
