• Title/Summary/Keyword: Binary logistic regression


A Comparative Study of Predictive Factors for Hypertension using Logistic Regression Analysis and Decision Tree Analysis

  • SoHyun Kim;SungHyoun Cho
    • Physical Therapy Rehabilitation Science
    • /
    • v.12 no.2
    • /
    • pp.80-91
    • /
    • 2023
  • Objective: The purpose of this study is to identify factors that affect the incidence of hypertension using logistic regression and decision tree analysis, and to build and compare predictive models. Design: Secondary data analysis study. Methods: We analyzed 9,859 subjects from the Korea Health Panel 2019 annual data provided by the Korea Institute for Health and Social Affairs and the National Health Insurance Service. Frequency analysis, the chi-square test, binary logistic regression, and decision tree analysis were performed on the data. Results: In the logistic regression analysis, those who were 60 years of age or older (odds ratio, OR=68.801, p<0.001), those who were divorced, widowed, or separated (OR=1.377, p<0.001), those who graduated from middle school or lower (OR=1, reference), those who did not walk at all (OR=1, reference), those who were obese (OR=5.109, p<0.001), and those who had poor subjective health status (OR=2.163, p<0.001) were more likely to develop hypertension. In the decision tree, those who were over 60 years of age, overweight or obese, and graduated from middle school or lower had the highest probability of developing hypertension, at 83.3%. Logistic regression analysis showed a specificity of 85.3% and a sensitivity of 47.9%, while decision tree analysis showed a specificity of 81.9% and a sensitivity of 52.9%. The classification accuracy of logistic regression and decision tree analysis was 73.6% and 72.6%, respectively. Conclusions: Both logistic regression and decision tree analysis were adequate to explain the predictive model. It is thought that both analysis methods can be useful for constructing a predictive model for hypertension.
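
The quantities reported in this abstract (odds ratio, sensitivity, specificity, classification accuracy) follow from standard definitions. The sketch below shows those definitions only; the function names and the confusion-matrix counts are hypothetical and are not taken from the study.

```python
import math


def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # share of true cases that were detected
    specificity = tn / (tn + fp)   # share of non-cases correctly ruled out
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def odds_ratio(coefficient):
    """Convert a logistic regression coefficient (log-odds) into an odds ratio."""
    return math.exp(coefficient)


# Hypothetical counts chosen for illustration, not the study's data.
sens, spec, acc = classification_metrics(tp=48, fn=52, tn=170, fp=30)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")

# A reference category has coefficient 0, hence OR = 1.
print(odds_ratio(0.0))
```

A model can reach fairly high accuracy while missing many true cases, which is why sensitivity and specificity are reported separately from accuracy.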

Model assessment with residual plot in logistic regression (로지스틱회귀에서 잔차산점도를 이용한 모형평가)

  • Kahng, Myung Wook
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.1
    • /
    • pp.141-150
    • /
    • 2015
  • Graphical paradigms for assessing the adequacy of models in logistic regression are discussed. The residual plot has been widely used as a graphical tool for evaluating the adequacy of the model. However, this approach works well only for linear models with constant variance, and the alternative approach, the marginal model plot, has its defects as well. We suggest a Chi-residual plot that overcomes the potential shortcomings of the marginal model plot.

A Data Mining Procedure for Unbalanced Binary Classification (불균형 이분 데이터 분류분석을 위한 데이터마이닝 절차)

  • Jung, Han-Na;Lee, Jeong-Hwa;Jun, Chi-Hyuck
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.36 no.1
    • /
    • pp.13-21
    • /
    • 2010
  • Predicting customers' contract cancellations is essential for insurance companies, but it is a difficult problem because the customer database is large and the target (cancelled) customers make up only a small proportion of it. This paper proposes a new data mining approach to binary classification for handling large-scale unbalanced data. Over-sampling, clustering, regularized logistic regression, and boosting are incorporated in the proposed approach. The proposed approach was applied to a real data set from the insurance domain, and the results were compared with those of other classification techniques.
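
Two of the ingredients named in this abstract, over-sampling and regularized logistic regression, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' procedure: it uses plain random over-sampling and an L2 penalty fitted by batch gradient descent, omits the clustering and boosting steps, and runs on synthetic data. All function names and parameter values are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def oversample_minority(X, y, rng):
    """Randomly duplicate minority-class rows until both classes have equal size."""
    pos = [i for i, label in enumerate(y) if label == 1]
    neg = [i for i, label in enumerate(y) if label == 0]
    minority, majority = (pos, neg) if len(pos) < len(neg) else (neg, pos)
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    idx = pos + neg + extra
    return [X[i] for i in idx], [y[i] for i in idx]


def fit_logistic_l2(X, y, lam=0.01, lr=0.1, epochs=500):
    """Batch gradient descent on the mean log-loss plus an L2 penalty.

    The intercept b is left unpenalized.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gw / n + lam * wj) for wj, gw in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


rng = random.Random(0)
# Synthetic unbalanced data: 190 negatives near 0, 10 positives near 3.
X = [[rng.gauss(0.0, 1.0)] for _ in range(190)] + [[rng.gauss(3.0, 1.0)] for _ in range(10)]
y = [0] * 190 + [1] * 10

X_bal, y_bal = oversample_minority(X, y, rng)
w, b = fit_logistic_l2(X_bal, y_bal)
print(len(y_bal), sum(y_bal), w, b)
```

Over-sampling is applied to the training data only; any held-out evaluation set would normally keep its original class proportions.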

How Do South Korean People View the US and Chinese National Influence?: Is Soft Power Zero-Sum?

  • Zhao, Xiaoyu
    • Asian Journal for Public Opinion Research
    • /
    • v.5 no.1
    • /
    • pp.15-40
    • /
    • 2017
  • This paper addresses whether soft power is zero-sum against the backdrop of the rise of China and the relative "decline" of America. It attempts to find out whether the "decline" of America's soft power is caused by the rise of China's soft power, and whether China's rise could guarantee with certainty the growth of its soft power. In light of the particularity of South Korea, namely that its economy relies on China and its security relies on the US, this paper chooses South Korea as the entry point for the study. Based on Pew data from a South Korean opinion poll, this paper conducts bivariate correlation and binary logistic regression analyses to explore the existence of zero-sum "competition" between China's and America's soft power.

Analysis of Factors Affecting Pedestrian Leg Injury Severity (보행자 다리상해 영향요인 분석)

  • Park, Jae-Hong;Oh, Cheol
    • Transactions of the Korean Society of Automotive Engineers
    • /
    • v.19 no.3
    • /
    • pp.9-15
    • /
    • 2011
  • This study analyzed contributing factors affecting leg injury severity in pedestrian-vehicle crashes. A Binary Logistic Regression (BLR) method was used to identify the factors. Independent variables include characteristics of the pedestrian, vehicle, road, and environmental conditions. Leg injury severity, the dependent variable in this study, is classified into two classes: 'severe' and 'minor' injuries. Pedestrian age, collision speed, and vehicle height were identified as significant factors for leg injury. The probabilistic outcome of predicting leg injury severity can be used effectively not only in deriving pedestrian-related safety policies but also in developing advanced vehicular technologies for pedestrian protection.

Exploring interaction using 3-D residual plots in logistic regression model (3차원 잔차산점도를 이용한 로지스틱회귀모형에서 교호작용의 탐색)

  • Kahng, Myung-Wook
    • Journal of the Korean Data and Information Science Society
    • /
    • v.25 no.1
    • /
    • pp.177-185
    • /
    • 2014
  • Under bivariate normal distribution assumptions, the interaction and quadratic terms are needed in the logistic regression model with two predictors. However, depending on the correlation coefficient and the variances of two conditional distributions, the interaction and quadratic terms may not be necessary. Although the need for these terms can be determined by comparing the two scatter plots, it is not as useful for interaction terms. We explore the structure and usefulness of the 3-D residual plot as a tool for dealing with interaction in logistic regression models. If predictors have an interaction effect, a 3-D residual plot can show the effect. This is illustrated by simulated and real data.

Log-density Ratio with Two Predictors in a Logistic Regression Model (로지스틱 회귀모형에서 이변량 정규분포에 근거한 로그-밀도비)

  • Kahng, Myung Wook;Yoon, Jae Eun
    • The Korean Journal of Applied Statistics
    • /
    • v.26 no.1
    • /
    • pp.141-149
    • /
    • 2013
  • We present methods for studying the log-density ratio that enables the selection of the predictors and the form to be included in the logistic regression model. Under bivariate normal distributional assumptions, we investigate the form of the log-density ratio as a function of two predictors. If two covariance matrices are equal, then the crossproduct and quadratic terms are not needed. If the variables are uncorrelated, we do not need the crossproduct terms, but we still need the linear and quadratic terms. We also explore other conditions in which the crossproduct and quadratic terms are not needed in the logistic regression model.
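
The conditions stated in this abstract can be read off the general form of the log-density ratio for two normal populations. The notation below ($\mu_k$, $\Sigma_k$ for the mean vector and covariance matrix of $x = (x_1, x_2)^{\top}$ in group $y = k$, and the constant $c$) is assumed here for illustration and is not taken from the paper.

$$
\log \frac{f_1(x)}{f_0(x)}
= c + \left(\Sigma_1^{-1}\mu_1 - \Sigma_0^{-1}\mu_0\right)^{\top} x
- \frac{1}{2}\, x^{\top}\left(\Sigma_1^{-1} - \Sigma_0^{-1}\right) x,
$$

$$
c = -\frac{1}{2}\log\frac{|\Sigma_1|}{|\Sigma_0|}
- \frac{1}{2}\mu_1^{\top}\Sigma_1^{-1}\mu_1
+ \frac{1}{2}\mu_0^{\top}\Sigma_0^{-1}\mu_0 .
$$

If $\Sigma_1 = \Sigma_0$, the last term vanishes and only linear terms remain. If the two variables are uncorrelated within each group, $\Sigma_1^{-1} - \Sigma_0^{-1}$ is diagonal, so the $x_1 x_2$ term drops out while the $x_1^2$ and $x_2^2$ terms generally remain.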

On a Bayes Criterion for the Goodness-of-Link Test for Binary Response Regression Models : Probit Link versus Logit Link

  • Kim, Hea-Jung
    • Journal of the Korean Statistical Society
    • /
    • v.26 no.2
    • /
    • pp.261-276
    • /
    • 1997
  • In the context of binary response regression, the problem of constructing a Bayesian goodness-of-link test for testing the logit link versus the probit link is considered. Based upon the well-known facts that the cdf of a logistic variate $\approx$ the cdf of $t_{8}/0.634$ and that, as $\nu \to \infty$, the cdf of $t_{\nu}$ approaches that of $N(0,1)$, a Bayes factor is derived as a test criterion. A synthesis of Gibbs sampling and a marginal likelihood estimation scheme is also proposed to compute the Bayes factor. The performance of the test is investigated via a Monte Carlo study. The new test is also illustrated with an empirical data example.
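
One quick way to see that the scaling constant 0.634 is plausible is to compare variances. This is only a sanity check using standard moments, not the derivation used in the paper.

$$
\operatorname{Var}(\text{logistic}) = \frac{\pi^{2}}{3} \approx 3.290,
\qquad
\operatorname{Var}\!\left(\frac{t_{8}}{0.634}\right)
= \frac{1}{0.634^{2}} \cdot \frac{8}{8-2} \approx 3.317 .
$$

The two variances agree to within about one percent, consistent with the scaled $t_{8}$ distribution closely tracking the standard logistic distribution.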

A Proposal of the Evaluation Method for Rock Slope Stability Using Logistic Regression Analysis (로지스틱 회귀분석을 통한 암반사면의 안정성 평가법 제안)

  • 이용희;김종열
    • Tunnel and Underground Space
    • /
    • v.14 no.2
    • /
    • pp.133-141
    • /
    • 2004
  • Through many site investigations, different methods for evaluating the stability of rock slopes have been proposed. Those methods, however, may lead to different results depending on the subjective judgments involved in selecting the evaluation items and applying the weighting factors. Accordingly, binary logistic regression analysis was carried out to ensure fair application of the weighting factors, leading to an equation for evaluating the stability of rock slopes.

Prediction Model with a Logistic Regression of Sequencing Two Arrival Flows (합류하는 두 항공기간 도착순서 결정에 대한 로지스틱회귀 예측 모형)

  • Jung, Soyeon;Lee, Keumjin
    • Journal of the Korean Society for Aviation and Aeronautics
    • /
    • v.23 no.4
    • /
    • pp.42-48
    • /
    • 2015
  • This paper aims to construct a prediction model of the arrival sequencing strategy that reflects the actual sequencing patterns of air traffic controllers. As the first step, we analyzed the pair-wise sequencing of two aircraft entering the TMA from different entry points. Based on historical trajectory data, several traffic factors such as time, speed, and traffic density were examined for the model. Using the statistically significant factors, we constructed a prediction model of arrival sequencing through binary logistic regression analysis. With the estimated coefficients, the performance of the model was evaluated through cross-validation.