• Title/Summary/Keyword: Logistic regression analysis


Steal Success Model for 2007 Korean Professional Baseball Games (2007년 한국프로야구에서 도루성공모형)

  • Hong, Chong-Sun;Choi, Jeong-Min
    • The Korean Journal of Applied Statistics / v.21 no.3 / pp.455-468 / 2008
  • Extensive baseball game records show that the steal plays an important role in the outcome of games. To study the success or failure of steals, logistic regression models are developed from 2007 Korean professional baseball game data. The results of the logistic regression models are compared with those of discriminant models. The logistic regression analysis is found to perform better than the discriminant analysis. We also consider an alternative logistic regression model based on categorical data transformed from continuous data that are difficult to obtain.

Performance Comparison of Mahalanobis-Taguchi System and Logistic Regression : A Case Study (마할라노비스-다구치 시스템과 로지스틱 회귀의 성능비교 : 사례연구)

  • Lee, Seung-Hoon;Lim, Geun
    • Journal of Korean Institute of Industrial Engineers / v.39 no.5 / pp.393-402 / 2013
  • The Mahalanobis-Taguchi System (MTS) is a diagnostic and predictive method for multivariate data. In the MTS, the Mahalanobis space (MS) of a reference group is obtained using the standardized variables of normal data. The Mahalanobis space can be used for multi-class classification. Once this MS is established, the useful set of variables is identified using orthogonal arrays and signal-to-noise ratios to assist in model analysis or diagnosis. Several other techniques are already used for classification, such as linear discriminant analysis, logistic regression, decision trees, and neural networks. The goal of this case study is to compare the performance of the Mahalanobis-Taguchi System and logistic regression on a data set.

An Introduction to Logistic Regression: From Basic Concepts to Interpretation with Particular Attention to Nursing Domain

  • Park, Hyeoun-Ae
    • Journal of Korean Academy of Nursing / v.43 no.2 / pp.154-164 / 2013
  • Purpose: The purpose of this article is twofold: 1) to introduce logistic regression (LR), a multivariable method for modeling the relationship between multiple independent variables and a categorical dependent variable, and 2) to examine the use and reporting of LR in the nursing literature. Methods: Textbooks on LR and research articles employing LR as the main statistical analysis were reviewed. Twenty-three articles published between 2010 and 2011 in the Journal of Korean Academy of Nursing were analyzed for proper use and reporting of LR models. Results: Logistic regression was presented from basic concepts (odds, odds ratio, logit transformation, and logistic curve) through assumptions, fitting, reporting, and interpretation to cautions. Substantial shortcomings were found in both the use of LR and the reporting of results. In many studies, the sample size was not sufficiently large, calling into question the accuracy of the regression model. Additionally, only one study reported a validation analysis. Conclusion: Nursing researchers need to pay greater attention to guidelines concerning the use and reporting of LR models.
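The basic concepts named in this abstract (odds, odds ratio, logit transformation, and logistic curve) can be sketched in a few lines. This is a minimal illustration using the standard textbook definitions, not code from the article; the function names and the probabilities 0.40 and 0.25 are made up for the example.

```python
import math


def odds(p):
    """Odds of an event with probability p, where 0 < p < 1."""
    return p / (1.0 - p)


def logit(p):
    """Logit transformation: the natural log of the odds."""
    return math.log(odds(p))


def logistic(z):
    """Logistic curve: the inverse of the logit, mapping any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def odds_ratio(p1, p0):
    """Ratio of the odds in one group (p1) to the odds in a reference group (p0)."""
    return odds(p1) / odds(p0)


if __name__ == "__main__":
    # Hypothetical event probabilities for two groups (illustrative only).
    p_group_a, p_group_b = 0.40, 0.25
    print("odds A:", odds(p_group_a))
    print("odds B:", odds(p_group_b))
    print("odds ratio A vs B:", odds_ratio(p_group_a, p_group_b))
    print("logit then logistic recovers p:", logistic(logit(p_group_a)))
```

An odds ratio above 1 means the event is more likely, in odds terms, in the first group than in the reference group; a value below 1 means it is less likely.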

A Logistic Regression Analysis of Two-Way Binary Attribute Data (이원 이항 계수치 자료의 로지스틱 회귀 분석)

  • Ahn, Hae-Il
    • Journal of Korean Society of Industrial and Systems Engineering / v.35 no.3 / pp.118-128 / 2012
  • This paper addresses the problem of analyzing two-way binary attribute data with the logistic regression model, in order to find a sound statistical methodology. It is demonstrated that the analysis of variance (ANOVA) may not be adequate, especially when the proportion is very low or very high. The logistic transformation of proportion data can help, but it is not sound in the statistical sense. Meanwhile, adopting the generalized least squares (GLS) method requires considerable effort to estimate the variance-covariance matrix. In contrast, the logistic regression methodology provides sound statistical means of estimating the related confidence intervals and testing the significance of model parameters. The efficiencies of the estimates are verified on simulated data to demonstrate the usefulness of the methodology.

Power Failure Sensitivity Analysis via Grouped L1/2 Sparsity Constrained Logistic Regression

  • Li, Baoshu;Zhou, Xin;Dong, Ping
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.8 / pp.3086-3101 / 2021
  • To provide precise marketing and differentiated service in the electric power service department, it is very important to identify customers who are highly sensitive to power failures. To solve this problem, we propose a novel grouped 𝑙1/2 sparsity constrained logistic regression method for assessing sensitivity to electric power failure. Unlike the 𝑙1 norm and the k-support norm, the proposed method simultaneously incorporates inter-class information and a tighter approximation to the nonconvex 𝑙0 sparsity, exploiting multiple correlated attributes for prediction. First, the attributes or factors for predicting customer sensitivity to power failure are selected from customer sheets, such as customer information, electricity consumption information, electricity bills, 95598 work sheets, power failure events, etc. Second, the samples with these attributes are clustered into several categories, and samples in the same category are assumed to share similar properties. Then, an 𝑙1/2 norm constrained logistic regression model is built to predict each customer's sensitivity to power failure. Finally, the alternating direction method of multipliers (ADMM) is employed to solve the problem efficiently by splitting it into several sub-problems. Experimental results on an electric power dataset with about one million customer records from one province show that the proposed method achieves good prediction accuracy.

Machine learning-based Predictive Model of Suicidal Thoughts among Korean Adolescents (머신러닝 기반 한국 청소년의 자살 생각 예측 모델)

  • YeaJu JIN;HyunKi KIM
    • Journal of Korea Artificial Intelligence Association / v.1 no.1 / pp.1-6 / 2023
  • This study developed models using decision forest, support vector machine, and logistic regression methods to predict and prevent suicidal ideation among Korean adolescents. The study sample consisted of 51,407 individuals after removing missing data from the raw data of the 18th (2022) Youth Health Behavior Survey conducted by the Korea Centers for Disease Control and Prevention. Analysis was performed using the MS Azure program with Two-Class Decision Forest, Two-Class Support Vector Machine, and Two-Class Logistic Regression. The decision forest model achieved an accuracy of 84.8% and an F1-score of 36.7%. The support vector machine model achieved an accuracy of 86.3% and an F1-score of 24.5%. The logistic regression model achieved an accuracy of 87.2% and an F1-score of 40.1%. Applying the logistic regression model with SMOTE to address data imbalance resulted in an accuracy of 81.7% and an F1-score of 57.7%. Although the accuracy decreased slightly, the recall, precision, and F1-score improved, demonstrating excellent performance. These findings have significant implications for the development of prediction models for suicidal ideation among Korean adolescents and can contribute to the prevention of youth suicide.
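The gap between accuracy and F1-score reported here is typical of imbalanced data, where a model can be right most of the time while still missing much of the minority class. The sketch below computes both metrics from confusion-matrix counts using the standard definitions. The counts are hypothetical and chosen only to illustrate the effect; they are not taken from the study, and the function name is an assumption for this example.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, and F1 from confusion-matrix counts.

    tp, fp, fn, tn are the numbers of true positives, false positives,
    false negatives, and true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical counts: 100 positive cases out of 1,000 (10% prevalence).
    metrics = classification_metrics(tp=30, fp=20, fn=70, tn=880)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

With these made-up counts the accuracy is high because the majority class dominates, while the F1-score stays low because most positive cases are missed. Resampling methods such as SMOTE trade some accuracy for better recall on the minority class.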

Analysis of the relationship between regulation compliance and occupational injuries - Focusing on logistic and Poisson regression analysis - (규제 순응도와 산업재해 발생 수준간의 관계 분석 - 로지스틱 회귀분석과 포아송 회귀분석을 중심으로 -)

  • Rhee, Kyung-Yong;Kim, Ki-Sik;Yoon, Young-Shik
    • Journal of the Korea Safety Management & Science / v.15 no.2 / pp.9-20 / 2013
  • The Occupational Safety and Health Act (OSHA) generally regulates employers' business practices in the workplace in order to maintain a safe environment. The fundamental purpose of the act is to protect employees' safety and health in the workplace by reducing industrial accidents. The authors investigated the correlation between occupational injuries and illnesses and the level of regulation compliance, using data from the Survey on Current Status of Occupational Safety & Health and several statistical methods (generalized regression analysis, logistic regression analysis, and Poisson regression analysis) in order to compare the results of those methods. The results show that the significant compliance factors differ among the statistical methods. This means that the interpretation should be specific to each statistical method. In the future, a suitable statistical technique will be developed that takes into account the distribution type of occupational injuries.

Categorical Analysis for the Factors of Industrial Accident Cases (산업재해 사례인자의 범주형 분석)

  • Jhee, Kyung-Tek;Song, Young-Ho;Chung, Kook-Sam
    • Journal of the Korean Society of Safety / v.17 no.1 / pp.94-98 / 2002
  • This study aimed to identify the fundamental causes of accidents using categorical analysis, a type of statistical method. Correlation analysis, independence tests, and logistic regression analysis were used as the analysis methods, and the SPSS package, a general-purpose statistical software package, was used to obtain statistical characteristics. The results show that the accident causes associated with the factor 'lost working days' were factors such as 'employed periods', 'sex', 'type of accident', and 'month'. When the independence test was applied, the most important cause was the factor 'month'. In case that logistic regression analysis method was applied, the cause contributed to the increase structure'. 'less than 6 month'. On the basis of these results, plans for accident prevention and appropriate investment in accident prevention could be carried out in each workshop.

An Analysis of Factors Relating to Agricultural Machinery Farm-Work Accidents Using Logistic Regression

  • Kim, Byounggap;Yum, Sunghyun;Kim, Yu-Yong;Yun, Namkyu;Shin, Seung-Yeoub;You, Seokcheol
    • Journal of Biosystems Engineering / v.39 no.3 / pp.151-157 / 2014
  • Purpose: In order to develop strategies to prevent farm-work accidents relating to agricultural machinery, influential factors were examined in this paper. The effects of these factors were quantified using logistic regression. Methods: Based on the results of a survey on farm-work accidents conducted by the National Academy of Agricultural Science, 21 tentative independent variables were selected. To apply these variables to regression, the presence of multicollinearity was examined by comparing correlation coefficients, checking the statistical significance of the coefficients in a simple linear regression model, and calculating the variance inflation factor. A logistic regression model and a method for determining its goodness of fit were defined. Results: Among the 21 independent variables, 13 were not collinear with each other. The results of a logistic regression analysis using these variables showed that the model was significant and acceptable, with a deviance of 714.053. Parameter estimation results showed that four variables (age, power tiller ownership, cognizance of the government's safety policy, and consciousness of safety) were significant. The logistic regression model predicted that the former two multiplied the accident odds by 1.027 and 8.506, respectively, while the latter two multiplied the odds by 0.243 and 0.545, respectively, that is, decreased them. Conclusions: Prevention strategies against factors causing an accident, such as the age of farmers and the use of a power tiller, are necessary. In addition, more effective training to raise farmers' consciousness of safety must be provided.
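The odds ratios reported in this abstract can be read with a short calculation. In logistic regression, the odds ratio for a variable is the exponential of its coefficient, so the coefficient is the natural log of the odds ratio, and a change of k units multiplies the odds by the odds ratio raised to the power k. The sketch below applies these standard relations to the four reported values. The function names are made up for the example, and treating age as measured in years is an assumption the abstract does not confirm.

```python
import math

# Odds ratios as reported in the abstract.
REPORTED_ODDS_RATIOS = {
    "age": 1.027,
    "power tiller ownership": 8.506,
    "cognizance of safety policy": 0.243,
    "consciousness of safety": 0.545,
}


def coefficient_from_odds_ratio(odds_ratio):
    """Logistic regression coefficient implied by an odds ratio: beta = ln(OR)."""
    return math.log(odds_ratio)


def percent_change_in_odds(odds_ratio):
    """Percent change in the odds for a one-unit increase in the variable."""
    return (odds_ratio - 1.0) * 100.0


def scaled_odds_ratio(odds_ratio, units):
    """Odds ratio for a change of `units` in the variable: OR ** units."""
    return odds_ratio ** units


if __name__ == "__main__":
    for name, value in REPORTED_ODDS_RATIOS.items():
        beta = coefficient_from_odds_ratio(value)
        change = percent_change_in_odds(value)
        print(f"{name}: OR={value}, beta={beta:.3f}, odds change={change:+.1f}%")

    # Assumption: age is in years, so this is the odds ratio per decade.
    print("age, 10-unit change:", round(scaled_odds_ratio(1.027, 10), 3))
```

A per-unit odds ratio close to 1, such as the one for age, can still imply a sizable effect over many units, which is why the unit of measurement matters when interpreting it.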

Comparison of the Prediction Model of Adolescents' Suicide Attempt Using Logistic Regression and Decision Tree: Secondary Data Analysis of the 2019 Youth Health Risk Behavior Web-Based Survey (로지스틱 회귀모형과 의사결정 나무모형을 활용한 청소년 자살 시도 예측모형 비교: 2019 청소년 건강행태 온라인조사를 이용한 2차 자료분석)

  • Lee, Yoonju;Kim, Heejin;Lee, Yesul;Jeong, Hyesun
    • Journal of Korean Academy of Nursing / v.51 no.1 / pp.40-53 / 2021
  • Purpose: The purpose of this study was to develop and compare prediction models for suicide attempts by Korean adolescents using logistic regression and decision tree analysis. Methods: This study utilized secondary data drawn from the 2019 Youth Health Risk Behavior web-based survey. A total of 20 items were selected as the explanatory variables (5 sociodemographic characteristics, 10 health-related behaviors, and 5 psychosocial characteristics). For data analysis, descriptive statistics, logistic regression with complex samples, and decision tree analysis were performed using IBM SPSS ver. 25.0 and Stata ver. 16.0. Results: A total of 1,731 participants (3.0%) out of 57,303 responded that they had attempted suicide. The most significant predictors of suicide attempts in the logistic regression model were experience of sadness and hopelessness, substance abuse, and violent victimization. In the decision tree model, girls with experience of sadness and hopelessness and experience of substance abuse were identified as the group most vulnerable to suicide attempts. Conclusion: Experiences of sadness and hopelessness, substance abuse, and violent victimization are the common major predictors of suicide attempts in both the logistic regression and decision tree models, and the prediction rates of the two models were similar. We suggest providing programs that consider combinations of high-risk predictors to prevent suicide attempts among adolescents.