• Title/Summary/Keyword: Logistic regression-analysis

Search Result 4,398, Processing Time 0.026 seconds

Steal Success Model for 2007 Korean Professional Baseball Games (2007년 한국프로야구에서 도루성공모형)

  • Hong, Chong-Sun;Choi, Jeong-Min
    • The Korean Journal of Applied Statistics
    • /
    • v.21 no.3
    • /
    • pp.455-468
    • /
    • 2008
  • Based on the huge baseball game records, the steal plays an important role to affect the result of games. For the research about success or failure of the steal in baseball games, logistic regression models are developed based on 2007 Korean professional baseball games. The analyses of logistic regression models are compared of those of the discriminant models. It is found that the performance of the logistic regression analysis is more efficient than that of the discriminant analysis. Also, we consider an alternative logistic regression model based on categorical data which are transformed from uneasy obtainable continuous data.

Performance Comparison of Mahalanobis-Taguchi System and Logistic Regression : A Case Study (마할라노비스-다구치 시스템과 로지스틱 회귀의 성능비교 : 사례연구)

  • Lee, Seung-Hoon;Lim, Geun
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.39 no.5
    • /
    • pp.393-402
    • /
    • 2013
  • The Mahalanobis-Taguchi System (MTS) is a diagnostic and predictive method for multivariate data. In the MTS, the Mahalanobis space (MS) of reference group is obtained using the standardized variables of normal data. The Mahalanobis space can be used for multi-class classification. Once this MS is established, the useful set of variables is identified to assist in the model analysis or diagnosis using orthogonal arrays and signal-to-noise ratios. And other several techniques have already been used for classification, such as linear discriminant analysis and logistic regression, decision trees, neural networks, etc. The goal of this case study is to compare the ability of the Mahalanobis-Taguchi System and logistic regression using a data set.

An Introduction to Logistic Regression: From Basic Concepts to Interpretation with Particular Attention to Nursing Domain

  • Park, Hyeoun-Ae
    • Journal of Korean Academy of Nursing
    • /
    • v.43 no.2
    • /
    • pp.154-164
    • /
    • 2013
  • Purpose: The purpose of this article is twofold: 1) introducing logistic regression (LR), a multivariable method for modeling the relationship between multiple independent variables and a categorical dependent variable, and 2) examining use and reporting of LR in the nursing literature. Methods: Text books on LR and research articles employing LR as main statistical analysis were reviewed. Twenty-three articles published between 2010 and 2011 in the Journal of Korean Academy of Nursing were analyzed for proper use and reporting of LR models. Results: Logistic regression from basic concepts such as odds, odds ratio, logit transformation and logistic curve, assumption, fitting, reporting and interpreting to cautions were presented. Substantial shortcomings were found in both use of LR and reporting of results. For many studies, sample size was not sufficiently large to call into question the accuracy of the regression model. Additionally, only one study reported validation analysis. Conclusion: Nursing researchers need to pay greater attention to guidelines concerning the use and reporting of LR models.

Analysis of Donation Intention of MZ Generation and Senior Generation Using Machine Learning's logistic Regression (머신러닝의 로지스틱 회귀를 활용한 MZ세대와 시니어 세대의 기부의도 분석)

  • Min Jung Oh;IkJin Jeon
    • Journal of Information Technology Services
    • /
    • v.23 no.2
    • /
    • pp.1-12
    • /
    • 2024
  • This study aims to find ways to increase the declining donation intention by using machine learning techniques. To this end, in order to predict factors that affect donations between the MZ generation and the senior generation, various machine learning algorithms, including logistic regression analysis, are applied to build a model to determine variables that affect donation intention, and provide statistical verification and evaluation indicators. In this study, differences in donation intention by generation were expected as a variable affecting donation intention, and the senior generation was expected to show a higher donation intention tendency than the younger generation. However, although the research results were not statistically significant, the younger generation showed a higher intention to donate, and these results are interpreted to mean that value consumption and ethical consumption, which are important to today's MZ generation, also influenced donations. However, there were differences between generations in the amount of donations, and higher donation amounts were confirmed among the senior generation (those in their 50s or older) than the younger generation. In addition, the results of the logistic regression analysis showed that previous donation experience had a positive effect on future donation intention, and the more motivation and importance of donation and various social participation activities online and offline, the more active one became in donating.

Development of a Metabolic Syndrome Classification and Prediction Model for Koreans Using Deep Learning Technology: The Korea National Health and Nutrition Examination Survey (KNHANES) (2013-2018)

  • Hyerim Kim;Ji Hye Heo;Dong Hoon Lim;Yoona Kim
    • Clinical Nutrition Research
    • /
    • v.12 no.2
    • /
    • pp.138-153
    • /
    • 2023
  • The prevalence of metabolic syndrome (MetS) and its cost are increasing due to lifestyle changes and aging. This study aimed to develop a deep neural network model for prediction and classification of MetS according to nutrient intake and other MetS-related factors. This study included 17,848 individuals aged 40-69 years from the Korea National Health and Nutrition Examination Survey (2013-2018). We set MetS (3-5 risk factors present) as the dependent variable and 52 MetS-related factors and nutrient intake variables as independent variables in a regression analysis. The analysis compared and analyzed model accuracy, precision and recall by conventional logistic regression, machine learning-based logistic regression and deep learning. The accuracy of train data was 81.2089, and the accuracy of test data was 81.1485 in a MetS classification and prediction model developed in this study. These accuracies were higher than those obtained by conventional logistic regression or machine learning-based logistic regression. Precision, recall, and F1-score also showed the high accuracy in the deep learning model. Blood alanine aminotransferase (β = 12.2035) level showed the highest regression coefficient followed by blood aspartate aminotransferase (β = 11.771) level, waist circumference (β = 10.8555), body mass index (β = 10.3842), and blood glycated hemoglobin (β = 10.1802) level. Fats (cholesterol [β = -2.0545] and saturated fatty acid [β = -2.0483]) showed high regression coefficients among nutrient intakes. The deep learning model for classification and prediction on MetS showed a higher accuracy than conventional logistic regression or machine learning-based logistic regression.

A Logistic Regression Analysis of Two-Way Binary Attribute Data (이원 이항 계수치 자료의 로지스틱 회귀 분석)

  • Ahn, Hae-Il
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.35 no.3
    • /
    • pp.118-128
    • /
    • 2012
  • An attempt is given to the problem of analyzing the two-way binary attribute data using the logistic regression model in order to find a sound statistical methodology. It is demonstrated that the analysis of variance (ANOVA) may not be good enough, especially for the case that the proportion is very low or high. The logistic transformation of proportion data could be a help, but not sound in the statistical sense. Meanwhile, the adoption of generalized least squares (GLS) method entails much to estimate the variance-covariance matrix. On the other hand, the logistic regression methodology provides sound statistical means in estimating related confidence intervals and testing the significance of model parameters. Based on simulated data, the efficiencies of estimates are ensured with a view to demonstrate the usefulness of the methodology.

Power Failure Sensitivity Analysis via Grouped L1/2 Sparsity Constrained Logistic Regression

  • Li, Baoshu;Zhou, Xin;Dong, Ping
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.15 no.8
    • /
    • pp.3086-3101
    • /
    • 2021
  • To supply precise marketing and differentiated service for the electric power service department, it is very important to predict the customers with high sensitivity of electric power failure. To solve this problem, we propose a novel grouped 𝑙1/2 sparsity constrained logistic regression method for sensitivity assessment of electric power failure. Different from the 𝑙1 norm and k-support norm, the proposed grouped 𝑙1/2 sparsity constrained logistic regression method simultaneously imposes the inter-class information and tighter approximation to the nonconvex 𝑙0 sparsity to exploit multiple correlated attributions for prediction. Firstly, the attributes or factors for predicting the customer sensitivity of power failure are selected from customer sheets, such as customer information, electric consuming information, electrical bill, 95598 work sheet, power failure events, etc. Secondly, all these samples with attributes are clustered into several categories, and samples in the same category are assumed to be sharing similar properties. Then, 𝑙1/2 norm constrained logistic regression model is built to predict the customer's sensitivity of power failure. Alternating direction of multipliers (ADMM) algorithm is finally employed to solve the problem by splitting it into several sub-problems effectively. Experimental results on power electrical dataset with about one million customer data from a province validate that the proposed method has a good prediction accuracy.

Machine learning-based Predictive Model of Suicidal Thoughts among Korean Adolescents. (머신러닝 기반 한국 청소년의 자살 생각 예측 모델)

  • YeaJu JIN;HyunKi KIM
    • Journal of Korea Artificial Intelligence Association
    • /
    • v.1 no.1
    • /
    • pp.1-6
    • /
    • 2023
  • This study developed models using decision forest, support vector machine, and logistic regression methods to predict and prevent suicidal ideation among Korean adolescents. The study sample consisted of 51,407 individuals after removing missing data from the raw data of the 18th (2022) Youth Health Behavior Survey conducted by the Korea Centers for Disease Control and Prevention. Analysis was performed using the MS Azure program with Two-Class Decision Forest, Two-Class Support Vector Machine, and Two-Class Logistic Regression. The results of the study showed that the decision forest model achieved an accuracy of 84.8% and an F1-score of 36.7%. The support vector machine model achieved an accuracy of 86.3% and an F1-score of 24.5%. The logistic regression model achieved an accuracy of 87.2% and an F1-score of 40.1%. Applying the logistic regression model with SMOTE to address data imbalance resulted in an accuracy of 81.7% and an F1-score of 57.7%. Although the accuracy slightly decreased, the recall, precision, and F1-score improved, demonstrating excellent performance. These findings have significant implications for the development of prediction models for suicidal ideation among Korean adolescents and can contribute to the prevention and improvement of youth suicide.

Analysis of the relationship between regulation compliance and occupational injuries - Focusing on logistic and poisson regression analysis - (규제 순응도와 산업재해 발생 수준간의 관계 분석 - 로지스틱 회귀분석과 포아송 회귀분석을 중심으로 -)

  • Rhee, Kyung-Yong;Kim, Ki-Sik;Yoon, Young-Shik
    • Journal of the Korea Safety Management & Science
    • /
    • v.15 no.2
    • /
    • pp.9-20
    • /
    • 2013
  • OSHA(Occupational Safety and Health Act) generally regulates employer's business principles in the workplace to maintain safety environment. This act has the fundamental purpose to protect employee's safety and health in the workplace by reducing industrial accidents. Authors tried to investigate the correlation between 'occupational injuries and illnesses' and level of regulation compliance using Survey on Current Status of Occupational Safety & Health data by the various statistical methods, such as generalized regression analysis, logistic regression analysis and poison regression analysis in order to compare the results of those methods. The results have shown that the significant affecting compliance factors were different among those statistical methods. This means that specific interpretation should be considered based on each statistical method. In the future, relevant statistical technique will be developed considering the distribution type of occupational injuries.

Categorical Analysis for the Factors of Incustrial Accident Cases (산업재해 사례인자의 범주형 분석)

  • Jhee, Kyung-Tek;Song, Young-Ho;Chung, Kook-Sam
    • Journal of the Korean Society of Safety
    • /
    • v.17 no.1
    • /
    • pp.94-98
    • /
    • 2002
  • This study aimed to search for the fundamental accident causes using a categorical analysis, a kind of statistical methods. As the analysis methods, correlation analysis, independence test and logistic regression analysis were used. And the SPSS package, a general-purpose mathematical library, was used to obtain statistical characteristics. As the result of this study, the accident causes associated with factor of 'lost working days' were factors such as 'employed periods', 'sex', 'type of accident', 'month'. In case of applying independence test method, the most important cause was the factor of 'month'. In case that logistic regression analysis method was applied, the cause contributed to the increase structure'. 'less than 6 month'. On the basis of these results, the plan for accident prevention and the proper investment for accident prevention expenditure could be carried out in each workshop.