• Title/Summary/Keyword: Logistic Regression model

Power Failure Sensitivity Analysis via Grouped L1/2 Sparsity Constrained Logistic Regression

  • Li, Baoshu;Zhou, Xin;Dong, Ping
    • KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.8 / pp.3086-3101 / 2021
  • To supply precise marketing and differentiated service for the electric power service department, it is very important to predict which customers are highly sensitive to electric power failure. To solve this problem, we propose a novel grouped $\ell_{1/2}$ sparsity constrained logistic regression method for sensitivity assessment of electric power failure. Unlike the $\ell_1$ norm and the k-support norm, the proposed method simultaneously imposes inter-class information and a tighter approximation to the nonconvex $\ell_0$ sparsity in order to exploit multiple correlated attributes for prediction. First, the attributes or factors for predicting customer sensitivity to power failure are selected from customer sheets, such as customer information, electricity consumption information, electricity bills, 95598 work sheets, power failure events, etc. Second, all samples with these attributes are clustered into several categories, and samples in the same category are assumed to share similar properties. Then, an $\ell_{1/2}$ norm constrained logistic regression model is built to predict the customer's sensitivity to power failure. Finally, the alternating direction method of multipliers (ADMM) is employed to solve the problem efficiently by splitting it into several sub-problems. Experimental results on an electric power dataset with about one million customer records from one province validate that the proposed method has good prediction accuracy.
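To make the "tighter approximation to $\ell_0$" claim concrete, the following minimal sketch compares three sparsity penalties on one weight vector. It is not the paper's grouped formulation or its ADMM solver; the function names and the example weights are assumptions chosen for illustration only.

```python
def l0_count(weights):
    """Number of nonzero coefficients (the l0 'norm')."""
    return sum(1 for w in weights if w != 0)


def l_half_penalty(weights):
    """Sum of |w|^(1/2), the penalty form usually meant by l1/2 regularization."""
    return sum(abs(w) ** 0.5 for w in weights)


def l1_penalty(weights):
    """Sum of |w| (the lasso penalty)."""
    return sum(abs(w) for w in weights)


# Hypothetical coefficient vector: one zero, one small, one large weight.
weights = [0.0, 0.25, 4.0]

print(l0_count(weights))        # counts nonzero entries
print(l_half_penalty(weights))  # compresses large weights, inflates small ones
print(l1_penalty(weights))      # grows linearly with magnitude
```

On this vector each nonzero weight contributes a value closer to 1 under the $\ell_{1/2}$ penalty than under $\ell_1$, which is the sense in which it tracks the nonzero count more closely.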

A Study on Diabetes Management System Based on Logistic Regression and Random Forest

  • ByungJoo Kim
    • International journal of advanced smart convergence / v.13 no.2 / pp.61-68 / 2024
  • In the quest for advancing diabetes diagnosis, this study introduces a novel two-step machine learning approach that combines the probabilistic predictions of Logistic Regression with the classification strength of Random Forest. Diabetes, a pervasive chronic disease impacting millions globally, necessitates precise and early detection to mitigate long-term complications. Traditional diagnostic methods, while effective, often entail invasive testing and may not fully leverage the patterns hidden in patient data. Addressing this gap, our research uses Logistic Regression to estimate the likelihood of diabetes presence, and then employs Random Forest to classify individuals into diabetic, pre-diabetic or nondiabetic categories based on the computed probabilities. This methodology not only capitalizes on the strengths of both algorithms (Logistic Regression's proficiency in estimating nuanced probabilities and Random Forest's robustness in classification) but also introduces a refined mechanism to enhance diagnostic accuracy. Through the application of this model to a comprehensive diabetes dataset, we demonstrate a marked improvement in diagnostic precision, as evidenced by superior performance metrics when compared to other machine learning approaches. Our findings underscore the potential of integrating diverse machine learning models to improve clinical decision-making processes, offering a promising avenue for the early and accurate diagnosis of diabetes and potentially other complex diseases.

CHAIN DEPENDENCE AND STATIONARITY TEST FOR TRANSITION PROBABILITIES OF MARKOV CHAIN UNDER LOGISTIC REGRESSION MODEL

  • Sinha Narayan Chandra;Islam M. Ataharul;Ahmed Kazi Saleh
    • Journal of the Korean Statistical Society / v.35 no.4 / pp.355-376 / 2006
  • This paper suggests alternative procedures for testing whether a sequence of observations follows a chain-dependent process, and whether chain-dependent or repeated observations follow a stationary process. These test procedures are formulated on the basis of a logistic regression model under the likelihood ratio test criterion and are applied to daily rainfall occurrence data from selected stations in Bangladesh. The tests indicate that the daily rainfall occurrences follow a chain-dependent process, and that the different types of transition probabilities and the overall transition probabilities of the Markov chain for rainfall occurrence follow a stationary process in the Mymensingh and Rajshahi areas, and a non-stationary process in the Chittagong, Faridpur and Satkhira areas.
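As a rough illustration of a likelihood ratio test for chain dependence, the sketch below compares an independence model with a first-order Markov model on a binary wet/dry sequence. With a single binary lagged covariate, a logistic regression is saturated, so its fitted probabilities equal the empirical transition probabilities and the log-likelihoods can be computed directly. This is a simplified stand-in, not the paper's procedure; the function names and the rainfall sequence are made up for illustration.

```python
import math


def transition_counts(seq):
    """Count transitions counts[prev][curr] in a 0/1 sequence."""
    counts = [[0, 0], [0, 0]]
    for prev, curr in zip(seq, seq[1:]):
        counts[prev][curr] += 1
    return counts


def bernoulli_loglik(successes, trials):
    """Maximized Bernoulli log-likelihood; boundary cases contribute 0."""
    if trials == 0 or successes == 0 or successes == trials:
        return 0.0
    p = successes / trials
    return successes * math.log(p) + (trials - successes) * math.log(1.0 - p)


def chain_dependence_lr(seq):
    """LR statistic: first-order dependence versus independence (1 df)."""
    counts = transition_counts(seq)
    # Full model: a separate wet-day probability for each previous state.
    ll_full = sum(bernoulli_loglik(row[1], row[0] + row[1]) for row in counts)
    # Null model: one wet-day probability regardless of the previous state.
    wet = counts[0][1] + counts[1][1]
    total = sum(sum(row) for row in counts)
    ll_null = bernoulli_loglik(wet, total)
    return 2.0 * (ll_full - ll_null)


# Synthetic daily rainfall occurrence (1 = wet, 0 = dry), for illustration only.
rain = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1]
stat = chain_dependence_lr(rain)
print(transition_counts(rain), round(stat, 3))
```

The statistic is compared with a chi-square distribution with one degree of freedom (5% critical value about 3.84); a larger value favors chain dependence.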

A Study on Improving the predict accuracy rate of Hybrid Model Technique Using Error Pattern Modeling : Using Logistic Regression and Discriminant Analysis

  • Cho, Yong-Jun;Hur, Joon
    • Journal of the Korean Data and Information Science Society / v.17 no.2 / pp.269-278 / 2006
  • This paper presents a new hybrid data mining technique that uses error pattern modeling to improve classification accuracy. The proposed method improves classification accuracy by combining two different supervised learning methods. The main algorithm models the error pattern between the two supervised learning methods (e.g., neural networks, decision trees, logistic regression). The proposed modeling method was applied to a simulation of 10,000 data sets generated from normal and exponential random distributions. The simulation results show that the proposed method outperforms existing methods such as logistic regression and discriminant analysis.

The Confidence Band of $ED_{100p}$ for the Simple Logistic Regression Model

  • Cho, Tae Kyoung;Shin, Mi Young
    • Communications for Statistical Applications and Methods / v.8 no.3 / pp.581-588 / 2001
  • The $ED_{100p}$ is the dose associated with a 100p% response rate in the analysis of quantal response data. Brand, Pinnock, and Jackson (1973) studied confidence bands for $ED_{100p}$ obtained by solving algebraically for extremal values on the ellipsoidal confidence region of the parameters in the simple logistic regression model. In this paper, we develop and illustrate a simpler method for obtaining confidence bands for $ED_{100p}$ based on a rectangular confidence region of the parameters.

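For the simple logistic model with logit(p) = b0 + b1 * dose, the $ED_{100p}$ is the dose that solves this equation for a given p. The sketch below computes that point estimate and, as one plausible reading of a "rectangular confidence region" approach, takes the extreme values over the four corners of a rectangle of parameter intervals. The parameter estimates, the intervals, and the function names are all hypothetical, and the corner rule is an assumption rather than the authors' published method.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def ed_100p(p, b0, b1):
    """Dose at which the fitted response probability equals p."""
    return (logit(p) - b0) / b1


def ed_band_from_rectangle(p, b0_interval, b1_interval):
    """Min and max of ED_100p over the corners of a parameter rectangle.

    Assumes the slope interval excludes zero; the dose is then monotone in
    each parameter separately, so the extremes lie at the corners.
    """
    if b1_interval[0] <= 0.0 <= b1_interval[1]:
        raise ValueError("slope interval must not contain zero")
    corners = [ed_100p(p, b0, b1) for b0 in b0_interval for b1 in b1_interval]
    return min(corners), max(corners)


# Hypothetical estimates and interval limits, for illustration only.
b0_hat, b1_hat = -4.0, 2.0
ed50 = ed_100p(0.5, b0_hat, b1_hat)
band = ed_band_from_rectangle(0.5, (-5.0, -3.0), (1.5, 2.5))
print(ed50, band)
```

Repeating the band computation over a grid of p values would trace out a band for the whole dose-response curve.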

A survival prediction model of hemorrhagic shock in rats using a logistic regression equation (출혈성 쇼크를 일으킨 흰쥐에서 로지스틱 회귀분석을 이용한 생존율 예측)

  • Lee, Tak-Hyung;Lee, Ju-Hyung;Chung, Sang-Won;Kim, Deok-Won
    • Proceedings of the IEEK Conference / 2009.05a / pp.132-134 / 2009
  • Hemorrhagic shock is a common cause of death in emergency rooms. Since the symptoms of hemorrhagic shock appear only after shock has progressed considerably, it is difficult to diagnose shock early. The purpose of this study was to improve early diagnosis of hemorrhagic shock using a survival prediction model in rats. We measured ECG, blood pressure, respiration and temperature in 45 Sprague-Dawley rats, and then obtained a logistic regression equation predicting survival rates. The area under the ROC curve was 0.99. The Hosmer-Lemeshow goodness-of-fit chi-square was 0.86 (degrees of freedom = 8, p = 0.999). Applying the determined optimal boundary value of 0.25, the accuracy of survival prediction was 94.7%.

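The role of the boundary value can be sketched as follows: predicted survival probabilities from a fitted logistic equation are converted to class labels with a cutoff of 0.25, and accuracy is the share of correct labels. Only the 0.25 cutoff comes from the abstract; the probabilities, outcomes, and function names are invented for illustration, and treating a probability exactly at the cutoff as "survived" is an assumption.

```python
def classify(probabilities, cutoff):
    """Label 1 (survived) when the predicted probability reaches the cutoff."""
    return [1 if p >= cutoff else 0 for p in probabilities]


def accuracy(predicted, observed):
    """Fraction of predictions that match the observed outcomes."""
    matches = sum(1 for y_hat, y in zip(predicted, observed) if y_hat == y)
    return matches / len(observed)


# Hypothetical predicted survival probabilities and observed outcomes.
predicted_prob = [0.05, 0.20, 0.30, 0.90, 0.60, 0.10]
observed = [0, 0, 1, 1, 1, 1]

labels = classify(predicted_prob, cutoff=0.25)
print(labels, accuracy(labels, observed))
```

A cutoff below 0.5 labels more cases as survivors, trading specificity for sensitivity.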

Logistic Regression for Investigating Credit Card Default

  • Yang, Jeong-Won;Ha, Sung-Ho;Min, Ji-Hong
    • Proceedings of the Korea Society for Industrial Systems Conference / 2008.10b / pp.164-169 / 2008
  • The increasing late-payment rate of credit card customers caused by the recent economic downturn is not only reducing the profit of department stores but also causing significant losses. Under this pressure, the objective of credit forecasting has been extended from identifying good or bad customers to contributing to revenue growth. As a method of managing department store credit card defaults, this study classifies credit delinquents into clusters, analyzes the repayment patterns of customers in each cluster, and develops a credit forecasting system for managing department store credit card delinquents, using data on the delinquents of Korean D department store. The model presented in this study uses a Kohonen network, a type of artificial neural network used in data mining, to cluster credit delinquents into groups. A logistic regression model is then used to predict the repayment rate of customers in each cluster per period. The accuracy of the presented system over all clusters is 92.3%.

Evaluations of predicted models fitted for data mining - comparisons of classification accuracy and training time for 4 algorithms (데이터마이닝기법상에서 적합된 예측모형의 평가 -4개분류예측모형의 오분류율 및 훈련시간 비교평가 중심으로)

  • Lee, Sang-Bock
    • Journal of the Korean Data and Information Science Society / v.12 no.2 / pp.113-124 / 2001
  • CHAID, logistic regression, bagging trees, and bagging trees are compared on the SAS artificial data set HMEQ in terms of classification accuracy and training time. In terms of error rate, bagging trees performs best, although its run time is slower than that of the others. Logistic regression has the best run time among the given models, but no model is uniformly efficient on both criteria.

Using Classification function to integrate Discriminant Analysis, Logistic Regression and Backpropagation Neural Networks for Interest Rates Forecasting

  • Oh, Kyong-Joo;Ingoo Han
    • Proceedings of the Korea Intelligent Information System Society Conference / 2000.11a / pp.417-426 / 2000
  • This study suggests integrated neural network models for interest rate forecasting using change-point detection, classifiers, and classification functions based on structural change. The proposed model is composed of three phases with three-staged learning. The first phase detects successive and appropriate structural changes in the interest rate dataset. The second phase forecasts the change-point group with classifiers (discriminant analysis, logistic regression, and backpropagation neural networks) and their combined classification functions. The final phase forecasts the interest rate with backpropagation neural networks. We propose some classification functions to overcome the problem that two-staged learning cannot measure the performance of the first learning stage. Subsequently, we compare the structured models with a neural network model alone and, in addition, determine which of the classifiers and classification functions perform better. This article then examines the predictability of the proposed classification functions for interest rate forecasting using structural change.

Evaluating seismic liquefaction potential using multivariate adaptive regression splines and logistic regression

  • Zhang, Wengang;Goh, Anthony T.C.
    • Geomechanics and Engineering / v.10 no.3 / pp.269-284 / 2016
  • Simplified techniques based on in situ testing methods are commonly used to assess seismic liquefaction potential. Many of these simplified methods were developed by analyzing liquefaction case histories from which the liquefaction boundary (limit state) separating two categories (the occurrence or non-occurrence of liquefaction) is determined. As the liquefaction classification problem is highly nonlinear in nature, it is difficult to develop a comprehensive model using conventional modeling techniques that take into consideration all the independent variables, such as the seismic and soil properties. In this study, a modification of the Multivariate Adaptive Regression Splines (MARS) approach based on Logistic Regression (LR), termed LR_MARS, is used to evaluate seismic liquefaction potential based on actual field records. Three different LR_MARS models were used to analyze three different field liquefaction databases, and the results are compared with those of neural network approaches. The developed spline functions and the limit state functions obtained reveal that the LR_MARS models can capture and describe the intrinsic, complex relationship between seismic parameters, soil parameters, and the liquefaction potential without having to make any assumptions about the underlying relationship between the variables. Considering its computational efficiency, simplicity of interpretation, predictive accuracy, data-driven and adaptive nature, and ability to map the interaction between variables, the use of the LR_MARS model in assessing seismic liquefaction potential is promising.
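The general idea of combining MARS spline terms with a logistic link can be sketched as follows: hinge basis functions of the form max(0, x - knot) or max(0, knot - x) are summed linearly and the result is passed through the logistic function to give a probability. This is only a generic illustration, not the LR_MARS models from the paper; the knots, coefficients, variable indices, and function names are hypothetical.

```python
import math


def hinge(x, knot, direction=1):
    """MARS hinge: max(0, x - knot) for direction=1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))


def lr_mars_probability(x, intercept, terms):
    """Logistic transform of a linear combination of hinge basis functions.

    Each term is (coefficient, variable_index, knot, direction).
    """
    score = intercept + sum(
        coef * hinge(x[idx], knot, direction)
        for coef, idx, knot, direction in terms
    )
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical model with two inputs, e.g. a seismic and a soil parameter.
terms = [
    (1.0, 0, 2.0, 1),    # contributes once input 0 exceeds the knot 2.0
    (-0.5, 1, 10.0, -1),  # contributes while input 1 is below the knot 10.0
]

print(lr_mars_probability([2.0, 10.0], 0.0, terms))  # both hinges inactive
print(lr_mars_probability([4.0, 10.0], 0.0, terms))  # first hinge active
```

Because each hinge is zero on one side of its knot, the fitted relationship is piecewise linear in the log-odds, which is how such a model can follow a nonlinear boundary without a prespecified functional form.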