• Title/Summary/Keyword: Binary Logistic Model

Semiparametric Approach to Logistic Model with Random Intercept (준모수적 방법을 이용한 랜덤 절편 로지스틱 모형 분석)

  • Kim, Mijeong
    • The Korean Journal of Applied Statistics, v.28 no.6, pp.1121-1131, 2015
  • Logistic models with a random intercept are useful for analyzing longitudinal binary data. Traditionally, the random intercept is assumed to follow a parametric distribution (such as the normal distribution) and to be independent of the covariates. Such assumptions are strong and restrictive when applied to real data. Recently, Garcia and Ma (2015) derived semiparametric efficient estimators for the logistic model with a random intercept that do not require these assumptions. Their estimator is consistent even when no parametric form is assumed for the random intercept. In addition, the method is computationally simple. In this paper, we apply this method to analyze toenail infection data. We compare the semiparametric estimator with the maximum likelihood estimator, the penalized quasi-likelihood estimator, and the hierarchical generalized linear estimator.
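
For orientation, the random intercept logistic model discussed in this abstract is usually written as follows. The notation is an assumption chosen for illustration, not taken from the paper: $Y_{ij}$ is the binary response of subject $i$ at occasion $j$, $X_{ij}$ the covariate vector, $\beta$ the fixed effects, and $b_i$ the subject-specific random intercept.

$$
\operatorname{logit} P(Y_{ij} = 1 \mid X_{ij}, b_i) = b_i + X_{ij}^{\top}\beta
$$

The traditional parametric approach would add something like $b_i \sim N(0, \sigma^2)$ with $b_i$ independent of $X_{ij}$; the semiparametric approach described above leaves the distribution of $b_i$ unspecified.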

Analysis of Decision Factors on the Participation of Scaling Project for Private Forest Management using a Logit Model (로짓모형을 이용한 산주의 사유림 경영 규모화 사업 참여 결정요인 분석)

  • Kim, Ki Dong
    • Journal of Korean Society of Forest Science, v.105 no.3, pp.360-365, 2016
  • The purpose of this study is to provide basic information for the early implementation and expansion of the project to increase the management scale of private forest land, one of the plans for revitalizing private forest management, by identifying the characteristics of forest owners that influence participation in the project. To this end, a questionnaire survey of 373 forest owners was conducted and analyzed by binary logistic regression. The variables included gender, age, academic ability, occupation, income, residence, purpose of forest ownership, and cooperative membership status. Of the 373 forest owners, 267 (71.6%) intended to participate in the scaling project for private forest management, while the remaining 106 (28.4%) were unwilling to participate. According to the binary logistic regression, the most important variables affecting participation in the project are age, occupation, and purpose of forest ownership.

Accidents involving Children in School Zones Study to identify the key influencing factors (어린이보호구역내 어린이 교통사고 발생에 미치는 영향요인 분석)

  • Park, Sinae;Lim, Junbeom;Kim, Hyungkyu;Lee, Soobeom
    • International Journal of Highway Engineering, v.19 no.2, pp.167-174, 2017
  • PURPOSES: This study aims to analyze the impact of the implementation of a school zone traffic safety improvement project on the number of accidents involving children in these zones. METHODS: To analyze the correlation between the traffic safety features of roads in school zones and the number of accidents involving children, we developed an occurrence probability model of traffic accidents involving children using a binary logistic regression model in SPSS 23.0. Two separate models were developed for two zone types: interior block and arterial road. RESULTS: The model indicated that, for the interior block, narrower sidewalk width, speed bumps, and elevated crosswalks were key factors affecting the occurrence of accidents involving children. For arterial roads exceeding a width of 12 m, the speed limit, roadside barriers, and red paving of road surfaces were found to be the key factors. CONCLUSIONS: The results of this study can serve as baseline research data to help improve the effectiveness of school zone traffic safety improvement projects and school zone road repair projects in the future.

Comparative Analysis of the Binary Classification Model for Improving PM10 Prediction Performance (PM10 예측 성능 향상을 위한 이진 분류 모델 비교 분석)

  • Jung, Yong-Jin;Lee, Jong-Sung;Oh, Chang-Heon
    • Journal of the Korea Institute of Information and Communication Engineering, v.25 no.1, pp.56-62, 2021
  • High forecast accuracy is required as public concern about particulate matter grows. Many attempts are therefore being made to increase the accuracy of particulate matter prediction using machine learning. However, because the concentration distribution is imbalanced and particulate matter has varied characteristics, prediction models do not train well. To address these problems, this paper proposes a binary classification model that predicts particulate matter concentration by dividing it into two classes at a threshold of 80 ㎍/㎥. Four classification algorithms were used for the binary classification of PM10: logistic regression, decision tree, SVM, and MLP. In a performance evaluation based on the confusion matrix, the MLP model showed the highest binary classification performance of the four models, with 89.98% accuracy.
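
The thresholding and confusion-matrix evaluation described in this abstract can be sketched roughly as below. This is a minimal illustration, not the authors' code: the readings are made-up values, the function names are hypothetical, and treating a reading strictly above 80 as the positive class is an assumption, since the abstract does not say which side the boundary value belongs to.

```python
THRESHOLD = 80  # class boundary in ㎍/㎥, as stated in the abstract


def to_class(concentration, threshold=THRESHOLD):
    """Return 1 if the reading is above the threshold, else 0 (assumed rule)."""
    return 1 if concentration > threshold else 0


def confusion_matrix(actual, predicted):
    """Count true/false positives and negatives for two lists of 0/1 labels."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for a, p in zip(actual, predicted):
        if a == 1 and p == 1:
            counts["tp"] += 1
        elif a == 0 and p == 1:
            counts["fp"] += 1
        elif a == 1 and p == 0:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def accuracy(counts):
    """Share of correctly classified cases."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total


# Hypothetical observed and forecast PM10 values (not data from the paper).
observed = [35, 92, 110, 60, 81, 79]
forecast = [40, 85, 70, 55, 95, 83]

actual_labels = [to_class(v) for v in observed]
predicted_labels = [to_class(v) for v in forecast]
cm = confusion_matrix(actual_labels, predicted_labels)
acc = accuracy(cm)
```

With imbalanced classes, accuracy alone can look high even when the rare class is missed, so the per-cell counts in `cm` are worth inspecting alongside `acc`.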

Penalized logistic regression models for determining the discharge of dyspnea patients (호흡곤란 환자 퇴원 결정을 위한 벌점 로지스틱 회귀모형)

  • Park, Cheolyong;Kye, Myo Jin
    • Journal of the Korean Data and Information Science Society, v.24 no.1, pp.125-133, 2013
  • In this paper, penalized binary logistic regression models are employed as statistical models for determining the discharge of 668 patients with a chief complaint of dyspnea, based on the results of 11 blood tests. Specifically, the ridge model based on the $L^2$ penalty and the Lasso model based on the $L^1$ penalty are considered. Their prediction accuracy is compared with that of the logistic regression model with all 11 explanatory variables and the model with variables chosen by a variable selection method. The results show that, based on 10-fold cross-validation, the ridge logistic regression model has the best prediction accuracy among the 4 models.
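
As a reminder of what the two penalties do, a common textbook form of the penalized logistic objective is given below. The notation is assumed for illustration and the paper's exact scaling may differ: $y_i \in \{0,1\}$ is the discharge indicator, $x_i$ the vector of $p$ predictors (here $p = 11$), $\eta_i = \beta_0 + x_i^{\top}\beta$ the linear predictor, and $\lambda \ge 0$ the tuning parameter, typically chosen by cross-validation.

$$
\hat{\beta} = \arg\min_{\beta_0,\,\beta} \left\{ -\sum_{i=1}^{n} \left[ y_i \eta_i - \log\left(1 + e^{\eta_i}\right) \right] + \lambda P(\beta) \right\},
\qquad
P_{\text{ridge}}(\beta) = \sum_{j=1}^{p} \beta_j^{2},
\quad
P_{\text{Lasso}}(\beta) = \sum_{j=1}^{p} \lvert \beta_j \rvert
$$

The ridge penalty shrinks all coefficients toward zero without removing any, whereas the Lasso penalty can set some coefficients exactly to zero and so also performs variable selection.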

A Case Study on Electronic Part Inspection Based on Screening Variables (전자부품 검사에서 대용특성을 이용한 사례연구)

  • 이종설;윤원영
    • Journal of Korean Society for Quality Management, v.29 no.3, pp.124-137, 2001
  • In general, it is efficient and effective to use screening variables that are correlated with the performance variable when measuring the performance variable is impossible (destructive) or expensive. The general methodology for finding surrogate variables is regression analysis. This paper considers the inspection problem in a CRT (Cathode Ray Tube) production line, in which the performance variable (dependent variable) is binary and the screening variables are continuous. General regression with a dummy variable, discriminant analysis, and binary logistic regression are considered. A cost model is also formulated to determine an economical inspection procedure with screening variables.

MEAT SPECIATION USING A HIERARCHICAL APPROACH AND LOGISTIC REGRESSION

  • Arnalds, Thosteinn;Fearn, Tom;Downey, Gerard
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference, 2001.06a, pp.1245-1245, 2001
  • Food adulteration is a serious consumer fraud and a matter of concern to food processors and regulatory agencies. A range of analytical methods have been investigated to facilitate the detection of adulterated or mis-labelled foods & food ingredients, but most of these require sophisticated equipment and highly qualified staff and are time-consuming. Regulatory authorities and the food industry require a screening technique which will facilitate fast and relatively inexpensive monitoring of food products with a high level of accuracy. Near infrared spectroscopy has been investigated for its potential in a number of authenticity issues including meat speciation (McElhinney, Downey & Fearn (1999) JNIRS, 7(3), 145-154; Downey, McElhinney & Fearn (2000). Appl. Spectrosc. 54(6), 894-899). This report describes further analysis of these spectral sets using a hierarchical approach and binary decisions solved using logistic regression. The sample set comprised 230 homogenized meat samples, i.e. chicken (55), turkey (54), pork (55), beef (32) and lamb (34), purchased locally as whole cuts of meat over a 10-12 week period. NIR reflectance spectra were recorded over the wavelength range 400-2498 nm at 2 nm intervals on a NIR Systems 6500 scanning monochromator. The problem was defined as a series of binary decisions, i.e. is the meat red or white? is the red meat beef or lamb? is the white meat pork or poultry? etc. Each of these decisions was made using an individual binary logistic model based on scores derived from principal component or partial least squares (PLS1 and PLS2) analysis. The results obtained were equal to or better than previous reports using factorial discriminant analysis, K-nearest neighbours and PLS2 regression. This new approach using a combination of exploratory and logistic analyses also appears to have advantages of transparency and the use of inherent structure in the spectral data. Additionally, it allows for the use of different data transforms and multivariate regression techniques at each decision step.
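
The chain of binary decisions described in this abstract could be organized along the following lines. This is only a sketch under stated assumptions, not the authors' implementation: every weight and bias is a made-up placeholder rather than a fitted value, the two-element `scores` list stands in for principal component or PLS scores, the function names are hypothetical, and the final chicken-versus-turkey step is an assumed reading of the "etc." in the abstract.

```python
import math


def make_logistic_rule(weights, bias):
    """Build a yes/no rule from one binary logistic model on the scores."""
    def rule(scores):
        z = bias + sum(w * s for w, s in zip(weights, scores))
        probability = 1.0 / (1.0 + math.exp(-z))
        return probability >= 0.5
    return rule


# Placeholder models, one per decision step (all coefficients are hypothetical).
is_red = make_logistic_rule([1.0, 0.0], 0.0)
is_beef = make_logistic_rule([0.0, 1.0], 0.0)
is_pork = make_logistic_rule([0.0, 1.0], 0.0)
is_chicken = make_logistic_rule([1.0, 1.0], 1.0)


def classify_meat(scores):
    """Walk the decision hierarchy: red/white, then species within each branch."""
    if is_red(scores):
        return "beef" if is_beef(scores) else "lamb"
    if is_pork(scores):
        return "pork"
    # Assumed extra step: split poultry into chicken and turkey.
    return "chicken" if is_chicken(scores) else "turkey"


labels = [
    classify_meat([2.0, -1.0]),
    classify_meat([-2.0, 1.0]),
    classify_meat([-2.0, -1.0]),
    classify_meat([-0.5, -0.2]),
]
```

Because each step has its own model, each rule could in principle be fitted on differently transformed data, which matches the flexibility the abstract mentions.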

MEAT SPECIATION USING A HIERARCHICAL APPROACH AND LOGISTIC REGRESSION

  • Arnalds, Thosteinn;Fearn, Tom;Downey, Gerard
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference, 2001.06a, pp.1152-1152, 2001
  • Food adulteration is a serious consumer fraud and a matter of concern to food processors and regulatory agencies. A range of analytical methods have been investigated to facilitate the detection of adulterated or mis-labelled foods & food ingredients, but most of these require sophisticated equipment and highly qualified staff and are time-consuming. Regulatory authorities and the food industry require a screening technique which will facilitate fast and relatively inexpensive monitoring of food products with a high level of accuracy. Near infrared spectroscopy has been investigated for its potential in a number of authenticity issues including meat speciation (McElhinney, Downey & Fearn (1999) JNIRS, 7(3), 145-154; Downey, McElhinney & Fearn (2000). Appl. Spectrosc. 54(6), 894-899). This report describes further analysis of these spectral sets using a hierarchical approach and binary decisions solved using logistic regression. The sample set comprised 230 homogenized meat samples, i.e. chicken (55), turkey (54), pork (55), beef (32) and lamb (34), purchased locally as whole cuts of meat over a 10-12 week period. NIR reflectance spectra were recorded over the wavelength range 400-2498 nm at 2 nm intervals on a NIR Systems 6500 scanning monochromator. The problem was defined as a series of binary decisions, i.e. is the meat red or white? is the red meat beef or lamb? is the white meat pork or poultry? etc. Each of these decisions was made using an individual binary logistic model based on scores derived from principal component or partial least squares (PLS1 and PLS2) analysis. The results obtained were equal to or better than previous reports using factorial discriminant analysis, K-nearest neighbours and PLS2 regression. This new approach using a combination of exploratory and logistic analyses also appears to have advantages of transparency and the use of inherent structure in the spectral data. Additionally, it allows for the use of different data transforms and multivariate regression techniques at each decision step.

A GA-based Binary Classification Method for Bankruptcy Prediction (도산예측을 위한 유전 알고리듬 기반 이진분류기법의 개발)

  • Min, Jae-H.;Jeong, Chul-Woo
    • Journal of the Korean Operations Research and Management Science Society, v.33 no.2, pp.1-16, 2008
  • The purpose of this paper is to propose a new binary classification method for predicting corporate failure based on a genetic algorithm, and to validate its predictive power through empirical analysis. The proposed method establishes virtual companies representing bankrupt and non-bankrupt companies respectively, measures the similarity between these virtual companies and the subject of prediction, and classifies the subject as either bankrupt or non-bankrupt. The values of the classification variables of the virtual companies and the weights of the variables are determined, using a genetic algorithm, so that the model maximizes the hit ratio on the training data set. To test the validity of the proposed method, we compare its prediction accuracy with that of existing methods such as multiple discriminant analysis, logistic regression, decision tree, and artificial neural network, and show that the proposed binary classification method can serve as a promising alternative to the existing methods for bankruptcy prediction.

Exploring interaction using 3-D residual plots in logistic regression model (3차원 잔차산점도를 이용한 로지스틱회귀모형에서 교호작용의 탐색)

  • Kahng, Myung-Wook
    • Journal of the Korean Data and Information Science Society, v.25 no.1, pp.177-185, 2014
  • Under bivariate normal distribution assumptions, the interaction and quadratic terms are needed in the logistic regression model with two predictors. However, depending on the correlation coefficient and the variances of two conditional distributions, the interaction and quadratic terms may not be necessary. Although the need for these terms can be determined by comparing the two scatter plots, it is not as useful for interaction terms. We explore the structure and usefulness of the 3-D residual plot as a tool for dealing with interaction in logistic regression models. If predictors have an interaction effect, a 3-D residual plot can show the effect. This is illustrated by simulated and real data.
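
The opening claim of this abstract follows from the standard log-odds calculation for normal predictors, sketched here under assumed notation that is not taken from the paper: given class $y = k$ ($k = 0, 1$), the predictors $x = (x_1, x_2)^{\top}$ follow $N_2(\mu_k, \Sigma_k)$ with standard deviations $\sigma_{k1}, \sigma_{k2}$ and correlation $\rho_k$, and $c$ collects every term that does not depend on $x$.

$$
\log\frac{P(y = 1 \mid x)}{P(y = 0 \mid x)}
= c + x^{\top}\left(\Sigma_1^{-1}\mu_1 - \Sigma_0^{-1}\mu_0\right)
+ \tfrac{1}{2}\, x^{\top}\left(\Sigma_0^{-1} - \Sigma_1^{-1}\right) x
$$

Expanding the quadratic form gives the coefficients of the interaction and squared terms:

$$
\beta_{12} = \frac{\rho_1}{\sigma_{11}\sigma_{12}\left(1 - \rho_1^{2}\right)} - \frac{\rho_0}{\sigma_{01}\sigma_{02}\left(1 - \rho_0^{2}\right)},
\qquad
\beta_{jj} = \frac{1}{2}\left[\frac{1}{\sigma_{0j}^{2}\left(1 - \rho_0^{2}\right)} - \frac{1}{\sigma_{1j}^{2}\left(1 - \rho_1^{2}\right)}\right], \quad j = 1, 2
$$

Each coefficient vanishes when its two terms are equal. In particular, when $\Sigma_0 = \Sigma_1$ all three are zero and the logit is linear in $x_1$ and $x_2$, which is why the need for these terms depends on the correlations and variances of the two conditional distributions.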