• Title/Summary/Keyword: Binary Logistic Regression

Semiparametric kernel logistic regression with longitudinal data

  • Shim, Joo-Yong;Seok, Kyung-Ha
    • Journal of the Korean Data and Information Science Society
    • /
    • v.23 no.2
    • /
    • pp.385-392
    • /
    • 2012
  • Logistic regression is a well known binary classification method in the field of statistical learning. Mixed-effect regression models are widely used for the analysis of correlated data such as those found in longitudinal studies. We consider kernel extensions with semiparametric fixed effects and parametric random effects for the logistic regression. The estimation is performed through the penalized likelihood method based on kernel trick, and our focus is on the efficient computation and the effective hyperparameter selection. For the selection of optimal hyperparameters, cross-validation techniques are employed. Numerical results are then presented to indicate the performance of the proposed procedure.

Analysis of Decision Factors on the Participation of Scaling Project for Private Forest Management using a Logit Model (로짓모형을 이용한 산주의 사유림 경영 규모화 사업 참여 결정요인 분석)

  • Kim, Ki Dong
    • Journal of Korean Society of Forest Science
    • /
    • v.105 no.3
    • /
    • pp.360-365
    • /
    • 2016
  • This study aims to provide basic information for the early implementation and expansion of the project to scale up private forest management, one of the plans to revitalize private forest management, by identifying the characteristics of forest owners that influence participation in the project. To this end, a questionnaire survey of 373 forest owners was conducted and analyzed by binary logistic regression. The explanatory variables included gender, age, education level, occupation, income, residence, purpose of forest ownership, and cooperative membership status. Of the 373 forest owners, 267 (71.6%) intended to participate in the scaling project for private forest management, and the remaining 106 (28.4%) were unwilling to participate. According to the binary logistic regression, the most important variables affecting participation in the project were age, occupation, and purpose of forest ownership.

MEAT SPECIATION USING A HIERARCHICAL APPROACH AND LOGISTIC REGRESSION

  • Arnalds, Thosteinn;Fearn, Tom;Downey, Gerard
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference
    • /
    • 2001.06a
    • /
    • pp.1245-1245
    • /
    • 2001
  • Food adulteration is a serious consumer fraud and a matter of concern to food processors and regulatory agencies. A range of analytical methods have been investigated to facilitate the detection of adulterated or mis-labelled foods and food ingredients, but most of these require sophisticated equipment and highly qualified staff and are time-consuming. Regulatory authorities and the food industry require a screening technique which will facilitate fast and relatively inexpensive monitoring of food products with a high level of accuracy. Near infrared spectroscopy has been investigated for its potential in a number of authenticity issues including meat speciation (McElhinney, Downey & Fearn (1999) JNIRS, 7(3), 145-154; Downey, McElhinney & Fearn (2000) Appl. Spectrosc. 54(6), 894-899). This report describes further analysis of these spectral sets using a hierarchical approach and binary decisions solved using logistic regression. The sample set comprised 230 homogenized meat samples, i.e. chicken (55), turkey (54), pork (55), beef (32) and lamb (34), purchased locally as whole cuts of meat over a 10-12 week period. NIR reflectance spectra were recorded over the wavelength range 400-2498 nm at 2 nm intervals on a NIR Systems 6500 scanning monochromator. The problem was defined as a series of binary decisions, i.e. is the meat red or white? Is the red meat beef or lamb? Is the white meat pork or poultry? etc. Each of these decisions was made using an individual binary logistic model based on scores derived from principal component or partial least squares (PLS1 and PLS2) analysis. The results obtained were equal to or better than previous reports using factorial discriminant analysis, K-nearest neighbours and PLS2 regression. This new approach using a combination of exploratory and logistic analyses also appears to have advantages of transparency and the use of inherent structure in the spectral data. Additionally, it allows for the use of different data transforms and multivariate regression techniques at each decision step.
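
The cascade of binary decisions described in this abstract can be sketched as a small tree whose nodes are binary logistic models. This is only an illustration: the two-component score vectors, all weights and biases, and the final chicken/turkey split (the abstract only says "etc.") are assumptions, not values or structure taken from the paper.

```python
import math

def logistic(z):
    """Standard logistic function mapping a linear predictor to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def make_model(weights, bias):
    """Return a binary logistic model giving P(first branch | scores).

    The scores stand in for principal component or PLS scores; the
    weights and bias are hypothetical, not fitted values.
    """
    def prob(scores):
        return logistic(bias + sum(w * s for w, s in zip(weights, scores)))
    return prob

# Each internal node is (model, branch if p >= 0.5, branch if p < 0.5).
# A plain string is a leaf, i.e. a final species label.
TREE = (
    make_model([2.0, 0.0], 0.0),                      # red vs white
    (make_model([0.0, 3.0], 0.0), "beef", "lamb"),    # red: beef vs lamb
    (
        make_model([0.0, -3.0], 0.0),                 # white: pork vs poultry
        "pork",
        (make_model([1.0, 1.0], 0.5), "chicken", "turkey"),  # assumed split
    ),
)

def classify(node, scores):
    """Walk the tree, making one binary logistic decision per level."""
    while not isinstance(node, str):
        model, first, second = node
        node = first if model(scores) >= 0.5 else second
    return node

print(classify(TREE, [1.0, 1.0]))
print(classify(TREE, [-1.0, 1.0]))
```

Because every node owns its model, each decision step could use its own preprocessing and score type, which is the flexibility the abstract points to.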


MEAT SPECIATION USING A HIERARCHICAL APPROACH AND LOGISTIC REGRESSION

  • Arnalds, Thosteinn;Fearn, Tom;Downey, Gerard
    • Proceedings of the Korean Society of Near Infrared Spectroscopy Conference
    • /
    • 2001.06a
    • /
    • pp.1152-1152
    • /
    • 2001
  • Food adulteration is a serious consumer fraud and a matter of concern to food processors and regulatory agencies. A range of analytical methods have been investigated to facilitate the detection of adulterated or mis-labelled foods and food ingredients, but most of these require sophisticated equipment and highly qualified staff and are time-consuming. Regulatory authorities and the food industry require a screening technique which will facilitate fast and relatively inexpensive monitoring of food products with a high level of accuracy. Near infrared spectroscopy has been investigated for its potential in a number of authenticity issues including meat speciation (McElhinney, Downey & Fearn (1999) JNIRS, 7(3), 145-154; Downey, McElhinney & Fearn (2000) Appl. Spectrosc. 54(6), 894-899). This report describes further analysis of these spectral sets using a hierarchical approach and binary decisions solved using logistic regression. The sample set comprised 230 homogenized meat samples, i.e. chicken (55), turkey (54), pork (55), beef (32) and lamb (34), purchased locally as whole cuts of meat over a 10-12 week period. NIR reflectance spectra were recorded over the wavelength range 400-2498 nm at 2 nm intervals on a NIR Systems 6500 scanning monochromator. The problem was defined as a series of binary decisions, i.e. is the meat red or white? Is the red meat beef or lamb? Is the white meat pork or poultry? etc. Each of these decisions was made using an individual binary logistic model based on scores derived from principal component or partial least squares (PLS1 and PLS2) analysis. The results obtained were equal to or better than previous reports using factorial discriminant analysis, K-nearest neighbours and PLS2 regression. This new approach using a combination of exploratory and logistic analyses also appears to have advantages of transparency and the use of inherent structure in the spectral data. Additionally, it allows for the use of different data transforms and multivariate regression techniques at each decision step.


Forecasting Probability of Precipitation Using Markov Logistic Regression Model

  • Park, Jeong-Soo;Kim, Yun-Seon
    • Communications for Statistical Applications and Methods
    • /
    • v.14 no.1
    • /
    • pp.1-9
    • /
    • 2007
  • A three-state Markov logistic regression model is suggested to forecast the probability of tomorrow's precipitation based on the current meteorological situation. The suggested model turns out to be better than the Markov regression model in terms of the mean squared error of forecasting for the rainfall data of the Seoul area.

Effect of zero imputation methods for log-transformation of independent variables in logistic regression

  • Seo Young Park
    • Communications for Statistical Applications and Methods
    • /
    • v.31 no.4
    • /
    • pp.409-425
    • /
    • 2024
  • Logistic regression models are commonly used in medical science and public health research to explain a binary health outcome variable using independent variables such as patient characteristics. Although logistic regression requires no distributional assumption for the independent variables, variables with severely right-skewed distributions, such as lab values, are often log-transformed to achieve symmetry or approximate normality. However, lab values often contain zeros due to the limit of detection, which makes it impossible to apply the log transformation. Therefore, preprocessing to handle zeros in the observations before log transformation is necessary. In this study, five methods that remove zeros (shift by 1, shift by half of the smallest nonzero, shift by the square root of the smallest nonzero, replace zeros with half of the smallest nonzero, replace zeros with the square root of the smallest nonzero) are investigated in the logistic regression setting. To evaluate the performance of these methods, we performed a simulation study based on data randomly generated from a log-normal distribution and a logistic regression model. The shift-by-1 method had the worst performance; overall, the shift by half of the smallest nonzero, replace zeros with half of the smallest nonzero, and replace zeros with the square root of the smallest nonzero methods showed comparable and stable performance.
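
The five zero-handling methods listed in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `handle_zeros`, the method labels, and the sample lab values are all hypothetical.

```python
import math

METHODS = (
    "shift_1",
    "shift_half_min",
    "shift_sqrt_min",
    "replace_half_min",
    "replace_sqrt_min",
)

def handle_zeros(values, method):
    """Make nonnegative values strictly positive so that log() is defined.

    "Shift" methods add a constant to every observation; "replace"
    methods change only the zeros and leave other values untouched.
    """
    positives = [v for v in values if v > 0]
    if not positives:
        raise ValueError("need at least one nonzero value")
    m = min(positives)  # smallest nonzero observation
    if method == "shift_1":
        return [v + 1 for v in values]
    if method == "shift_half_min":
        return [v + m / 2 for v in values]
    if method == "shift_sqrt_min":
        return [v + math.sqrt(m) for v in values]
    if method == "replace_half_min":
        return [m / 2 if v == 0 else v for v in values]
    if method == "replace_sqrt_min":
        return [math.sqrt(m) if v == 0 else v for v in values]
    raise ValueError(f"unknown method: {method}")

# Hypothetical lab values with zeros from a limit of detection.
labs = [0.0, 0.0, 0.4, 1.6, 25.0]
logged = {
    name: [math.log(v) for v in handle_zeros(labs, name)]
    for name in METHODS
}
for name, vals in logged.items():
    print(name, [round(v, 3) for v in vals])
```

The log-transformed values would then enter the logistic regression as the independent variable in place of the raw lab value.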

Estimation of Asymmetric Bell Shaped Probability Curve using Logistic Regression (로지스틱 회귀모형을 이용한 비대칭 종형 확률곡선의 추정)

  • 박성현;김기호;이소형
    • The Korean Journal of Applied Statistics
    • /
    • v.14 no.1
    • /
    • pp.71-80
    • /
    • 2001
  • The logistic regression model is one of the most popular linear models for a binary response variable and is used to estimate probability functions. In many practical situations, the probability function can be expressed by a bell-shaped curve, and such a function can be estimated by a second-order logistic regression model. However, when the probability curve is asymmetric, estimates from a second-order logistic regression model may not be precise because the second-order model is a symmetric function. In addition, even if a second-order logistic regression model is used, the effect of the second-order term may not be easy to interpret. In this paper, to alleviate these problems, an estimation method for asymmetric probability curves based on a first-order logistic regression model and an iterative bisection method is proposed, and its performance is compared with that of a second-order logistic regression model in a simulation study.
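
The symmetry of the second-order model can be seen by completing the square. The single predictor $x$ and the coefficient symbols $\beta_0, \beta_1, \beta_2$ (with $\beta_2 \neq 0$) are notation assumed here for illustration, not taken from the paper.

```latex
\[
\operatorname{logit} p(x)
  = \beta_0 + \beta_1 x + \beta_2 x^2
  = \beta_2 \left( x + \frac{\beta_1}{2\beta_2} \right)^{2}
    + \beta_0 - \frac{\beta_1^{2}}{4\beta_2}
\]
\[
x_0 = -\frac{\beta_1}{2\beta_2}
\quad\Longrightarrow\quad
p(x_0 + d) = p(x_0 - d) \quad \text{for every } d .
\]
```

With $\beta_2 < 0$ the curve is bell shaped with its peak at $x_0$, and its two sides are mirror images, so an asymmetric probability curve cannot be reproduced exactly by this form.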


Evaluation of the Probability of Detection Surface for ODSCC in Steam Generator Tubes Using Multivariate Logistic Regression (다변량 로지스틱 회귀분석을 이용한 증기발생기 전열관 ODSCC의 POD곡면 분석)

  • Lee, Jae-Bong;Park, Jai-Hak;Kim, Hong-Deok;Chung, Han-Sub
    • Proceedings of the KSME Conference
    • /
    • 2007.05a
    • /
    • pp.250-255
    • /
    • 2007
  • Steam generator tubes play an important role in safety because they constitute one of the primary barriers between the radioactive and non-radioactive sides of a nuclear power plant. For this reason, the integrity of the tubes is essential in minimizing the possibility of radioactive water leakage. The integrity of the tubes is evaluated based on NDE (non-destructive evaluation) inspection results. In particular, the ECT (eddy current test) method is usually used to detect flaws in steam generator tubes. However, the detection capability of NDE is not perfect, and not all of the "real flaws" that actually exist in steam generator tubes are revealed by NDE results. Therefore, the reliability of the NDE system is an essential part of assessing the integrity of steam generators. In this study, the POD (probability of detection) of the ECT system for ODSCC in steam generator tubes is evaluated using multivariate logistic regression. The cracked tube specimens are made from withdrawn steam generator tubes, so the cracks are real rather than artificial. Using the multivariate logistic regression method, continuous POD surfaces are evaluated from hit (detection) and miss (no detection) binary data obtained from destructive and non-destructive evaluation of the cracked tubes. The length and depth of cracks are considered in the multivariate logistic regression, and their effects on detection capability are evaluated.
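
A POD surface of the kind described here can be sketched by evaluating a two-variable logistic model on a grid of crack sizes. The linear predictor in raw length and depth, the units, and the coefficients below are all assumptions for illustration; the paper may use different transforms and its fitted values are not given in the abstract.

```python
import math

def pod(length, depth, b0, b1, b2):
    """Probability of detection from a logistic model in length and depth.

    Assumed form: logit(POD) = b0 + b1 * length + b2 * depth.
    """
    eta = b0 + b1 * length + b2 * depth
    return 1.0 / (1.0 + math.exp(-eta))

# Hypothetical coefficients; in practice they would be estimated from
# hit (1) and miss (0) inspection records by maximum likelihood.
B0, B1, B2 = -4.0, 0.3, 0.05

lengths = (2, 5, 10, 20)    # assumed unit: mm
depths = (20, 40, 60, 80)   # assumed unit: percent of wall thickness

# Rows follow crack length and columns follow crack depth.
surface = [[pod(l, d, B0, B1, B2) for d in depths] for l in lengths]
for l, row in zip(lengths, surface):
    print(l, [round(p, 3) for p in row])
```

With positive coefficients on both variables, the surface rises monotonically in length and depth, which matches the expectation that larger cracks are easier to detect.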


Penalized logistic regression models for determining the discharge of dyspnea patients (호흡곤란 환자 퇴원 결정을 위한 벌점 로지스틱 회귀모형)

  • Park, Cheolyong;Kye, Myo Jin
    • Journal of the Korean Data and Information Science Society
    • /
    • v.24 no.1
    • /
    • pp.125-133
    • /
    • 2013
  • In this paper, penalized binary logistic regression models are employed as statistical models for determining the discharge of 668 patients with a chief complaint of dyspnea based on the results of 11 blood tests. Specifically, the ridge model based on the $L^2$ penalty and the Lasso model based on the $L^1$ penalty are considered. Prediction accuracy is compared against the logistic regression model with all 11 explanatory variables and the model with variables chosen by a variable selection method. The results show that, based on 10-fold cross-validation, the ridge logistic regression model has the best prediction accuracy among the four models.
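
For reference, the two penalized estimators can be written in a common form. The notation ($y_i$, $x_i$, $\beta$, $\lambda$) and the choice to leave the intercept $\beta_0$ unpenalized are the usual textbook conventions, assumed here rather than taken from the paper.

```latex
\[
\hat{\beta}
  = \arg\min_{\beta_0,\,\beta}
    \left\{ -\sum_{i=1}^{n} \left[ y_i \eta_i - \log\left(1 + e^{\eta_i}\right) \right]
            + \lambda P(\beta) \right\},
\qquad \eta_i = \beta_0 + x_i^{\top} \beta
\]
\[
P_{\text{ridge}}(\beta) = \sum_{j=1}^{p} \beta_j^{2},
\qquad
P_{\text{Lasso}}(\beta) = \sum_{j=1}^{p} \lvert \beta_j \rvert
\]
```

Here $\lambda \ge 0$ is the tuning parameter, typically chosen by cross-validation. The $L^1$ penalty can set some coefficients exactly to zero, while the $L^2$ penalty only shrinks them toward zero.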

A Comparative Study of Predictive Factors for Passing the National Physical Therapy Examination using Logistic Regression Analysis and Decision Tree Analysis

  • Kim, So Hyun;Cho, Sung Hyoun
    • Physical Therapy Rehabilitation Science
    • /
    • v.11 no.3
    • /
    • pp.285-295
    • /
    • 2022
  • Objective: The purpose of this study is to use logistic regression and decision tree analysis to identify the factors that affect success or failure in the national physical therapy examination, and to build and compare predictive models. Design: Secondary data analysis study. Methods: We analyzed 76,727 subjects from the physical therapy national examination data provided by the Korea Health Personnel Licensing Examination Institute. The target variable was pass or fail, and the input variables were gender, age, graduation status, and examination area. Frequency analysis, chi-square test, binary logistic regression, and decision tree analysis were performed on the data. Results: In the logistic regression analysis, subjects in their 20s (odds ratio, OR=1, reference), those expected to graduate (OR=13.616, p<0.001), and those from the examination area of Jeju-do (OR=3.135, p<0.001) had a high probability of passing. In the decision tree, the predictive factors with the greatest influence on the passing result were, in order, graduation status ($\chi^2$=12366.843, p<0.001) and examination area ($\chi^2$=312.446, p<0.001). Logistic regression analysis showed a specificity of 39.6% and a sensitivity of 95.5%, while decision tree analysis showed a specificity of 45.8% and a sensitivity of 94.7%. In classification accuracy, logistic regression and decision tree analysis showed 87.6% and 88.0%, respectively. Conclusions: Both logistic regression and decision tree analysis were adequate to explain the predictive model. Additionally, whether actual test takers would pass the national physical therapy examination could be predicted by applying the constructed prediction model and prediction rate.
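
The sensitivity, specificity, and accuracy figures quoted in this abstract are standard confusion-matrix quantities. The sketch below shows how they relate, using hypothetical counts rather than the study's data, and treating "pass" as the positive class (an assumption).

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion counts."""
    sensitivity = tp / (tp + fn)   # share of actual passes predicted as pass
    specificity = tn / (tn + fp)   # share of actual fails predicted as fail
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts in which most candidates pass.
tp, fn, tn, fp = 900, 100, 120, 80
sens, spec, acc = classification_metrics(tp, fn, tn, fp)

# Accuracy is the prevalence-weighted average of the other two metrics.
prevalence = (tp + fn) / (tp + fn + tn + fp)
weighted = prevalence * sens + (1 - prevalence) * spec

print(round(sens, 3), round(spec, 3), round(acc, 3), round(weighted, 3))
```

When one outcome dominates, accuracy sits close to sensitivity, so a model can report high accuracy while its specificity stays low.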