• Title/Summary/Keyword: Binary Logistic Model

Imputation for Binary or Ordered Categorical Traits Based on the Bayesian Threshold Model (베이지안 분계점 모형에 의한 순서 범주형 변수의 대체)

  • Lee Seung-Chun
    • The Korean Journal of Applied Statistics
    • /
    • v.18 no.3
    • /
    • pp.597-606
    • /
    • 2005
  • Nonresponse in sample surveys causes problems when analyzing public-use files, where the user has only complete-data methods available and limited information about the reasons for nonresponse. Imputation has recently become a standard approach for handling nonresponse, and various imputation methods have been devised. However, most imputation methods are concerned with continuous traits, while many features of interest in sample surveys are measured on binary or ordered categorical scales. This note considers an imputation method for ignorable nonresponse in binary or ordered categorical traits.

A Comparative Study of Predictive Factors for Passing the National Physical Therapy Examination using Logistic Regression Analysis and Decision Tree Analysis

  • Kim, So Hyun;Cho, Sung Hyoun
    • Physical Therapy Rehabilitation Science
    • /
    • v.11 no.3
    • /
    • pp.285-295
    • /
    • 2022
  • Objective: The purpose of this study is to use logistic regression and decision tree analysis to identify the factors that affect success or failure in the national physical therapy examination, and to build and compare predictive models. Design: Secondary data analysis study. Methods: We analyzed 76,727 subjects from the physical therapy national examination data provided by the Korea Health Personnel Licensing Examination Institute. The target variable was pass or fail, and the input variables were gender, age, graduation status, and examination area. Frequency analysis, chi-square test, binary logistic regression, and decision tree analysis were performed on the data. Results: In the logistic regression analysis, subjects in their 20s (odds ratio, OR=1, reference), those expected to graduate (OR=13.616, p<0.001), and those from the examination area of Jeju-do (OR=3.135, p<0.001) had a high probability of passing. In the decision tree, the factors with the greatest influence on the pass result were, in order, graduation status (χ²=12366.843, p<0.001) and examination area (χ²=312.446, p<0.001). Logistic regression analysis showed a specificity of 39.6% and a sensitivity of 95.5%, while decision tree analysis showed a specificity of 45.8% and a sensitivity of 94.7%. The classification accuracy of logistic regression and decision tree analysis was 87.6% and 88.0%, respectively. Conclusions: Both logistic regression and decision tree analysis were adequate to explain the predictive model. Additionally, whether actual test takers passed the national physical therapy examination could be determined by applying the constructed prediction model and prediction rate.
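
The sensitivity, specificity, and accuracy figures quoted in this abstract all derive from a 2x2 confusion matrix. The following is a minimal sketch of those definitions, treating "pass" as the positive class; the function name and the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of actual positives predicted positive
    specificity = tn / (tn + fp)  # share of actual negatives predicted negative
    accuracy = (tp + tn) / (tp + fn + tn + fp)  # share of all cases predicted correctly
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not the study's data).
sens, spec, acc = classification_metrics(tp=90, fn=10, tn=8, fp=12)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

When the positive class dominates the sample, accuracy can stay high even with low specificity, which is why the three figures are best read together.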

A Comparative Study of Predictive Factors for Hypertension using Logistic Regression Analysis and Decision Tree Analysis

  • SoHyun Kim;SungHyoun Cho
    • Physical Therapy Rehabilitation Science
    • /
    • v.12 no.2
    • /
    • pp.80-91
    • /
    • 2023
  • Objective: The purpose of this study is to identify factors that affect the incidence of hypertension using logistic regression and decision tree analysis, and to build and compare predictive models. Design: Secondary data analysis study. Methods: We analyzed 9,859 subjects from the Korean health panel annual 2019 data provided by the Korea Institute for Health and Social Affairs and National Health Insurance Service. Frequency analysis, chi-square test, binary logistic regression, and decision tree analysis were performed on the data. Results: In logistic regression analysis, those who were 60 years of age or older (odds ratio, OR=68.801, p<0.001), those who were divorced, widowed, or separated (OR=1.377, p<0.001), those whose education was middle school or lower (OR=1, reference), those who did not walk at all (OR=1, reference), those who were obese (OR=5.109, p<0.001), and those who had poor subjective health status (OR=2.163, p<0.001) were more likely to develop hypertension. In the decision tree, those over 60 years of age, overweight or obese, and with an education of middle school or lower had the highest probability of developing hypertension at 83.3%. Logistic regression analysis showed a specificity of 85.3% and a sensitivity of 47.9%, while decision tree analysis showed a specificity of 81.9% and a sensitivity of 52.9%. The classification accuracy of logistic regression and decision tree analysis was 73.6% and 72.6%, respectively. Conclusions: Both logistic regression and decision tree analysis were adequate to explain the predictive model. It is thought that both analysis methods can be useful for constructing a predictive model for hypertension.

A Probability Mapping for Land Cover Change Prediction using CLUE Model (토지피복변화 예측을 위한 CLUE 모델의 확률지도 생성)

  • Oh, Yun-Gyeong;Choi, Jin-Yong;Bae, Seung-Jong;Yoo, Seung-Hwan;Lee, Sang-Hyun
    • Journal of Korean Society of Rural Planning
    • /
    • v.16 no.2
    • /
    • pp.47-55
    • /
    • 2010
  • Land cover and land use change data are important in many studies, including climate change and hydrological studies. Although various theories and models have been developed, it is difficult to identify the driving factors of land use change because it is related to policy options and to natural and socio-economic conditions. This study attempts to simulate land cover change using the CLUE model, based on a statistical analysis of land use change. The CLUE model dynamically simulates competition among land uses from the relationship between driving forces and land use, so it depends on statistical relations between land use change and driving factors. Yongin, Icheon, and Anseong were selected as the study areas, and binary logistic regression and factor analysis were performed and verified with ROC curves. A land cover probability map was also prepared for comparison with the land cover data; the higher-probability areas matched the present land cover well, demonstrating the applicability of the CLUE model.

Optimization Method of Knapsack Problem Based on BPSO-SA in Logistics Distribution

  • Zhang, Yan;Wu, Tengyu;Ding, Xiaoyue
    • Journal of Information Processing Systems
    • /
    • v.18 no.5
    • /
    • pp.665-676
    • /
    • 2022
  • In modern logistics, effective use of vehicle volume and loading capacity reduces logistics cost. Many heuristic algorithms can solve this knapsack problem, but many of them share a drawback: they often fall into locally optimal solutions. This paper proposes a fusion optimization method based on the simulated annealing algorithm (SA) and the binary particle swarm optimization algorithm (BPSO). We establish a logistics knapsack model for the fusion optimization algorithm. A new model of an express logistics simulation system is then used to compare three algorithms. The experiment verifies the effectiveness of the proposed algorithm. The experimental results show that the BPSO-SA algorithm can improve the utilization rate and the load rate of logistics distribution vehicles, so the number of vehicles used for distribution and the average driving distance are reduced. The purposes of the logistics knapsack problem optimization are thus achieved.
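
The abstract does not spell out how BPSO and SA are fused, so the following is only a generic sketch of one common way to combine them for a 0/1 knapsack: a standard sigmoid-based binary PSO update, with a simulated-annealing acceptance rule applied to each particle's personal best. The function name, the coefficients (0.7, 1.5), the cooling rate, and the item data are all assumptions for illustration, not the authors' method.

```python
import math
import random


def bpso_sa_knapsack(values, weights, capacity, particles=20, iters=200, seed=0):
    """Generic BPSO with an SA-style acceptance rule (illustrative sketch only)."""
    rng = random.Random(seed)
    n = len(values)

    def fitness(bits):
        load = sum(w for w, b in zip(weights, bits) if b)
        if load > capacity:
            return 0  # overloaded selections score zero
        return sum(v for v, b in zip(values, bits) if b)

    pos = [[0] * n for _ in range(particles)]  # start from the empty, feasible load
    vel = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(particles)]
    pbest = [p[:] for p in pos]
    gbest = pbest[0][:]
    temp = 1.0  # assumed initial SA temperature
    for _ in range(iters):
        for i in range(particles):
            for d in range(n):
                r1, r2 = rng.random(), rng.random()
                v = (0.7 * vel[i][d]
                     + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                     + 1.5 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-6.0, min(6.0, v))  # clamp to keep the sigmoid stable
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][d]))
                pos[i][d] = 1 if rng.random() < prob_one else 0
            new_fit, old_fit = fitness(pos[i]), fitness(pbest[i])
            # SA step: always accept improvements, sometimes accept worse moves
            # so that a particle can leave a local optimum.
            if new_fit > old_fit or rng.random() < math.exp((new_fit - old_fit) / temp):
                pbest[i] = pos[i][:]
            if fitness(pbest[i]) > fitness(gbest):
                gbest = pbest[i][:]  # global best only ever improves
        temp = max(temp * 0.98, 1e-6)  # geometric cooling
    return gbest, fitness(gbest)


# Hypothetical cargo: item values, item weights, and vehicle capacity.
item_values = [60, 100, 120]
item_weights = [10, 20, 30]
best_bits, best_value = bpso_sa_knapsack(item_values, item_weights, capacity=50)
print(best_bits, best_value)
```

Because the global best is replaced only by strictly better feasible selections, the returned load never exceeds the capacity, even though individual particles may wander through worse states.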

Compliance Level with Therapeutic Regimen of Medication and Life Style among Patients with Hypertension in Rural Communities (일 농촌지역 고혈압 환자의 치료적 요법의 이행수준 - 약물복용과 생활습관을 중심으로 -)

  • Ahn, Yang-Heui
    • Journal of Korean Public Health Nursing
    • /
    • v.21 no.2
    • /
    • pp.125-133
    • /
    • 2007
  • Purpose: To identify the compliance level with therapeutic regimen among patients with hypertension residing in rural communities. Method: A descriptive-retrospective research design was employed. One hundred patients with hypertension using 8 Primary Health Care Posts under W Public Health Center were randomly recruited on the basis of being over 35 years of age. After obtaining written consent, the patients underwent direct interviews with a structured questionnaire carried out by 8 public health practitioners. Descriptive statistics and binary logistic regression were utilized. Results: In a binary logistic regression model adjusted for age, sex, education, income, and occupation, those who were receiving medication (OR=5.34), were undergoing a weight control program (OR=4.45), restricted alcohol (OR=9.93), or smoking cessation (OR=25.59) as recommended by medical or health professionals were more compliant (p<.05) while those under a low salt diet, exercise, and stress management were not significant statistically (p>.05). Conclusions: Further research should be conducted to validate these findings so as to facilitate the development of nursing intervention strategies for improving the compliance of hypertensive patients in respect to medication and life style modification.

Analysis of Stress level of Korean Household Members due to Household Debt (한국국민의 가계 금융부채에 대한 체감도 분석)

  • Oh, Man-Suk;Hyun, Seung-Me
    • The Korean Journal of Applied Statistics
    • /
    • v.22 no.2
    • /
    • pp.297-307
    • /
    • 2009
  • Korean household debt is one of the main sources of the current financial crisis. This paper studies the impact of household attributes, such as type of housing (owned or rented), education, age, average monthly income of the head of household, and area of residence, on the stress level of household members due to household debt. We analyze a real data set collected by KB Kookmin Bank in 2004. We treat low and high stress level as a binary response variable and use a logistic regression model with the attributes of household members as explanatory variables. A simple but well-fitting model is selected by backward elimination based on the likelihood statistic for the goodness-of-fit test, and the impact of the attributes on the stress level is studied from the parameter estimates of the selected model. We also perform a similar analysis on a binary response variable that distinguishes households with no debt from the rest. According to the analysis, the stress level tends to be low for households that own their homes and have high average monthly income, low education level, and young members.

Characteristics and Influencing Factors of Red Light Running (RLR) Crashes (신호위반사고의 특성과 영향요인 분석)

  • Park, Jeong Soon;Jung, Yong Il;Kim, Yun Hwan
    • Journal of Korean Society of Transportation
    • /
    • v.32 no.3
    • /
    • pp.198-206
    • /
    • 2014
  • According to the statistics of the National Police Agency, red light running (RLR) crashes represent a significant safety issue throughout Korea. This study deals with RLR crashes that occurred at signalized intersections in Cheongju. The objectives of this study are to compare the characteristics of RLR crashes and non-RLR crashes, and to identify influencing factors using a Binary Logistic Regression (BLR) model. To this end, the study gives particular attention to testing the differences between the two groups using data on 2,246 RLR and 3,884 non-RLR crashes (2007-2011). The main results are as follows. First, many RLR crashes occurred at night and while the vehicle was going straight. Second, the differences between RLR and non-RLR crashes were clearly defined by crash type, maneuver of vehicle before crash, age of driver (30s, 50s), alcohol use, and accident pattern. Finally, a statistically significant model (Hosmer and Lemeshow test: 7.052, p-value: 0.531) was developed through the BLR model.
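
The Hosmer and Lemeshow statistic is referred to a chi-square distribution, and a large p-value indicates no evidence of lack of fit. As a rough check, the sketch below computes the upper-tail probability for the reported statistic, assuming 8 degrees of freedom (the conventional ten risk groups minus two). The abstract does not state the degrees of freedom, so that value is an assumption, and the helper name is hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total


# Reported statistic 7.052; df=8 is an assumption (ten groups minus two).
p_value = chi2_sf_even_df(7.052, df=8)
print(round(p_value, 3))  # 0.531
```

Under that assumption the result agrees with the p-value quoted in the abstract.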

Comparison Study of Multi-class Classification Methods

  • Bae, Wha-Soo;Jeon, Gab-Dong;Seok, Kyung-Ha
    • Communications for Statistical Applications and Methods
    • /
    • v.14 no.2
    • /
    • pp.377-388
    • /
    • 2007
  • Among multi-class classification methods, the ECOC (Error Correcting Output Coding) method is known to have a low classification error rate. This paper aims to suggest an effective multi-class classification method (1) by comparing various encoding and decoding methods within the ECOC method and (2) by comparing the ECOC method with direct classification. Both SVM (Support Vector Machine) and the logistic regression model were used as binary classifiers in the comparison.
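
For orientation, ECOC assigns each class a binary codeword, trains one binary classifier per codeword position, and decodes a prediction by choosing the class with the nearest codeword. The sketch below shows Hamming-distance decoding only, which is one of several possible decoding rules; the code matrix and function name are hypothetical and are not taken from the paper.

```python
def ecoc_decode(code_matrix, predicted_bits):
    """Return the class label whose codeword is nearest in Hamming distance."""
    def hamming(a, b):
        return sum(x != y for x, y in zip(a, b))

    return min(code_matrix, key=lambda label: hamming(code_matrix[label], predicted_bits))


# Hypothetical 4-class code using 7 binary classifiers.
# Every pair of codewords differs in 4 positions, so one wrong bit is corrected.
code_matrix = {
    "A": (1, 1, 1, 1, 1, 1, 1),
    "B": (0, 0, 0, 0, 1, 1, 1),
    "C": (0, 0, 1, 1, 0, 0, 1),
    "D": (0, 1, 0, 1, 0, 1, 0),
}

# Outputs of the 7 binary classifiers: the codeword for "C" with the first bit flipped.
decoded = ecoc_decode(code_matrix, (1, 0, 1, 1, 0, 0, 1))
print(decoded)  # C
```

Each column of the code matrix defines one binary problem, which is where a binary classifier such as SVM or logistic regression would be plugged in.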

A GA-based Classification Model for Predicting Consumer Choice (유전 알고리듬 기반 제품구매예측 모형의 개발)

  • Min, Jae-H.;Jeong, Chul-Woo
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.34 no.3
    • /
    • pp.29-41
    • /
    • 2009
  • The purpose of this paper is to develop a new classification method for predicting consumer choice based on a genetic algorithm, and to validate its prediction power against existing methods. To this end, we propose a hybrid model and discuss its methodological characteristics in comparison with other existing classification methods. We also conduct a series of experiments using survey data on consumer choices of MP3 players to assess the prediction power of the model. The results show that the suggested model is statistically superior to existing methods such as the logistic regression model, the artificial neural network model, and the decision tree model in terms of prediction accuracy. The model is also shown to have the advantage of providing strategic information of practical use for consumer choice.