• Title/Summary/Keyword: Predictive decision tree


Investment, Export, and Exchange Rate on Prediction of Employment with Decision Tree, Random Forest, and Gradient Boosting Machine Learning Models (투자와 수출 및 환율의 고용에 대한 의사결정 나무, 랜덤 포레스트와 그래디언트 부스팅 머신러닝 모형 예측)

  • Chae-Deug Yi
    • Korea Trade Review / v.46 no.2 / pp.281-299 / 2021
  • This paper analyzes the feasibility of using machine learning methods to forecast employment. Decision tree and artificial neural network models, together with ensemble models such as random forest and gradient boosting regression trees, were used to forecast employment in the Busan regional economy. The comparison of their predictive abilities yielded the following main findings. First, machine learning methods can predict employment well. Second, the employment forecasts of the decision tree models differed somewhat according to the depth of the trees. Third, the artificial neural network model, however, did not show high predictive power. Fourth, ensemble models such as random forest and gradient boosting regression trees showed higher predictive power. Thus, since machine learning methods can predict employment accurately, we need to use them to improve the accuracy of employment forecasts.
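
The abstract above notes that decision tree forecasts change with tree depth. A minimal sketch of that effect follows, using a hand-written one-variable regression tree. The series `xs`/`ys` and the helper names `fit_tree`, `sum_sq`, and `predict` are hypothetical and are not taken from the paper.

```python
def sum_sq(ys):
    """Sum of squared deviations from the mean."""
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def fit_tree(xs, ys, max_depth):
    """Fit a 1-D regression tree; a leaf is a float mean, a node is (threshold, left, right)."""
    if max_depth == 0 or len(set(xs)) < 2:
        return sum(ys) / len(ys)
    best = None
    uniq = sorted(set(xs))
    for lo, hi in zip(uniq, uniq[1:]):
        t = (lo + hi) / 2.0  # candidate threshold between neighbouring values
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        sse = sum_sq(left) + sum_sq(right)
        if best is None or sse < best[0]:
            best = (sse, t)
    t = best[1]
    l = [(x, y) for x, y in zip(xs, ys) if x <= t]
    r = [(x, y) for x, y in zip(xs, ys) if x > t]
    return (t,
            fit_tree([x for x, _ in l], [y for _, y in l], max_depth - 1),
            fit_tree([x for x, _ in r], [y for _, y in r], max_depth - 1))

def predict(tree, x):
    """Walk down the tree until a leaf (a float) is reached."""
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if x <= t else right
    return tree

# Hypothetical series, not data from the paper.
xs = list(range(12))
ys = [2.0, 2.5, 3.0, 3.5, 6.0, 6.5, 7.0, 7.5, 11.0, 11.5, 12.0, 12.5]

errors = {}
for depth in (1, 2, 3):
    tree = fit_tree(xs, ys, depth)
    preds = [predict(tree, x) for x in xs]
    errors[depth] = (sum((y - p) ** 2 for y, p in zip(ys, preds)) / len(ys)) ** 0.5
    print(f"depth={depth} training RMSE={errors[depth]:.3f}")
```

The training error can only fall as depth grows, so this illustrates sensitivity to depth rather than better out-of-sample forecasts; a held-out set would be needed to judge that.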

A Comparative Study of Predictive Factors for Passing the National Physical Therapy Examination using Logistic Regression Analysis and Decision Tree Analysis

  • Kim, So Hyun;Cho, Sung Hyoun
    • Physical Therapy Rehabilitation Science / v.11 no.3 / pp.285-295 / 2022
  • Objective: The purpose of this study is to use logistic regression and decision tree analysis to identify the factors that affect success or failure in the national physical therapy examination, and to build and compare predictive models. Design: Secondary data analysis study. Methods: We analyzed 76,727 subjects from the national physical therapy examination data provided by the Korea Health Personnel Licensing Examination Institute. The target variable was pass or fail, and the input variables were gender, age, graduation status, and examination area. Frequency analysis, chi-square test, binary logistic regression, and decision tree analysis were performed on the data. Results: In the logistic regression analysis, subjects in their 20s (odds ratio, OR=1, reference), those expected to graduate (OR=13.616, p<0.001), and those from the examination area of Jeju-do (OR=3.135, p<0.001) had a high probability of passing. In the decision tree, the factors with the greatest influence on the examination result were, in order, graduation status (χ²=12366.843, p<0.001) and examination area (χ²=312.446, p<0.001). Logistic regression analysis showed a specificity of 39.6% and a sensitivity of 95.5%, while decision tree analysis showed a specificity of 45.8% and a sensitivity of 94.7%. The classification accuracy of logistic regression and decision tree analysis was 87.6% and 88.0%, respectively. Conclusions: Both logistic regression and decision tree analysis were adequate to explain the predictive model. Additionally, whether actual test takers pass the national physical therapy examination could be determined by applying the constructed prediction model and prediction rate.
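
The sensitivity, specificity, and classification accuracy reported above all derive from a confusion matrix. The following sketch shows the arithmetic under the assumption that "pass" is the positive class. The counts are hypothetical and do not reproduce the study's data, and `classification_metrics` is an illustrative name.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of actual positives predicted positive
    specificity = tn / (tn + fp)  # share of actual negatives predicted negative
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts, not taken from the study.
sens, spec, acc = classification_metrics(tp=90, fn=10, tn=40, fp=60)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

When one class dominates, accuracy can stay high even if specificity is low, which is why the three figures are usually reported together.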

A Comparative Study of Predictive Factors for Hypertension using Logistic Regression Analysis and Decision Tree Analysis

  • SoHyun Kim;SungHyoun Cho
    • Physical Therapy Rehabilitation Science / v.12 no.2 / pp.80-91 / 2023
  • Objective: The purpose of this study is to identify factors that affect the incidence of hypertension using logistic regression and decision tree analysis, and to build and compare predictive models. Design: Secondary data analysis study. Methods: We analyzed 9,859 subjects from the 2019 annual data of the Korea Health Panel, provided by the Korea Institute for Health and Social Affairs and the National Health Insurance Service. Frequency analysis, chi-square test, binary logistic regression, and decision tree analysis were performed on the data. Results: In logistic regression analysis, those who were 60 years of age or older (odds ratio, OR=68.801, p<0.001), those who were divorced, widowed, or separated (OR=1.377, p<0.001), those whose education was middle school or lower (OR=1, reference), those who did not walk at all (OR=1, reference), those who were obese (OR=5.109, p<0.001), and those who had poor subjective health status (OR=2.163, p<0.001) were more likely to develop hypertension. In the decision tree, those who were over 60 years of age, overweight or obese, and educated to middle school level or lower had the highest probability of developing hypertension, at 83.3%. Logistic regression analysis showed a specificity of 85.3% and a sensitivity of 47.9%, while decision tree analysis showed a specificity of 81.9% and a sensitivity of 52.9%. The classification accuracy of logistic regression and decision tree analysis was 73.6% and 72.6%, respectively. Conclusions: Both logistic regression and decision tree analysis were adequate to explain the predictive model. Both analysis methods are considered useful for constructing a predictive model for hypertension.
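
The odds ratios quoted above come from a fitted logistic regression and are therefore adjusted for the other inputs. As a rough illustration of what an odds ratio measures, the sketch below computes the unadjusted version from a 2x2 table. The counts and the function name `odds_ratio` are hypothetical.

```python
def odds_ratio(exposed_cases, exposed_noncases, unexposed_cases, unexposed_noncases):
    """Unadjusted odds ratio: odds of the outcome among the exposed over odds among the unexposed."""
    odds_exposed = exposed_cases / exposed_noncases
    odds_unexposed = unexposed_cases / unexposed_noncases
    return odds_exposed / odds_unexposed

# Hypothetical counts, not taken from the study.
or_value = odds_ratio(exposed_cases=30, exposed_noncases=70,
                      unexposed_cases=10, unexposed_noncases=90)
print(f"OR={or_value:.3f}")
```

A reference category has OR=1 by construction, which is why some groups in the abstract are listed with "OR=1, reference".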

Development and Evaluation of Electronic Health Record Data-Driven Predictive Models for Pressure Ulcers (전자건강기록 데이터 기반 욕창 발생 예측모델의 개발 및 평가)

  • Park, Seul Ki;Park, Hyeoun-Ae;Hwang, Hee
    • Journal of Korean Academy of Nursing / v.49 no.5 / pp.575-585 / 2019
  • Purpose: The purpose of this study was to develop predictive models for pressure ulcer incidence using electronic health record (EHR) data and to compare their predictive validity performance indicators with those of the Braden Scale used in the study hospital. Methods: A retrospective case-control study was conducted in a tertiary teaching hospital in Korea. Data of 202 pressure ulcer patients and 14,705 non-pressure ulcer patients admitted between January 2015 and May 2016 were extracted from the EHRs. Three predictive models for pressure ulcer incidence were developed using logistic regression, Cox proportional hazards regression, and decision tree modeling. The predictive validity performance indicators of the three models were compared with those of the Braden Scale. Results: The logistic regression model was the most efficient, with an area under the receiver operating characteristic curve (AUC) of 0.97, followed by the decision tree model (AUC 0.95), the Cox proportional hazards regression model (AUC 0.95), and the Braden Scale (AUC 0.82). Decreased mobility was the most significant factor in the logistic regression and Cox proportional hazards models, and the endotracheal tube was the most important factor in the decision tree model. Conclusion: Predictive validity performance indicators of the Braden Scale were lower than those of the logistic regression, Cox proportional hazards regression, and decision tree models. The models developed in this study can be used to develop a clinical decision support system that automatically assesses risk for pressure ulcers to aid nurses.
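
The models above are compared by AUC. One standard way to read AUC is as the probability that a randomly chosen positive case receives a higher risk score than a randomly chosen negative case. The sketch below computes it by direct pairwise comparison. The scores are hypothetical, and `auc` is an illustrative helper rather than the study's software.

```python
def auc(scores_pos, scores_neg):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))

# Hypothetical risk scores, not taken from the study.
pos = [0.9, 0.8, 0.4]       # patients who developed the outcome
neg = [0.7, 0.3, 0.2, 0.1]  # patients who did not
print(f"AUC={auc(pos, neg):.3f}")
```

The pairwise loop is quadratic in the number of cases, so for data sets of the size described in the abstract a rank-based computation would be the practical choice.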

A Study on the Development of Construction Dispute Predictive Analytics Model - Based on Decision Tree - (PA기법을 활용한 건설분쟁 예측모델 개발에 관한 연구 - 의사결정나무를 중심으로 -)

  • Jang, Se Rim;Kim, Han Soo
    • Korean Journal of Construction Engineering and Management / v.22 no.6 / pp.76-86 / 2021
  • Construction projects have a high potential for claims and disputes because of the inherent risks that arise when a variety of stakeholders are involved. Since disputes can cause losses in cost and time, it is critical for contractors to forecast and proactively manage disputes in order to secure project efficiency and higher profits. The objective of the study is to develop a decision tree-based predictive analytics model for forecasting dispute types and their probabilities according to construction project conditions. It can be a useful tool for forecasting potential disputes and thus provides opportunities for proactive management.

A Predictive Model of Depression in Rural Elders-Decision Tree Analysis (의사결정나무 분석기법을 이용한 농촌거주 노인의 우울예측모형 구축)

  • Kim, Seong Eun;Kim, Sun Ah
    • Journal of Korean Academy of Nursing / v.43 no.3 / pp.442-451 / 2013
  • Purpose: This descriptive study was done to develop a predictive model of depression in rural elders that will guide prevention and reduction of depression in elders. Methods: A cross-sectional descriptive survey was done using face-to-face private interviews. Participants included in the final analysis were 461 elders (aged $\geq 65$ years). The questions were on depression, personal and environmental factors, body functions and structures, and activity and participation. Decision tree analysis using the SPSS Modeler 14.1 program was applied to build an optimum and significant predictive model of depression in rural elders. Results: The predictive model for factors related to depression in rural elders comprised 4 pathways. Predictive factors included exercise capacity, self-esteem, farming, social activity, cognitive function, and gender. The accuracy of the model was 83.7%, error rate 16.3%, sensitivity 63.3%, and specificity 93.6%. Conclusion: The results of this study can be used as a theoretical basis for developing a systematic knowledge system for nursing and for developing a protocol that prevents depression in elders living in rural areas, thereby contributing to advanced depression prevention for elders.

Prediction Model for the Risk of Scapular Winging in Young Women Based on the Decision Tree

  • Gwak, Gyeong-tae;Ahn, Sun-hee;Kim, Jun-hee;Weon, Young-soo;Kwon, Oh-yun
    • Physical Therapy Korea / v.27 no.2 / pp.140-148 / 2020
  • Background: Scapular winging (SW) could be caused by tightness or weakness of the periscapular muscles. Although data mining techniques are useful in classifying or predicting the risk of musculoskeletal disorders, predictive models for such risk that use the results of clinical tests or quantitative data are scarce. Objectives: This study aimed to (1) investigate the difference between young women with and without SW, (2) establish a predictive model for the presence of SW, and (3) determine the cutoff value of each variable for predicting the risk of SW using the decision tree method. Methods: Fifty young female subjects participated in this study. To classify the presence of SW as the outcome variable, scapular protractor strength, elbow flexor strength, shoulder internal rotation, and whether the scapula was on the dominant or nondominant side were measured. Results: The classification tree selected scapular protractor strength, shoulder internal rotation range of motion, and whether the scapula was on the dominant or nondominant side as predictor variables. The classification tree model correctly classified 78.79% (p = 0.02) of the training data set. The accuracy obtained by the classification tree on the test data set was 82.35% (p = 0.04). Conclusion: The classification tree showed acceptable accuracy (82.35%) and high specificity (95.65%) but low sensitivity (54.55%). Based on the predictive model in this study, we suggest that a scapular protractor strength of 20% of body weight is a meaningful cutoff value for the presence of SW.
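
A classification tree obtains a cutoff value by scanning candidate thresholds on a variable and keeping the one that leaves the purest groups. The sketch below does this for a single variable using Gini impurity. The abstract does not name the splitting criterion, so Gini is an assumption here, and the predictor values, labels, and helper names are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_cutoff(values, labels):
    """Return (cutoff, weighted_gini) for the best split of the form value <= cutoff."""
    pairs = sorted(zip(values, labels))
    xs = [v for v, _ in pairs]
    best = (None, float("inf"))
    for i in range(1, len(pairs)):
        if xs[i] == xs[i - 1]:
            continue  # no threshold separates equal values
        cutoff = (xs[i] + xs[i - 1]) / 2.0
        left = [y for v, y in pairs if v <= cutoff]
        right = [y for v, y in pairs if v > cutoff]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if score < best[1]:
            best = (cutoff, score)
    return best

# Hypothetical predictor values and outcomes (1 = condition present), not study data.
values = [3, 5, 6, 8, 11, 12, 14, 15]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
cutoff, impurity = best_cutoff(values, labels)
print(f"cutoff={cutoff} weighted Gini={impurity}")
```

With real measurements the groups rarely separate perfectly, so the chosen cutoff minimizes impurity rather than eliminating it.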

Machine Learning and Deep Learning Models to Predict Income and Employment with Busan's Strategic Industry and Export (머신러닝과 딥러닝 기법을 이용한 부산 전략산업과 수출에 의한 고용과 소득 예측)

  • Chae-Deug Yi
    • Korea Trade Review / v.46 no.1 / pp.169-187 / 2021
  • This paper analyzes the feasibility of using machine learning and deep learning methods to forecast income and employment from the strategic industries as well as investment, exports, and exchange rates. Decision tree, artificial neural network, support vector machine, and deep learning models were used to forecast income and employment in Busan. The comparison of their predictive abilities yielded the following main findings. First, the decision tree models predict income and employment well. The forecasts for income and employment differed somewhat according to the depth of the decision trees and several conditions of the strategic industries as well as investment, exports, and exchange rates. Second, since the artificial neural network models have somewhat low coefficients of determination and somewhat high RMSE, they are not good at forecasting income and employment. Third, the support vector machine models show high predictive power, with high coefficients of determination and low RMSE. Fourth, the deep neural network models show higher predictive power with appropriate epochs and batch sizes. Thus, since machine learning and deep learning models can predict employment well, we need to adopt them to forecast income and employment.
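
The comparison above rests on two fit measures, the coefficient of determination and RMSE. A minimal sketch of both follows, using their standard definitions. The `actual` and `predicted` values are hypothetical and are not figures from the paper.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical values, not taken from the paper.
actual = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 12.0, 13.0, 16.0]
print(f"RMSE={rmse(actual, predicted):.3f} R^2={r_squared(actual, predicted):.3f}")
```

A model with higher R² and lower RMSE on the same data fits better, which is the sense in which the abstract ranks the models.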

A Development of Suicidal Ideation Prediction Model and Decision Rules for the Elderly: Decision Tree Approach (의사결정나무 기법을 이용한 노인들의 자살생각 예측모형 및 의사결정 규칙 개발)

  • Kim, Deok Hyun;Yoo, Dong Hee;Jeong, Dae Yul
    • The Journal of Information Systems / v.28 no.3 / pp.249-276 / 2019
  • Purpose: The purpose of this study is to develop a prediction model and decision rules for suicidal ideation among the elderly, based on the Korean Welfare Panel survey data. Using these data, we obtained many decision rules for predicting suicidal ideation in the elderly. Design/methodology/approach: This study used classification analysis based on the decision tree technique to derive the decision rules. Weka 3.8 was used as the data mining tool, with the J48 decision tree algorithm, also known as C4.5. The data were divided into learning data and verification data using a 66.6% split. We considered all possible variables from previous studies on predicting suicidal ideation in the elderly, and 99 variables, including the target variable, were finally used. Classification analysis was performed after applying backward elimination and a sampling technique for data balancing. Findings: There were significant differences between the data sets, and the selected data sets yielded different decision trees and rules. Based on the decision tree method, we derived rules for suicide prevention. The decision tree yields rules for suicidal ideation not only in the depressed group but also in the non-depressed group. In addition, in developing the predictive model, the problem of over-fitting caused by data imbalance was identified directly through the application of data balancing. We conclude that the data must be balanced on the target variable in order to perform a correct classification analysis without over-fitting. Furthermore, even with data balancing applied, the prediction rate was not inferior to that of a biased prediction model.
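
The abstract above describes balancing the target variable and dividing the data by percentage. The sketch below shows one way this could look, under two loud assumptions: that balancing is done by random undersampling of the majority class (the abstract does not name the method), and that 66.6% is the share used for learning. The rows and the helpers `undersample` and `percentage_split` are hypothetical and are not Weka functions.

```python
import random

def undersample(rows, label_of, seed=0):
    """Randomly drop majority-class rows until both classes have equal counts."""
    rng = random.Random(seed)
    pos = [r for r in rows if label_of(r) == 1]
    neg = [r for r in rows if label_of(r) == 0]
    n = min(len(pos), len(neg))
    balanced = rng.sample(pos, n) + rng.sample(neg, n)
    rng.shuffle(balanced)
    return balanced

def percentage_split(rows, train_fraction):
    """Split rows into (learning, verification) parts by position."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]

# Hypothetical data: 10 positive rows and 90 negative rows as (id, label).
rows = [(i, 1 if i < 10 else 0) for i in range(100)]
balanced = undersample(rows, label_of=lambda r: r[1])
train, verify = percentage_split(balanced, 0.666)
print(len(balanced), len(train), len(verify))
```

Undersampling discards information from the majority class, so with few minority cases an oversampling approach might be preferred; either choice changes the class proportions the tree sees during learning.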

A Prediction Model for studying the Impact of Separated Families on Students using Decision Tree

  • Ourida Ben boubaker;Ines Hosni;Hala Elhadidy
    • International Journal of Computer Science & Network Security / v.23 no.4 / pp.79-84 / 2023
  • Social studies show that the number of separated families has recently increased for various reasons. Whatever the cause of the family rift, it gives rise to many problems that affect children physically and psychologically. These effects may cause them to fail in life, especially at school. This paper examines the negative effect of parental separation, together with other factors, from a computer science perspective. Since artificial intelligence is one of the most widespread fields in computer science, a predictive model is built to predict whether a specific child whose parents have separated will complete school successfully or fail to continue his or her education. The model uses decision trees, which have proved effective in prediction applications. As an experiment, a randomly chosen sample of individuals is applied to the prediction model. The model shows that a child may still succeed at school after the separation if other factors are satisfied, such as the intelligence of the guardian, the relationship between the parents after the separation, and the child's age at the time of separation.