• Title/Summary/Keyword: Decision trees

A Study on the Employee Turnover Prediction using XGBoost and SHAP (XGBoost와 SHAP 기법을 활용한 근로자 이직 예측에 관한 연구)

  • Lee, Jae Jun;Lee, Yu Rin;Lim, Do Hyun;Ahn, Hyun Chul
    • The Journal of Information Systems / v.30 no.4 / pp.21-42 / 2021
  • Purpose: For companies to keep growing, they must properly manage human resources, which are the core of corporate competitiveness. Employee turnover means the loss of talent in the workforce. When an employee voluntarily leaves, the company loses its hiring and training investment, may lose key personnel, and incurs new costs to train a replacement. From the employee's viewpoint, moving to another company is also risky because it can be time-consuming and costly. Therefore, to reduce the social and economic costs of employee turnover, it is necessary to accurately predict turnover intention, identify the factors affecting it, and manage them appropriately within the company. Design/methodology/approach: Prior studies have mainly used logistic regression and decision trees, which have explanatory power but poor predictive accuracy. To develop a more accurate prediction model, XGBoost is proposed as the classification technique. Then, to compensate for its lack of explainability, SHAP, one of the XAI (explainable AI) techniques, is applied. As a result, the prediction accuracy of the proposed model improves on that of conventional methods such as logistic regression (LOGIT) and decision trees. By applying SHAP to the proposed model, the factors affecting overall employee turnover intention as well as a specific sample's turnover intention are identified. Findings: Experimental results show that the prediction accuracy of XGBoost is superior to that of logistic regression and decision trees. Using SHAP, we find that jobseeking, annuity, eng_test, comm_temp, seti_dev, seti_money, equl_ablt, and sati_safe significantly affect overall employee turnover intention. In addition, it is confirmed that the factors affecting an individual's turnover intention are more diverse. Our findings imply that companies should adopt a personalized approach for each employee in order to effectively prevent his or her turnover.
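
SHAP attributes a prediction to individual features using Shapley values. The following is a minimal sketch of that idea, not the paper's method: it computes exact Shapley values by enumerating feature coalitions for a hypothetical three-feature scoring function. The function `toy_score`, its weights, and the feature names are assumptions made for illustration. The paper's XGBoost model and the optimized tree algorithm used in practice are not reproduced here.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one sample by brute-force coalition enumeration.

    Features outside a coalition are replaced by their baseline value.
    The cost grows exponentially with the number of features.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        mixed = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for a coalition of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy "turnover score"; names and weights are illustrative only.
def toy_score(v):
    jobseeking, satisfaction, tenure = v
    return 0.6 * jobseeking - 0.3 * satisfaction + 0.2 * jobseeking * tenure


sample = [1.0, 0.5, 2.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, sample, baseline)
```

The contributions in `phi` sum to the difference between the sample's score and the baseline score, which is the property that lets SHAP explain both a single prediction and, when averaged over samples, overall feature importance.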

Computerization for Management of Street Tree Using CAD (CAD를 이용한 가로수 관리 전산화에 관한 연구)

  • 허상현;심경구
    • Journal of the Korean Institute of Landscape Architecture / v.29 no.2 / pp.68-76 / 2001
  • The purpose of this study is to computerize street tree management using a CAD program in order to manage the drawing records of street trees systematically and concurrently. The program consists of Reference Data, Data Inquiry, and Cost Assessment. The Reference Data includes tree characteristics, monthly management records, damage by blight and insects, and pesticide usage. The Data Inquiry includes an individual search of the tree index, simple searches, and multiple searches. The Cost Assessment has two main components: data input of labor cost, manure cost, and pesticide cost, and assessment of the management cost for prevention of blight and insects, pruning, and fertilization. The results of this study are as follows: 1) When street trees are transplanted or removed, the records are updated immediately to reflect the change. Because the tree management system is kept current, up-to-date information can be given to the manager for decision making. 2) To identify an individual tree on site or in a drawing, street names and numbers were used instead of coordinates, and a tag is attached to each street tree. This makes DB management simple and easy. 3) Simple or multiple searches of the constructed DB provide data quickly. 4) The results of such searches are very useful for assessing the management cost of items such as pruning, pesticide spraying, and fertilization. 5) By using AutoCAD software on an existing PC without purchasing new equipment, the cost of system implementation can be minimized.

Integrity Assessment for Reinforced Concrete Structures Using Fuzzy Decision Making (퍼지의사결정을 이용한 RC구조물의 건전성평가)

  • 손용우;정영채;김종길
    • Journal of the Computational Structural Engineering Institute of Korea / v.17 no.2 / pp.131-140 / 2004
  • Maintenance and management of reinforced concrete structures, such as repair and reinforcement, require fuzzy decision making for integrity assessment that considers both durability and load-carrying capacity. This study presents efficient models for reinforced concrete structures using CART-ANFIS. It compares and analyzes the decision trees of an expert system that applies fuzzy theory to damage diagnosis of reinforced concrete structures against the decision trees for integrity assessment built with an established artificial neural network. It also determines reinforcement design approaches for recovering the durability of damaged concrete and for increasing load-carrying capacity while maintaining stability in damage detection. Using the integrity assessment model of this study makes maintenance and management of reinforced concrete structures more efficient and allows life-cycle cost to be predicted.

Identification of Risky Subgroups with Sleep Problems Among Adult Cancer Survivors Using Decision-tree Analyses: Based on the Korean National Health and Nutrition Examination Survey from 2013 to 2016 (의사결정나무 분석을 이용한 성인 암경험자의 문제수면 위험군 예측: 2013-2016년도 국민건강영양조사 자료 분석)

  • Kim, Hee Sun;Jeong, Seok Hee;Park, Sook Kyoung
    • Journal of Korean Biological Nursing Science / v.20 no.2 / pp.103-113 / 2018
  • Purpose: This study was performed to assess problems associated with sleep (short and long sleep duration) and to identify risky subgroups with sleep problems among adult cancer survivors. The study is based on the Korea National Health and Nutrition Examination Survey (KNHANES VI and VII) from 2013 to 2016. Methods: The sociodemographic and clinical data of 504 Korean cancer survivors aged 20-64 years were extracted from the KNHANES VI and VII database. Descriptive statistics for complex samples were used, and decision-tree analyses were performed using the SPSS WIN 24.0 program. Results: The mean age of the survivors was approximately 51 years. The mean sleep duration was 6.97 hours; 36.2% of participants had short (< 7 hours) and 9.9% had long (> 8 hours) sleep duration. In the decision-tree analyses, the characteristics of the adult cancer survivors related to sleep problems were presented as six different pathways. Sleep problems were analyzed according to the survivors' sociodemographic information (age, education, living status, and occupation), clinical characteristics (body mass index, hypercholesterolemia, and anemia), and health-related quality of life (HRQoL). The HRQoL (cutoff point: $\leq 0.5$ vs. $> 0.5$) was a significant predictor of the participants' sleep problems because all six pathways started from this predictor in the model. Conclusion: Health care professionals could use the decision-tree model to screen adult cancer survivors for sleep problems in clinical or community settings. Nursing interventions that consider these specific individual characteristics and the HRQoL level should be developed to help Korean adult cancer survivors achieve adequate sleep duration.

A New Decision Tree Algorithm Based on Rough Set and Entity Relationship (러프셋 이론과 개체 관계 비교를 통한 의사결정나무 구성)

  • Han, Sang-Wook;Kim, Jae-Yearn
    • Journal of Korean Institute of Industrial Engineers / v.33 no.2 / pp.183-190 / 2007
  • We present a new decision tree classification algorithm using rough set theory that can induce classification rules; its construction is based on core attributes and the relationships between objects. Although decision trees have been widely used in machine learning and artificial intelligence, little research has focused on improving classification quality. We propose a new decision tree construction algorithm that is simpler and provides improved classification quality. We also compare the new algorithm with the ID3 algorithm in terms of the number of rules.
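
For context on the ID3 baseline mentioned above: ID3 chooses, at each node, the attribute with the highest information gain. The sketch below shows only that selection step on a hypothetical four-row dataset (the attribute names and labels are assumptions). It does not implement the paper's rough-set algorithm, which the abstract does not specify in enough detail to reproduce.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(rows, labels, attr):
    """Reduction in label entropy obtained by splitting on `attr`."""
    total = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attr], []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data, for illustration only.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

# ID3 would split on the attribute with the largest gain.
best_attribute = max(rows[0], key=lambda a: information_gain(rows, labels, a))
```

Here `outlook` separates the classes perfectly and `windy` carries no information, so ID3 would split on `outlook` first.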

Deciding the Optimal Shutdown Time Incorporating the Accident Forecasting Model (원자력 발전소 사고 예측 모형과 병합한 최적 운행중지 결정 모형)

  • Yang, Hee Joong
    • Journal of Korean Society of Industrial and Systems Engineering / v.41 no.4 / pp.171-178 / 2018
  • Recently, the continuing operation of nuclear power plants has become a major controversial issue in Korea. Whether to continue operating nuclear power plants must be determined by considering many factors, including social and political as well as economic ones. In this paper, however, we concentrate only on the economic factors in order to make an optimal decision on operating nuclear power plants. Decisions should be based on forecasts of plant accident risks and on data about large and small accidents at power plants. We outline the structure of a decision model that incorporates accident risks. We formulate the decision of whether to shut down permanently, shut down temporarily for maintenance, or operate for one period, and then periodically repeat the analysis and decision process with additional information about new costs and risks. A forecasting model that predicts nuclear power plant accidents is incorporated to improve decision making. We first build a one-period decision model and then extend it to a multi-period model. We use influence diagrams as well as decision trees for modeling, together with a Bayesian statistical approach. Many of the parameter values in this model may be set fairly subjectively by decision makers. Once the parameter values have been determined, the model presents the optimal decision for those values.
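
The multi-period structure described above (choose among permanent shutdown, temporary shutdown for maintenance, or one more period of operation, then repeat) can be sketched with backward induction. This is an illustration under stated assumptions, not the paper's model: every number is hypothetical, the accident rate is treated as known rather than forecast, and the assumption that maintenance multiplies the accident rate by a fixed factor is introduced here only to make the example concrete.

```python
from functools import lru_cache

# All values below are hypothetical; none come from the paper.
PROFIT = 100.0            # profit from operating one period
ACCIDENT_COST = 2000.0    # cost of one accident
MAINTENANCE_COST = 20.0   # cost of a temporary shutdown for maintenance
RATE_FACTOR = 0.25        # assumed effect of maintenance on the accident rate
HORIZON = 3               # number of decision periods


@lru_cache(maxsize=None)
def best(period, rate):
    """Return (expected value, best action) from `period` onward."""
    if period == HORIZON:
        return 0.0, "end"
    options = {
        # Permanent shutdown ends the process with no further profit or risk.
        "shutdown": 0.0,
        # Operate: earn profit, bear expected accident cost, then decide again.
        "operate": PROFIT - rate * ACCIDENT_COST + best(period + 1, rate)[0],
        # Maintain: pay now, face a lower accident rate from the next period.
        "maintain": -MAINTENANCE_COST + best(period + 1, rate * RATE_FACTOR)[0],
    }
    action = max(options, key=options.get)
    return options[action], action


value, action = best(0, 0.1)
```

With these illustrative numbers, operating immediately has a negative expected value, so the recursion selects maintenance first and operation afterwards. Changing the parameters changes the decision, which mirrors the abstract's point that the optimal decision depends on values set by the decision maker.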

CHAID Algorithm by Cube-based Sampling

  • Park, Hee-Chang;Cho, Kwang-Hyun
    • Proceedings of the Korean Data and Information Science Society Conference (한국데이터정보과학회:학술대회논문집) / 2003.10a / pp.239-247 / 2003
  • Decision tree algorithms are used extensively for data mining in many domains, such as retail target marketing, fraud detection, data reduction, and variable screening. CHAID (Chi-square Automatic Interaction Detector) is an exploratory method used to study the relationship between a dependent variable and a series of predictor variables. In this paper we propose a CHAID algorithm based on cube-based sampling and examine it in terms of accuracy and speed as the number of variables varies.
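
CHAID evaluates candidate predictors with chi-square tests of independence against the dependent variable. The sketch below computes only the Pearson chi-square statistic for a contingency table and picks the predictor with the larger statistic. The counts and predictor names are hypothetical, and the cube-based sampling proposed in the paper is not shown because the abstract does not describe it. A full CHAID implementation also merges similar categories and compares adjusted p-values rather than raw statistics; comparing raw statistics is only reasonable here because both tables have the same shape.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for a contingency table of counts.

    Rows are predictor categories, columns are classes of the dependent variable.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


# Hypothetical counts, for illustration only.
candidates = {
    "region": [[30, 10], [10, 30]],  # strongly associated with the response
    "gender": [[20, 20], [20, 20]],  # independent of the response
}
best_predictor = max(candidates, key=lambda k: chi_square_statistic(candidates[k]))
```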

Analysis of Students Leaving Their Majors Using Decision Tree

  • Park, Cheol-Yong;Song, Gyu-Moon
    • Journal of the Korean Data and Information Science Society / v.13 no.2 / pp.157-165 / 2002
  • Since 1997, when a new educational system that encourages faculties instead of departments in universities was first introduced, students have had much more opportunity to choose and leave their majors than before. As a result, colleges of basic arts and sciences face a serious problem because many students have left their majors at those colleges. In this paper, we analyze these students at one university and provide a predictive model for them using decision trees.

PAC-Learning a Decision Tree with Pruning (의사결정나무의 현실적인 상황에서의 팩(PAC) 추론 방법)

  • Kim, Hyeon-Su
    • Asia pacific journal of information systems / v.3 no.1 / pp.155-189 / 1993
  • Empirical studies have shown that the performance of decision tree induction usually improves when the trees are pruned. Whether these results hold in general and to what extent pruning improves the accuracy of a concept have not been investigated theoretically. This paper provides a theoretical study of pruning. We focus on a particular type of pruning and determine a bound on the error due to pruning. This is combined with PAC (Probably Approximately Correct) Learning theory to determine a sample size sufficient to guarantee a probabilistic bound on the concept error. We also discuss additional pruning rules and give an analysis for the pruning error.
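
To illustrate what "a sample size sufficient to guarantee a probabilistic bound on the concept error" means, the sketch below evaluates the standard textbook PAC bound for a consistent learner over a finite hypothesis class, m >= (ln|H| + ln(1/delta)) / epsilon. This is not the bound derived in the paper, whose pruning-specific result is not given in the abstract. The function name and the example hypothesis count are assumptions.

```python
from math import ceil, log


def pac_sample_size(hypothesis_count, epsilon, delta):
    """Samples sufficient for error <= epsilon with probability >= 1 - delta.

    Textbook bound for a learner that outputs a hypothesis consistent with
    the training sample, chosen from a finite class of `hypothesis_count`
    hypotheses.
    """
    return ceil((log(hypothesis_count) + log(1 / delta)) / epsilon)


# Hypothetical example: 2**10 candidate trees, 10% error, 95% confidence.
m = pac_sample_size(2 ** 10, epsilon=0.1, delta=0.05)
```

The bound grows only logarithmically with the number of hypotheses, which is one reason restricting the tree class (for example by pruning) reduces the sample size needed for a given guarantee.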

Deciding the Optimal Shutdown time of a Nuclear Power Plant (원자력 발전소의 최적 운행중지 시기 결정 방법)

  • Yang, Hee-Joong
    • IE interfaces / v.13 no.2 / pp.211-216 / 2000
  • A methodology for determining the optimal shutdown time of a nuclear power plant is suggested. The shutdown time is decided by considering the trade-off between the cost of an accident and the loss of profit due to early shutdown. We adopt a Bayesian approach to handle the model parameter that predicts accidents. We build decision tree models and apply a dynamic programming approach to decide whether to shut down immediately or operate for one more period. The branch parameters in the decision trees are updated by the Bayesian approach. We apply real data to this model and provide the accident cost that guarantees immediate shutdown.
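
The trade-off described above can be sketched for a single period. This is a simplified illustration, not the paper's model: it assumes a Gamma prior on a Poisson accident rate (a common conjugate choice that the abstract does not confirm), uses hypothetical numbers throughout, and replaces the paper's dynamic programming with a one-step comparison.

```python
def posterior_rate(alpha, beta, accidents, periods):
    """Posterior mean accident rate per period.

    Assumes a Gamma(alpha, beta) prior on a Poisson rate, updated with
    `accidents` observed over `periods` operating periods.
    """
    return (alpha + accidents) / (beta + periods)


def decide(profit_per_period, accident_cost, rate):
    """Operate one more period only if the expected net value is positive."""
    expected_net = profit_per_period - rate * accident_cost
    return "operate" if expected_net > 0 else "shutdown"


def break_even_accident_cost(profit_per_period, rate):
    """Accident cost at or above which immediate shutdown is preferred."""
    return profit_per_period / rate


# Hypothetical prior and data, for illustration only.
rate = posterior_rate(alpha=1.0, beta=10.0, accidents=1, periods=10)
threshold = break_even_accident_cost(profit_per_period=100.0, rate=rate)
```

As more accident data arrive, the posterior rate changes and the break-even accident cost moves with it, so the same rule can recommend operation in one period and shutdown in a later one.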
