• Title/Summary/Keyword: SHAP(Shapley additive explanations)

Explainable Credit Default Prediction Using SHAP (SHAP을 이용한 설명 가능한 신용카드 연체 예측)

  • Minjoong Kim;Seungwoo Kim;Jihoon Moon
    • Proceedings of the Korean Society of Computer Information Conference / 2024.01a / pp.39-40 / 2024
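  • This study proposes a method for strengthening the interpretability of a machine learning model that predicts the likelihood of default among credit card users, using SHAP (SHapley Additive exPlanations). By analyzing large-scale credit card data, it aims to clarify how customer age, gender, marital status, payment history, and other factors affect the occurrence of default. Building on this study, financial institutions can lay the groundwork for more accurate risk management and for offering customers tailored services.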

A Study on the Prediction of Fuel Consumption of Bulk Ship Main Engine Using Explainable Artificial Intelligence (SHAP을 활용한 벌크선 메인엔진 연료 소모량 예측연구)

  • Hyun-Ju Kim;Min-Gyu Park;Ji-Hwan Lee
    • Journal of Navigation and Port Research / v.47 no.4 / pp.182-190 / 2023
  • This study proposes a predictive model using XGBoost and SHapley Additive exPlanations (SHAP) to estimate fuel consumption in bulk carriers. Previous studies have also used ship engine data and weather data, but their predictions lacked reliability and they did not explain the variables used in their fuel consumption models. To address these limitations, this study developed a predictive model using XGBoost and SHAP. The paper presents the research background, scope, relevant regulations, previous studies, and research methodology. It also explains the data cleaning method for bulk carrier data and verifies the results of the predictive model.

Experimental Analysis of Bankruptcy Prediction with SHAP framework on Polish Companies

  • Tuguldur Enkhtuya;Dae-Ki Kang
    • International Journal of Advanced Smart Convergence / v.12 no.1 / pp.53-58 / 2023
  • With the rapid development of artificial intelligence, users increasingly demand explanations of algorithmic results and want to know which parameters influence them. In this paper, we propose an interpretable model for bankruptcy prediction using the SHAP framework. SHAP (SHapley Additive exPlanations) is a framework that produces visualizations for explaining and interpreting machine learning models. With it, we can describe which features are important for the output of our deep learning model. The SHAP force plot shows the top features that contribute most to the overall model score. Even though fully connected neural networks (FCNNs) are "black box" models, Shapley values help alleviate the "black box" problem. FCNNs perform well on a complex dataset with more than 60 financial ratios. Combined with the SHAP framework, we create an effective model with an understandable interpretation. Because bankruptcy is a rare event, we address the imbalanced dataset problem with SMOTE. SMOTE is an oversampling technique that generates synthetic samples for the minority class. It uses the k-nearest neighbors algorithm and creates new examples along the line segments connecting a minority sample to its neighbors. We expect our results to assist financial analysts who are interested in forecasting company bankruptcy in detail.
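
The SMOTE step mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function name `smote_sketch`, its parameters, and the toy points are hypothetical, and it uses only the Python standard library rather than an established oversampling package. It assumes each minority sample is a list of numeric features and that a synthetic sample is drawn at a random position on the segment between a minority sample and one of its k nearest minority neighbors.

```python
import math
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority samples.

    Hypothetical helper for illustration only; not a library API.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbors than other points
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by Euclidean distance to the base sample.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbor = rng.choice(others[:k])  # pick one of the k nearest neighbors
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (q - b) for b, q in zip(base, neighbor)])
    return synthetic


if __name__ == "__main__":
    # Toy minority class: the four corners of the unit square (made-up data).
    corners = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for point in smote_sketch(corners, n_new=3, k=2):
        print(point)
```

Because every synthetic point is a convex combination of two existing minority samples, it stays inside the region the minority class already occupies. This is the sense in which examples are generated along connecting lines.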

A Study on the Remaining Useful Life Prediction Performance Variation based on Identification and Selection by using SHAP (SHAP를 활용한 중요변수 파악 및 선택에 따른 잔여유효수명 예측 성능 변동에 대한 연구)

  • Yoon, Yeon Ah;Lee, Seung Hoon;Kim, Yong Soo
    • Journal of Korean Society of Industrial and Systems Engineering / v.44 no.4 / pp.1-11 / 2021
  • Recently, the importance of preventive maintenance has grown because advances in artificial intelligence and sensor technology allow failures in complex systems to be detected automatically. Accordingly, prognostics and health management (PHM) is being actively studied, and predicting the remaining useful life (RUL) of a system has become one of its most important tasks. Much research has been conducted on predicting RUL. Deep learning models have been developed to improve prediction performance, but studies on identifying the importance of features have not been carried out. Extracting and interpreting the features that affect failures is meaningful, as is improving the predictive accuracy of RUL. In this paper, six popular deep learning models were employed to predict RUL, and the important variables for each model were identified through SHAP (Shapley Additive explanations), one of the explainable artificial intelligence (XAI) techniques. Moreover, the fluctuations and trends in prediction performance according to the number of variables were identified. This paper suggests that various deep learning models can be made explainable and demonstrates an application of XAI. The proposed method is also expected to show the possibility of using SHAP as a feature selection method.
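
The idea of ranking variables by their Shapley attributions and using that ranking for feature selection can be sketched on a toy problem. This is not the authors' pipeline: `shapley_values`, `rank_features`, and `toy_model` are hypothetical names. The sketch computes exact Shapley values by enumerating every feature subset, with "absent" features replaced by a baseline value. That is feasible only for a handful of features and differs from the approximations that practical SHAP tooling relies on.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model at x; absent features take baseline values."""
    n = len(x)

    def value(subset):
        # Evaluate the model with only the features in `subset` taken from x.
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        rest = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for s in combinations(rest, size):
                coalition = set(s)
                phi[i] += weight * (value(coalition | {i}) - value(coalition))
    return phi


def rank_features(model, rows, baseline):
    """Order feature indices by mean absolute Shapley value over rows."""
    n = len(baseline)
    totals = [0.0] * n
    for x in rows:
        for i, p in enumerate(shapley_values(model, x, baseline)):
            totals[i] += abs(p)
    importance = [t / len(rows) for t in totals]
    order = sorted(range(n), key=lambda i: importance[i], reverse=True)
    return order, importance


def toy_model(z):
    # Made-up model: feature 3 is ignored, features 1 and 2 interact.
    return 3 * z[0] + z[1] * z[2]


if __name__ == "__main__":
    rows = [[1, 2, 3, 5], [2, 1, 1, 7]]
    order, importance = rank_features(toy_model, rows, baseline=[0, 0, 0, 0])
    print("ranking:", order)
    print("mean |phi|:", importance)
```

For each row, the attributions sum to the difference between the model output at that row and at the baseline, which is the additivity property that gives SHAP its name. Keeping only the top-ranked indices and retraining is one simple way to study how performance changes with the number of variables.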

A Transformer-Based Emotion Classification Model Using Transfer Learning and SHAP Analysis (전이 학습 및 SHAP 분석을 활용한 트랜스포머 기반 감정 분류 모델)

  • Subeen Leem;Byeongcheon Lee;Insu Jeon;Jihoon Moon
    • Proceedings of the Korea Information Processing Society Conference / 2023.05a / pp.706-708 / 2023
  • In this study, we embark on a journey to uncover the essence of emotions by exploring the depths of transfer learning on three pre-trained transformer models. Our quest to classify five emotions culminates in discovering the KLUE (Korean Language Understanding Evaluation)-BERT (Bidirectional Encoder Representations from Transformers) model, which is the most exceptional among its peers. Our analysis of F1 scores attests to its superior learning and generalization abilities on the experimental data. To delve deeper into the mystery behind its success, we employ the powerful SHAP (Shapley Additive Explanations) method to unravel the intricacies of the KLUE-BERT model. The findings of our investigation are presented with a mesmerizing text plot visualization, which serves as a window into the model's soul. This approach enables us to grasp the impact of individual tokens on emotion classification and provides irrefutable, visually appealing evidence to support the predictions of the KLUE-BERT model.

Socio-economic Indicators Based Relative Comparison Methodology of National Occupational Accident Fatality Rates Using Machine Learning (머신러닝을 활용한 사회 · 경제지표 기반 산재 사고사망률 상대비교 방법론)

  • Kyunghun, Kim;Sudong, Lee
    • Journal of the Korea Safety Management & Science / v.24 no.4 / pp.41-47 / 2022
  • A reliable prediction model of the national occupational accident fatality rate can be used to evaluate the level of safety and health protection for workers in a country. Moreover, the socio-economic aspects of occupational accidents can be identified by interpreting a well-organized prediction model. In this paper, we propose a machine learning based relative comparison method to predict and interpret the national occupational accident fatality rate from socio-economic indicators. First, we collected 29 years of relevant data from 11 developed countries. Second, we applied four types of machine learning regression models and evaluated their performance. Third, we interpreted the contribution of each input variable using Shapley Additive Explanations (SHAP). As a result, the Gradient Boosting Regressor showed the best predictive performance. We found that the patterns relating socio-economic variables to the occupational accident fatality rate differ across countries.

A Framework for Early Detection and Interpretation of Concept Drift (컨셉 드리프트를 고려한 조기탐지 및 해석 프레임워크)

  • Min-Jung Kang;Su-Bin Oh;Sang-Min Lee
    • Proceedings of the Korea Information Processing Society Conference / 2023.11a / pp.701-704 / 2023
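  • This study proposes a framework for early detection of the point at which available production capacity degrades in the semiconductor manufacturing process. To this end, an online learning approach was used so that optimal performance can be maintained without retraining the model in an environment where data patterns fluctuate irregularly and often. The stationarity of the data is tested with the Augmented Dickey-Fuller test, and the learning model is continuously updated when the data change. In particular, the upper-limit work-in-process (WIP) inventory is a key indicator directly tied to production volume, so it is important to identify the main causal variables at the points where it is predicted to be low. Therefore, Shapley additive explanations (SHAP) was applied to the proposed method, which showed the best accuracy and efficiency among the compared models, to analyze the variables responsible when production declines.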

RDP-based Lateral Movement Detection using PageRank and Interpretable System using SHAP (PageRank 특징을 활용한 RDP기반 내부전파경로 탐지 및 SHAP를 이용한 설명가능한 시스템)

  • Yun, Jiyoung;Kim, Dong-Wook;Shin, Gun-Yoon;Kim, Sang-Soo;Han, Myung-Mook
    • Journal of Internet Computing and Services / v.22 no.4 / pp.1-11 / 2021
  • As the Internet developed, varied and complex cyber attacks began to emerge. Various detection systems were deployed at the network perimeter to defend against attacks, but systems and studies for detecting attackers already inside the network were remarkably rare, which left such attackers undetected. To solve this problem, studies on lateral movement detection systems that track and detect an attacker's movements have begun to emerge. In particular, the method based on the Remote Desktop Protocol (RDP) is simple but shows very good results. Nevertheless, previous studies did not consider the effects of and relationships between the logon hosts themselves, and the features they presented gave very poor results in some models. In addition, the models could not explain why they made a given prediction, which raised concerns about their reliability and robustness. To address these problems, this study proposes an interpretable RDP-based lateral movement detection system using the PageRank algorithm and SHAP (Shapley Additive Explanations). Using the PageRank algorithm and various statistical techniques, we create features that can be used in various models, and we explain model predictions using SHAP. In this study, we generated features that perform better than those of previous studies in most models and explained them using SHAP.
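
A PageRank score per host, of the kind the abstract above describes as a detection feature, can be sketched with plain power iteration. This is an illustration under assumptions, not the authors' feature pipeline: the function name `pagerank`, the host names, and the representation of RDP logons as `(source_host, destination_host)` pairs are all hypothetical.

```python
def pagerank(edges, damping=0.85, iters=100, tol=1e-10):
    """Power-iteration PageRank over a directed graph given as (src, dst) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iters):
        # Rank held by hosts with no outgoing logons is spread uniformly.
        dangling = sum(rank[n] for n in nodes if not out[n])
        new = {n: (1.0 - damping + damping * dangling) / len(nodes) for n in nodes}
        for src in nodes:
            for dst in out[src]:
                # Each host passes its rank equally along its outgoing logons.
                new[dst] += damping * rank[src] / len(out[src])
        delta = sum(abs(new[n] - rank[n]) for n in nodes)
        rank = new
        if delta < tol:
            break
    return rank


if __name__ == "__main__":
    # Made-up logon graph: three workstations log on to a server,
    # which in turn logs on to a domain controller.
    logons = [("ws1", "srv"), ("ws2", "srv"), ("ws3", "srv"), ("srv", "dc")]
    for host, score in sorted(pagerank(logons).items()):
        print(host, round(score, 4))
```

The resulting score could be attached to each logon event as a numeric feature for a downstream classifier. Which graph is built and how scores are combined with the statistical features is specific to the paper and is not reproduced here.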

Development of ensemble machine learning models for evaluating seismic demands of steel moment frames

  • Nguyen, Hoang D.;Kim, JunHee;Shin, Myoungsu
    • Steel and Composite Structures / v.44 no.1 / pp.49-63 / 2022
  • This study aims to develop ensemble machine learning (ML) models for estimating the peak floor acceleration and maximum top drift of steel moment frames. For this purpose, random forest, adaptive boosting, gradient boosting regression tree (GBRT), and extreme gradient boosting (XGBoost) models were considered. A total of 621 steel moment frames were analyzed under 240 ground motions using OpenSees software to generate the dataset for the ML models. The GBRT and XGBoost models exhibited the highest performance for predicting peak floor acceleration and maximum top drift, respectively. The significance of each input variable for the prediction was examined using the best-performing models and the Shapley additive explanations (SHAP) approach. The peak ground acceleration had the most significant impact on the peak floor acceleration prediction, while the spectral accelerations at 1 and 2 s had the greatest influence on the maximum top drift prediction. Finally, a graphical user interface module was created as a first step toward applying ML to estimate the seismic demands of building structures in practical design.

The Prediction of Cryptocurrency Prices Using eXplainable Artificial Intelligence based on Deep Learning (설명 가능한 인공지능과 CNN을 활용한 암호화폐 가격 등락 예측모형)

  • Taeho Hong;Jonggwan Won;Eunmi Kim;Minsu Kim
    • Journal of Intelligence and Information Systems / v.29 no.2 / pp.129-148 / 2023
  • Bitcoin is a blockchain technology-based digital currency that has been recognized as a representative cryptocurrency and a financial investment asset. Due to its highly volatile nature, Bitcoin has gained a lot of attention from investors and the public. Based on this popularity, numerous studies have been conducted on price and trend prediction using machine learning and deep learning. This study employed LSTM (Long Short-Term Memory) and CNN (Convolutional Neural Networks), which have shown potential for predictive performance in the finance domain, to enhance the classification accuracy of Bitcoin price trend prediction. XAI (eXplainable Artificial Intelligence) techniques were applied to the predictive model to enhance its explainability and interpretability by providing a comprehensive explanation of the model. In the empirical experiment, CNN was applied to technical indicators and Google Trends data to build a Bitcoin price trend prediction model, and the CNN model using both technical indicators and Google Trends data clearly outperformed the other models using neural networks, SVM, and LSTM. SHAP (Shapley Additive exPlanations) was then applied to the predictive model to obtain explanations of the output values. Important prediction drivers among the input variables were extracted through global interpretation, and an interpretation of the predictive model's decision process for each instance was provided through local interpretation. The results show that the proposed research framework achieves both improved classification accuracy and explainability by using CNN, Google Trends data, and SHAP.