• Title/Summary/Keyword: Gradient-based Explanation

A Gradient-Based Explanation Method for Node Classification Using Graph Convolutional Networks

  • Chaehyeon Kim;Hyewon Ryu;Ki Yong Lee
    • Journal of Information Processing Systems / v.19 no.6 / pp.803-816 / 2023
  • Explainable artificial intelligence is a method that explains how a complex model (e.g., a deep neural network) yields its output from a given input. Recently, graph-type data have been widely used in various fields, and diverse graph neural networks (GNNs) have been developed for graph-type data. However, methods to explain the behavior of GNNs have not been studied much, and only a limited understanding of GNNs is currently available. Therefore, in this paper, we propose an explanation method for node classification using graph convolutional networks (GCNs), which is a representative type of GNN. The proposed method finds out which features of each node have the greatest influence on the classification of that node using GCN. The proposed method identifies influential features by backtracking the layers of the GCN from the output layer to the input layer using the gradients. The experimental results on both synthetic and real datasets demonstrate that the proposed explanation method accurately identifies the features of each node that have the greatest influence on its classification.
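The idea of backtracking from the output layer to the input layer with gradients can be illustrated with a minimal sketch. This is not the authors' method: it is plain input-gradient saliency on a hypothetical two-layer GCN, and the function names, the toy graph, and the random weights are all assumptions made for illustration.

```python
import numpy as np

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}, the usual GCN propagation matrix."""
    a_hat = adj + np.eye(adj.shape[0])          # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(a_norm, x, w1, w2):
    """Two-layer GCN: logits = A relu(A X W1) W2. Returns (pre-activation, logits)."""
    z1 = a_norm @ x @ w1
    h1 = np.maximum(z1, 0.0)
    logits = a_norm @ h1 @ w2
    return z1, logits

def input_gradient(a_norm, x, w1, w2, node, cls):
    """Gradient of logits[node, cls] with respect to every entry of x."""
    z1, _ = gcn_forward(a_norm, x, w1, w2)
    # Output layer: d logits[node, cls] / d h1[j, k] = a_norm[node, j] * w2[k, cls]
    grad_h1 = np.outer(a_norm[node], w2[:, cls])
    # ReLU passes the gradient only where the pre-activation was positive
    grad_z1 = grad_h1 * (z1 > 0)
    # Input layer: z1 = a_norm @ x @ w1, so grad_x = a_norm^T @ grad_z1 @ w1^T
    return a_norm.T @ grad_z1 @ w1.T

# Toy example (hypothetical data): a 3-node path graph with 4 features per node.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
a_norm = normalize_adjacency(adj)
x = rng.normal(size=(3, 4))
w1 = rng.normal(size=(4, 5))
w2 = rng.normal(size=(5, 2))

grad = input_gradient(a_norm, x, w1, w2, node=1, cls=0)
# Rank node 1's own features by absolute gradient magnitude (largest first).
print(np.argsort(-np.abs(grad[1])))
```

Row `i` of the returned matrix holds the sensitivity of the chosen logit to node `i`'s features, so the same call also shows how much neighbouring nodes contribute through the graph convolution.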

A Gradient-Based Explanation Method for Graph Convolutional Neural Networks (그래프 합성곱 신경망에 대한 기울기(Gradient) 기반 설명 기법)

  • Kim, Chaehyeon;Lee, Ki Yong
    • Proceedings of the Korea Information Processing Society Conference / 2022.05a / pp.670-673 / 2022
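  • Explainable artificial intelligence is a technology that helps users understand a trained model by explaining how a complex model, such as a deep learning model, arrived at its result. Recently, graph-structured data have been generated in many fields, and various graph neural networks are used to classify them. In this paper, we propose an explanation method for the graph convolutional network (GCN), a representative graph neural network. When each node of a given graph is classified using a GCN, the proposed method reports numerically which features of each node had the greatest influence on the classification. The proposed method identifies which features of each node played an important role in the classification by tracing, step by step through gradients, the factors that influenced the final classification result. Experiments on synthetic data confirmed that the proposed method accurately finds the node features that have the greatest influence on classification.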

Optimized machine learning algorithms for predicting the punching shear capacity of RC flat slabs

  • Huajun Yan;Nan Xie;Dandan Shen
    • Advances in concrete construction / v.17 no.1 / pp.27-36 / 2024
  • Reinforced concrete (RC) flat slabs should be designed based on punching shear strength. In this study, machine learning (ML) algorithms were developed to accurately predict the punching shear strength of RC flat slabs without shear reinforcement. The approach combines Bayesian optimization (BO) with four standard algorithms (support vector regression, decision trees, random forests, and extreme gradient boosting) on 446 datasets that contain six design parameters. Furthermore, feature importance is analyzed with Shapley additive explanation (SHAP) to quantify the effect of the design parameters on punching shear strength. According to the results, the BO method produces high prediction accuracy by selecting the optimal hyperparameters for each model. With $R^2$ = 0.985, MAE = 0.0155 MN, and RMSE = 0.0244 MN, the BO-XGBoost model performed better than the original XGBoost model, which had $R^2$ = 0.917, MAE = 0.064 MN, and RMSE = 0.121 MN on the total dataset. Additionally, recommendations are provided on how to select factors that influence the punching shear resistance of RC flat slabs without shear reinforcement.
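To make the Shapley-value idea behind SHAP concrete, here is a minimal sketch that computes exact Shapley values by enumerating feature subsets. It does not use the `shap` library or the paper's models: `shapley_values`, the baseline-replacement value function, and the toy linear model are assumptions for illustration, and the enumeration is only practical for a handful of features (six design parameters would be feasible).

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to a baseline input.

    Features outside a coalition are set to their baseline value
    (an illustrative choice of value function).
    """
    n = len(x)

    def value(subset):
        z = list(baseline)
        for i in subset:
            z[i] = x[i]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                phi[i] += weight * (value(subset + (i,)) - value(subset))
    return phi

# Hypothetical linear model: each feature's contribution is weight * (x - baseline).
def linear_model(z):
    return 2.0 * z[0] + 3.0 * z[1] - 1.0 * z[2]

print(shapley_values(linear_model, [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]))
```

The contributions always sum to the difference between the prediction at `x` and the prediction at the baseline, which is the additivity property that SHAP plots rely on.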

Visual Explanation of a Deep Learning Solar Flare Forecast Model and Its Relationship to Physical Parameters

  • Yi, Kangwoo;Moon, Yong-Jae;Lim, Daye;Park, Eunsu;Lee, Harim
    • The Bulletin of The Korean Astronomical Society / v.46 no.1 / pp.42.1-42.1 / 2021
  • In this study, we present a visual explanation of a deep learning solar flare forecast model and its relationship to physical parameters of solar active regions (ARs). For this, we use full-disk magnetograms at 00:00 UT from the Solar and Heliospheric Observatory/Michelson Doppler Imager and the Solar Dynamics Observatory/Helioseismic and Magnetic Imager, physical parameters from the Space-weather HMI Active Region Patch (SHARP), and Geostationary Operational Environmental Satellite X-ray flare data. Our deep learning flare forecast model based on the Convolutional Neural Network (CNN) predicts "Yes" or "No" for the daily occurrence of C-, M-, and X-class flares. We interpret the model using two CNN attribution methods (guided backpropagation and Gradient-weighted Class Activation Mapping [Grad-CAM]) that provide quantitative information on explaining the model. We find that our deep learning flare forecasting model is intimately related to AR physical properties that have also been distinguished in previous studies as holding significant predictive ability. Major results of this study are as follows. First, we successfully apply our deep learning models to the forecast of daily solar flare occurrence with TSS = 0.65, without any preprocessing to extract features from data. Second, using the attribution methods, we find that the polarity inversion line is an important feature for the deep learning flare forecasting model. Third, the ARs with high Grad-CAM values produce more flares than those with low Grad-CAM values. Fourth, nine SHARP parameters such as total unsigned vertical current, total unsigned current helicity, total unsigned flux, and total photospheric magnetic free energy density are well correlated with Grad-CAM values.

A Comparative Analysis of Ensemble Learning-Based Classification Models for Explainable Term Deposit Subscription Forecasting (설명 가능한 정기예금 가입 여부 예측을 위한 앙상블 학습 기반 분류 모델들의 비교 분석)

  • Shin, Zian;Moon, Jihoon;Rho, Seungmin
    • The Journal of Society for e-Business Studies / v.26 no.3 / pp.97-117 / 2021
  • Predicting term deposit subscriptions is a representative financial marketing task in banks, and banks can build a prediction model using various customer information. Many studies have applied machine learning techniques to improve the classification accuracy for term deposit subscriptions. However, even if these models achieve satisfactory performance, using them in industry is not easy when their decision-making process is not adequately explained. To address this issue, this paper proposes an explainable scheme for term deposit subscription forecasting. For this, we first construct several classification models using decision tree-based ensemble learning methods, which yield excellent performance on tabular data, such as random forest, gradient boosting machine (GBM), extreme gradient boosting (XGB), and light gradient boosting machine (LightGBM). We then analyze their classification performance in depth through 10-fold cross-validation. After that, we provide the rationale for interpreting the influence of customer information and the decision-making process by applying Shapley additive explanation (SHAP), an explainable artificial intelligence technique, to the best classification model. To verify the practicality and validity of our scheme, experiments were conducted with the bank marketing dataset provided by Kaggle; we applied SHAP to the GBM and LightGBM models, respectively, according to different dataset configurations and then performed their analysis and visualization for explainable term deposit subscriptions.

On differentiation of multi-variable functions (다변수 미분에 관하여)

  • Pak, Hee-Chul;Park, Young-Ja
    • Journal for History of Mathematics / v.21 no.2 / pp.81-90 / 2008
  • The importance of mathematical education, particularly multi-variable calculus at the undergraduate level, has grown with the remarkable progress of sciences that require mathematical analysis. However, there has been little variety in how the definition of differentiation of multi-variable functions is introduced; in fact, all existing approaches basically rely on the chain rule. Here we introduce a way of defining the differentiation of multi-variable functions geometrically, based upon our teaching experience. One of its merits is that it provides a geometric explanation of the differentiation of multi-variable functions, so that it conveys the meaning of differentiation better than the known methods.

Turbulent Flow over Thin Rectangular Riblets

  • El-Samni O. A.;Yoon Hyun Sik;Chun Ho Hwan
    • Journal of Mechanical Science and Technology / v.19 no.9 / pp.1801-1810 / 2005
  • The effect of longitudinal thin rectangular riblets aligned with the flow direction on turbulent channel flow has been investigated using direct numerical simulation. The thin riblets have been modeled using the immersed boundary method (IBM), in which the velocities at only one set of vertical nodes at the riblet positions are forced to zero. Different spacings, ranging between 11 and 43 wall units, have been simulated to find the optimum spacing corresponding to the maximum drag reduction while keeping the height/spacing ratio at 0.5. The Reynolds number based on the friction velocity $u_\tau$ and the channel half depth $\delta$ is set to 150. The flow is driven by a pressure gradient that is adjusted so that the mass flow rate is kept constant in all the simulations. This study shows a trend of the drag ratio across the different spacings similar to that of the experiments. It also gives an optimum spacing of around 17 wall units for maximum drag reduction, in agreement with experimental data. The mechanism by which drag increases or decreases is also explained.
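As a quick check of scale, spacings given in wall units can be converted to fractions of the channel half depth using the stated friction Reynolds number. The relations below are the standard wall-unit definitions rather than formulas quoted from the paper; $u_\tau$ denotes the friction velocity, and the symbols $\nu$ (kinematic viscosity), $s$ (riblet spacing), and $h$ (riblet height) are introduced here for illustration.

$$
Re_\tau = \frac{u_\tau \delta}{\nu} = 150, \qquad
s^+ = \frac{s\,u_\tau}{\nu} \;\Rightarrow\; \frac{s}{\delta} = \frac{s^+}{Re_\tau}
$$

$$
s^+ = 17:\quad \frac{s}{\delta} = \frac{17}{150} \approx 0.113, \qquad
h^+ = 0.5\,s^+ = 8.5
$$

By the same conversion, the simulated range of 11 to 43 wall units corresponds to $s/\delta$ between roughly 0.073 and 0.287.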

Comparison of Four Different Ordination Methods for Patterning Water Quality of Agricultural Reservoirs

  • Bae, Mi-Jung;Kwon, Yong-Su;Hwang, Soon-Jin;Park, Young-Seuk
    • Korean Journal of Ecology and Environment / v.41 no.spc / pp.1-10 / 2008
  • We patterned the water quality of agricultural reservoirs according to differences in six physico-chemical environmental factors (TN, TP, DO, BOD, COD, and SS) using four different ordination methods: Principal Components Analysis (PCA), Detrended Correspondence Analysis (DCA), Nonmetric Multidimensional Scaling (NMS), and Isometric Feature Mapping (Isomap). The data set was obtained from the water quality monitoring networks operated by the Ministry of Agriculture and Forestry and the Ministry of Environment. Chlorophyll-$a$ displayed the highest correlation with COD, followed by TP, BOD, SS, and TN (p<0.01), and was negatively correlated with altitude and bank height of the reservoirs (p<0.01). Although the four ordination methods patterned the reservoirs similarly according to the gradient of nutrient concentration, PCA and NMS appeared to be the most efficient methods for patterning the water quality of reservoirs based on explanatory power. Considering variable scores in the ordination map, the concentration of nutrients was positively correlated with Chl-$a$ and negatively correlated with altitude and bank height. These ordination methods may help to pattern agricultural reservoirs according to their water quality characteristics.

Analyze weeds classification with visual explanation based on Convolutional Neural Networks

  • Vo, Hoang-Trong;Yu, Gwang-Hyun;Nguyen, Huy-Toan;Lee, Ju-Hwan;Dang, Thanh-Vu;Kim, Jin-Young
    • Smart Media Journal / v.8 no.3 / pp.31-40 / 2019
  • To understand how a Convolutional Neural Network (CNN) model captures the features of a pattern to determine which class it belongs to, in this paper we use Gradient-weighted Class Activation Mapping (Grad-CAM) to visualize and analyze how well a CNN model behaves on the CNU weeds dataset. We apply this technique to a ResNet model and determine which features the model captures to decide on a specific class, what makes the model classify correctly or incorrectly, and how wrongly labeled images can negatively affect a CNN model during training. In the experiment, Grad-CAM highlights the important regions of weeds, depending on the patterns learned by ResNet, such as the lobe and limb of devil's beggarticks (미국가막사리, Bidens frondosa) or the entire leaf surface of giant ragweed (단풍잎돼지풀, Ambrosia trifida). In addition, Grad-CAM shows that a CNN model can localize the object even though it is trained only for the classification problem.
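The core Grad-CAM computation can be sketched in a few lines. This is a generic illustration of the published Grad-CAM formula, not the authors' code: it assumes the feature maps of the last convolutional layer and the gradients of the class score with respect to them have already been extracted from a network, and the function name and toy arrays are hypothetical.

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM heatmap for one image and one class.

    activations: (K, H, W) feature maps of a convolutional layer.
    gradients:   (K, H, W) gradients of the class score w.r.t. those maps.
    Returns an (H, W) map scaled to [0, 1] (all zeros if nothing is positive).
    """
    # Channel weights: global average pooling of the gradients
    weights = gradients.mean(axis=(1, 2))
    # Weighted sum of feature maps over channels, giving an (H, W) map
    cam = np.tensordot(weights, activations, axes=1)
    # Keep only regions with a positive influence on the class score
    cam = np.maximum(cam, 0.0)
    peak = cam.max()
    return cam / peak if peak > 0 else cam

# Toy input (hypothetical values): two 2x2 feature maps with opposite-sign gradients.
activations = np.array([[[1.0, 2.0], [3.0, 4.0]],
                        [[4.0, 3.0], [2.0, 1.0]]])
gradients = np.array([[[1.0, 1.0], [1.0, 1.0]],
                      [[-1.0, -1.0], [-1.0, -1.0]]])
print(grad_cam(activations, gradients))
```

In practice the resulting map is upsampled to the input image size and overlaid on the image, which is how highlighted regions such as leaf lobes become visible.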

Prediction of patent lifespan and analysis of influencing factors using machine learning (기계학습을 활용한 특허수명 예측 및 영향요인 분석)

  • Kim, Yongwoo;Kim, Min Gu;Kim, Young-Min
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.147-170 / 2022
  • Although the number of patents, one of the core outputs of technological innovation, continues to increase, the number of low-value patents has also increased sharply. Therefore, efficient evaluation of patents has become important. Estimation of patent lifespan, which represents the private value of a patent, has been studied for a long time, but in most cases it has relied on linear models. Even when machine learning methods were used, interpretation or explanation of the relationship between explanatory variables and patent lifespan was insufficient. In this study, patent lifespan (number of renewals) is predicted based on the idea that patent lifespan represents the value of the patent. For the research, 4,033,414 patents that were applied for between 1996 and 2017 and finally granted were collected from the USPTO (US Patent and Trademark Office). To predict patent lifespan, we use variables that reflect the characteristics of the patent, the patent owner, and the inventor. We build four different models (Ridge Regression, Random Forest, Feed Forward Neural Network, and Gradient Boosting Model) and perform hyperparameter tuning through 5-fold cross-validation. Then, the performance of the generated models is evaluated, and the relative importance of predictors is presented. In addition, based on the Gradient Boosting Model, which has excellent performance, an Accumulated Local Effects plot is presented to visualize the relationship between predictors and patent lifespan. Finally, we apply Kernel SHAP (SHapley Additive exPlanations) to present the evaluation rationale for individual patents, and discuss its applicability to patent evaluation systems. This study is academically meaningful in that it contributes cumulatively to existing studies that estimate patent lifespan and supplements the limitations of linear models. It is also practically meaningful in that it suggests a method for deriving the evaluation basis for individual patent value and examines its applicability to patent evaluation systems.