• Title/Summary/Keyword: XAI

Search Result 87, Processing Time 0.024 seconds

Prediction of Key Variables Affecting NBA Playoffs Advancement: Focusing on 3 Points and Turnover Features (미국 프로농구(NBA)의 플레이오프 진출에 영향을 미치는 주요 변수 예측: 3점과 턴오버 속성을 중심으로)

  • An, Sehwan;Kim, Youngmin
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.1
    • /
    • pp.263-286
    • /
    • 2022
  • This study acquires NBA statistical information for a total of 32 years from 1990 to 2022 using web crawling, observes variables of interest through exploratory data analysis, and generates related derived variables. Unused variables were removed through a purification process on the input data, and correlation analysis, t-test, and ANOVA were performed on the remaining variables. For the variable of interest, the difference in the mean between the groups that advanced to the playoffs and did not advance to the playoffs was tested, and then to compensate for this, the average difference between the three groups (higher/middle/lower) based on ranking was reconfirmed. Of the input data, only this year's season data was used as a test set, and 5-fold cross-validation was performed by dividing the training set and the validation set for model training. The overfitting problem was solved by comparing the cross-validation result and the final analysis result using the test set to confirm that there was no difference in the performance matrix. Because the quality level of the raw data is high and the statistical assumptions are satisfied, most of the models showed good results despite the small data set. This study not only predicts NBA game results or classifies whether or not to advance to the playoffs using machine learning, but also examines whether the variables of interest are included in the major variables with high importance by understanding the importance of input attribute. Through the visualization of SHAP value, it was possible to overcome the limitation that could not be interpreted only with the result of feature importance, and to compensate for the lack of consistency in the importance calculation in the process of entering/removing variables. It was found that a number of variables related to three points and errors classified as subjects of interest in this study were included in the major variables affecting advancing to the playoffs in the NBA. Although this study is similar in that it includes topics such as match results, playoffs, and championship predictions, which have been dealt with in the existing sports data analysis field, and comparatively analyzed several machine learning models for analysis, there is a difference in that the interest features are set in advance and statistically verified, so that it is compared with the machine learning analysis result. Also, it was differentiated from existing studies by presenting explanatory visualization results using SHAP, one of the XAI models.

Domain Knowledge Incorporated Counterfactual Example-Based Explanation for Bankruptcy Prediction Model (부도예측모형에서 도메인 지식을 통합한 반사실적 예시 기반 설명력 증진 방법)

  • Cho, Soo Hyun;Shin, Kyung-shik
    • Journal of Intelligence and Information Systems
    • /
    • v.28 no.2
    • /
    • pp.307-332
    • /
    • 2022
  • One of the most intensively conducted research areas in business application study is a bankruptcy prediction model, a representative classification problem related to loan lending, investment decision making, and profitability to financial institutions. Many research demonstrated outstanding performance for bankruptcy prediction models using artificial intelligence techniques. However, since most machine learning algorithms are "black-box," AI has been identified as a prominent research topic for providing users with an explanation. Although there are many different approaches for explanations, this study focuses on explaining a bankruptcy prediction model using a counterfactual example. Users can obtain desired output from the model by using a counterfactual-based explanation, which provides an alternative case. This study introduces a counterfactual generation technique based on a genetic algorithm (GA) that leverages both domain knowledge (i.e., causal feasibility) and feature importance from a black-box model along with other critical counterfactual variables, including proximity, distribution, and sparsity. The proposed method was evaluated quantitatively and qualitatively to measure the quality and the validity.

SIEM System Performance Enhancement Mechanism Using Active Model Improvement Feedback Technology (능동형 모델 개선 피드백 기술을 활용한 보안관제 시스템 성능 개선 방안)

  • Shin, Youn-Sup;Jo, In-June
    • The Journal of the Korea Contents Association
    • /
    • v.21 no.12
    • /
    • pp.896-905
    • /
    • 2021
  • In the field of SIEM(Security information and event management), many studies try to use a feedback system to solve lack of completeness of training data and false positives of new attack events that occur in the actual operation. However, the current feedback system requires too much human inputs to improve the running model and even so, those feedback from inexperienced analysts can affect the model performance negatively. Therefore, we propose "active model improving feedback technology" to solve the shortage of security analyst manpower, increasing false positive rates and degrading model performance. First, we cluster similar predicted events during the operation, calculate feedback priorities for those clusters and select and provide representative events from those highly prioritized clusters using XAI (eXplainable AI)-based event visualization. Once these events are feedbacked, we exclude less analogous events and then propagate the feedback throughout the clusters. Finally, these events are incrementally trained by an existing model. To verify the effectiveness of our proposal, we compared three distinct scenarios using PKDD2007 and CSIC2012. As a result, our proposal confirmed a 30% higher performance in all indicators compared to that of the model with no feedback and the current feedback system.

Trustworthy AI Framework for Malware Response (악성코드 대응을 위한 신뢰할 수 있는 AI 프레임워크)

  • Shin, Kyounga;Lee, Yunho;Bae, ByeongJu;Lee, Soohang;Hong, Heeju;Choi, Youngjin;Lee, Sangjin
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.32 no.5
    • /
    • pp.1019-1034
    • /
    • 2022
  • Malware attacks become more prevalent in the hyper-connected society of the 4th industrial revolution. To respond to such malware, automation of malware detection using artificial intelligence technology is attracting attention as a new alternative. However, using artificial intelligence without collateral for its reliability poses greater risks and side effects. The EU and the United States are seeking ways to secure the reliability of artificial intelligence, and the government announced a reliable strategy for realizing artificial intelligence in 2021. The government's AI reliability has five attributes: Safety, Explainability, Transparency, Robustness and Fairness. We develop four elements of safety, explainable, transparent, and fairness, excluding robustness in the malware detection model. In particular, we demonstrated stable generalization performance, which is model accuracy, through the verification of external agencies, and developed focusing on explainability including transparency. The artificial intelligence model, of which learning is determined by changing data, requires life cycle management. As a result, demand for the MLops framework is increasing, which integrates data, model development, and service operations. EXE-executable malware and documented malware response services become data collector as well as service operation at the same time, and connect with data pipelines which obtain information for labeling and purification through external APIs. We have facilitated other security service associations or infrastructure scaling using cloud SaaS and standard APIs.

Deep Neural Network Analysis System by Visualizing Accumulated Weight Changes (누적 가중치 변화의 시각화를 통한 심층 신경망 분석시스템)

  • Taelin Yang;Jinho Park
    • Journal of the Korea Computer Graphics Society
    • /
    • v.29 no.3
    • /
    • pp.85-92
    • /
    • 2023
  • Recently, interest in artificial intelligence has increased due to the development of artificial intelligence fields such as ChatGPT and self-driving cars. However, there are still many unknown elements in training process of artificial intelligence, so that optimizing the model requires more time and effort than it needs. Therefore, there is a need for a tool or methodology that can analyze the weight changes during the training process of artificial intelligence and help out understatnding those changes. In this research, I propose a visualization system which helps people to understand the accumulated weight changes. The system calculates the weights for each training period to accumulates weight changes and stores accumulated weight changes to plot them in 3D space. This research will allow us to explore different aspect of artificial intelligence learning process, such as understanding how the model get trained and providing us an indicator on which hyperparameters should be changed for better performance. These attempts are expected to explore better in artificial intelligence learning process that is still considered as unknown and contribute to the development and application of artificial intelligence models.

The Prediction of the Helpfulness of Online Review Based on Review Content Using an Explainable Graph Neural Network (설명가능한 그래프 신경망을 활용한 리뷰 콘텐츠 기반의 유용성 예측모형)

  • Eunmi Kim;Yao Ziyan;Taeho Hong
    • Journal of Intelligence and Information Systems
    • /
    • v.29 no.4
    • /
    • pp.309-323
    • /
    • 2023
  • As the role of online reviews has become increasingly crucial, numerous studies have been conducted to utilize helpful reviews. Helpful reviews, perceived by customers, have been verified in various research studies to be influenced by factors such as ratings, review length, review content, and so on. The determination of a review's helpfulness is generally based on the number of 'helpful' votes from consumers, with more 'helpful' votes considered to have a more significant impact on consumers' purchasing decisions. However, recently written reviews that have not been exposed to many customers may have relatively few 'helpful' votes and may lack 'helpful' votes altogether due to a lack of participation. Therefore, rather than relying on the number of 'helpful' votes to assess the helpfulness of reviews, we aim to classify them based on review content. In addition, the text of the review emerges as the most influential factor in review helpfulness. This study employs text mining techniques, including topic modeling and sentiment analysis, to analyze the diverse impacts of content and emotions embedded in the review text. In this study, we propose a review helpfulness prediction model based on review content, utilizing movie reviews from IMDb, a global movie information site. We construct a review helpfulness prediction model by using an explainable Graph Neural Network (GNN), while addressing the interpretability limitations of the machine learning model. The explainable graph neural network is expected to provide more reliable information about helpful or non-helpful reviews as it can identify connections between reviews.

Domain Knowledge Incorporated Local Rule-based Explanation for ML-based Bankruptcy Prediction Model (머신러닝 기반 부도예측모형에서 로컬영역의 도메인 지식 통합 규칙 기반 설명 방법)

  • Soo Hyun Cho;Kyung-shik Shin
    • Information Systems Review
    • /
    • v.24 no.1
    • /
    • pp.105-123
    • /
    • 2022
  • Thanks to the remarkable success of Artificial Intelligence (A.I.) techniques, a new possibility for its application on the real-world problem has begun. One of the prominent applications is the bankruptcy prediction model as it is often used as a basic knowledge base for credit scoring models in the financial industry. As a result, there has been extensive research on how to improve the prediction accuracy of the model. However, despite its impressive performance, it is difficult to implement machine learning (ML)-based models due to its intrinsic trait of obscurity, especially when the field requires or values an explanation about the result obtained by the model. The financial domain is one of the areas where explanation matters to stakeholders such as domain experts and customers. In this paper, we propose a novel approach to incorporate financial domain knowledge into local rule generation to provide explanations for the bankruptcy prediction model at instance level. The result shows the proposed method successfully selects and classifies the extracted rules based on the feasibility and information they convey to the users.