• Title/Abstract/Keywords: Explainable Machine Learning


Explainable Software Employment Model Development of University Graduates using Boosting Machine Learning and SHAP

  • 권준희;김성림
    • 디지털산업정보학회논문지 / Vol. 19, No. 3 / pp. 177-192 / 2023
  • The employment rate of university graduates has been decreasing significantly in recent years. With the advent of the Fourth Industrial Revolution, the demand for software employment has increased, so it is necessary to analyze the factors behind software employment of university graduates. This paper proposes an explainable software employment model for university graduates using machine learning and explainable AI. The Graduates Occupational Mobility Survey (GOMS) provided by the Korea Employment Information Service is used. The employment model uses boosting machine learning, and its performance is evaluated with four boosting algorithms. The factors affecting employment are then explained using SHAP. The results indicate that the top three factors are major, the semester in which the employment goal was set, and vocational education and training.
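
The abstract above attributes employment predictions to input factors with SHAP, which is built on Shapley values. As a rough illustration of the underlying idea only, the sketch below computes exact Shapley values for a tiny hand-written scoring function. It is not the authors' model and does not use the SHAP library; `toy_model`, the baseline input, and the three unnamed features are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one input x, relative to a baseline input.

    Features outside a coalition are set to their baseline value. This is
    one common convention and is an assumption of this sketch.
    """
    n = len(x)
    features = range(n)

    def value(subset):
        # Features in `subset` keep their real value; the rest use the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in features]
        return predict(mixed)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_model(v):
    # Hypothetical scoring function with one interaction term.
    return 2.0 * v[0] + 1.0 * v[1] + 0.5 * v[0] * v[2]


x = [1.0, 1.0, 1.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_model, x, baseline)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, and the interaction term is split evenly between the two features involved. Exact enumeration grows exponentially with the number of features, which is why practical tools rely on approximations or model-specific algorithms.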

Explainable AI Application for Machine Predictive Maintenance

  • 천강민;양재경
    • 산업경영시스템학회지 / Vol. 44, No. 4 / pp. 227-233 / 2021
  • Predictive maintenance is one of the important applications of data science: it builds a predictive model from large amounts of data collected on the equipment being managed. It does not predict equipment failure from just one or two signs, but quantifies and models numerous symptoms together with historical data on actual failures. Statistical methods were widely used for predictive maintenance in the past, but many machine learning-based methods have been proposed recently. These methods are preferable in that they show more accurate prediction performance. However, with the exception of some models such as decision tree-based models, it is very difficult to know the structure of a learning model explicitly (black-box model) and to explain to what extent particular attributes (features or variables) affected the prediction results. Explainable artificial intelligence (AI) has recently been proposed to overcome this problem. It is a methodology that makes it easy for users to understand and trust the results of machine learning-based models. In this paper, we propose an explainable AI method that further enhances the explanatory power of an existing learning model, targeting the previously proposed predictive model [5] that learned from data on a core facility (Hyper Compressor) of a domestic chemical plant that produces polyethylene. The ensemble prediction model, which is a black-box model, was converted to a white-box model using explainable AI. The proposed methodology explains the direction of control for the major features in the failure prediction results. Through this methodology, it is possible to flexibly adjust the timing of machine maintenance and the supply and demand of parts, and to improve the efficiency of facility operation through proper pre-control.

Study on predictive model and mechanism analysis for martensite transformation temperatures through explainable artificial intelligence

  • 전준협;손승배;정재길;이석재
    • 열처리공학회지 / Vol. 37, No. 3 / pp. 103-113 / 2024
  • Martensite volume fraction significantly affects the mechanical properties of alloy steels. The martensite start temperature (Ms), the transformation temperature for 50 vol.% martensite (M50), and the transformation temperature for 90 vol.% martensite (M90) are important transformation temperatures for controlling the martensite phase fraction. Several researchers have proposed empirical equations and machine learning models to predict the Ms temperature. These numerical approaches can easily predict the Ms temperature without additional experiments or cost. However, to control the martensite phase fraction more precisely, the prediction error of the Ms model must be reduced and prediction models are needed for the other martensite transformation temperatures (M50, M90). In the present study, machine learning models were applied to build predictive models for the Ms, M50, and M90 temperatures. Explainable artificial intelligence (XAI) is employed to explain the prediction mechanisms of the machine learning models and to determine the importance of features for the martensite transformation temperatures. Among the machine learning models compared, random forest regression (RFR) showed the best performance for predicting the Ms, M50, and M90 temperatures. The feature importance was derived and the prediction mechanisms were discussed using XAI.

Development of ensemble machine learning model considering the characteristics of input variables and the interpretation of model performance using explainable artificial intelligence

  • 박정수
    • 상하수도학회지 / Vol. 36, No. 4 / pp. 239-248 / 2022
  • The prediction of algal bloom is an important field of study in algal bloom management, and chlorophyll-a concentration (Chl-a) is commonly used to represent the status of algal bloom. In recent years, advanced machine learning algorithms have been increasingly used for the prediction of algal bloom. In this study, XGBoost (XGB), an ensemble machine learning algorithm, was used to develop a model to predict Chl-a in a reservoir. Daily observations of water quality data and climate data were used for the training and testing of the model. In the first step of the study, the input variables were clustered into two groups (low and high value groups) based on the observed values of water temperature (TEMP), total organic carbon concentration (TOC), total nitrogen concentration (TN), and total phosphorus concentration (TP). For each of the four water quality items, two XGB models were developed using only the data in each clustered group (Model 1). The results were compared to the prediction of an XGB model developed using the entire data before clustering (Model 2). The model performance was evaluated using three indices, including the root mean squared error-observation standard deviation ratio (RSR). Model 1 improved performance for TEMP, TN, and TP, with RSR values of 0.503, 0.477, and 0.493, respectively, while the RSR of Model 2 was 0.521. On the other hand, Model 2 performed better than Model 1 for TOC, where the RSR of Model 1 was 0.532. Explainable artificial intelligence (XAI) is an ongoing field of research in machine learning. Shapley value analysis, a novel XAI algorithm, was also used for the quantitative interpretation of the performance of the XGB model developed in this study.
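
The abstract above reports RSR values without defining the index. As a minimal sketch, assuming the commonly used definition (root mean squared error divided by the standard deviation of the observations), it can be computed as follows. The function name and the sample numbers are illustrative and are not taken from the paper.

```python
from math import sqrt


def rsr(observed, predicted):
    """RMSE-observation standard deviation ratio (lower is better).

    Assumes the common definition in which both numerator and denominator
    are divided by the same n, so the 1/n factors cancel.
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    squared_deviation = sum((o - mean_obs) ** 2 for o in observed)
    if squared_deviation == 0:
        raise ValueError("observations have zero variance")
    return sqrt(squared_error) / sqrt(squared_deviation)


# Illustrative values only (not data from the study).
obs = [1.0, 2.0, 3.0, 4.0]
perfect = rsr(obs, obs)                      # predictions equal observations
mean_only = rsr(obs, [2.5, 2.5, 2.5, 2.5])   # always predict the observed mean
```

Under this definition a perfect model scores 0 and a model that always predicts the observed mean scores 1, which gives a scale for reading values such as those reported in the abstract.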

The Enhancement of intrusion detection reliability using Explainable Artificial Intelligence(XAI)

  • 정일옥;최우빈;김수철
    • 융합보안논문지 / Vol. 22, No. 3 / pp. 101-110 / 2022
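  • As applications of artificial intelligence grow across many fields, attempts to solve various issues in intrusion detection with AI are also increasing. However, most machine learning approaches are black boxes whose predictions can be neither explained nor traced, which creates difficulties for the security experts who must rely on them. To address this problem, research on explainable AI (XAI), which helps interpret and understand machine learning decisions, is increasing in many fields. This paper therefore proposes an explainable AI approach to strengthen the reliability of machine learning-based intrusion detection predictions. First, an intrusion detection model is implemented with XGBoost, and explanations of the model are produced with SHAP. The results from conventional feature importance and from SHAP are then compared and analyzed to give security experts a trustworthy basis for their decisions. The experiments use the PKDD2007 dataset and analyze the relationship between conventional feature importance and SHAP values, verifying that SHAP-based explainable AI is a valid way to give security experts confidence in the predictions of an intrusion detection model.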

Understanding Interactive and Explainable Feedback for Supporting Non-Experts with Data Preparation for Building a Deep Learning Model

  • Kim, Yeonji;Lee, Kyungyeon;Oh, Uran
    • International journal of advanced smart convergence / Vol. 9, No. 2 / pp. 90-104 / 2020
  • It is difficult for non-experts to build machine learning (ML) models at a level that satisfies their needs. Deep learning models are even more challenging because it is unclear how to improve the model, and a trial-and-error approach is not feasible since training these models is time-consuming. To assist these novice users, we examined how interactive and explainable feedback while training a deep learning network can contribute to model performance and users' satisfaction, focusing on the data preparation process. We conducted a user study with 31 participants without expertise, in which they were asked to improve the accuracy of a deep learning model under varying feedback conditions. While no significant performance gain was observed, we identified potential barriers during the process and found that interactive and explainable feedback provide complementary benefits for improving users' understanding of ML. We conclude with implications for designing an interface for building ML models for novice users.

Explainable Machine Learning Based a Packed Red Blood Cell Transfusion Prediction and Evaluation for Major Internal Medical Condition

  • Lee, Seongbin;Lee, Seunghee;Chang, Duhyeuk;Song, Mi-Hwa;Kim, Jong-Yeup;Lee, Suehyun
    • Journal of Information Processing Systems / Vol. 18, No. 3 / pp. 302-310 / 2022
  • Efficient use of limited blood products is becoming very important in terms of socioeconomic status and patient recovery. To predict the appropriateness of patient-specific transfusions for intensive care unit (ICU) patients who require real-time monitoring, we evaluated a model that dynamically predicts the possibility of transfusion using the Medical Information Mart for Intensive Care III (MIMIC-III), an ICU admission record at Harvard Medical School. In this study, we developed an explainable machine learning model to predict the possibility of red blood cell transfusion for major medical diseases in the ICU. Target disease groups that received packed red blood cell transfusions at high frequency were selected, and 16,222 patients were finally extracted. The prediction model achieved an area under the ROC curve of 0.9070 and an F1-score of 0.8166 (LightGBM). To explain the performance of the machine learning model, feature importance analysis and a partial dependence plot were used. The results of our study can be used as basic data for recommendations related to the adequacy of blood transfusions and are expected to ultimately contribute to the recovery of patients and the prevention of excessive consumption of blood products.
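
The abstract above mentions a partial dependence plot as one of its explanation tools. As a hedged illustration of what such a plot summarizes, the sketch below computes a one-dimensional partial dependence curve by fixing one feature at each grid value and averaging the model's predictions over the data. `toy_risk_model`, the two unnamed features, and the sample rows are hypothetical and unrelated to the study's LightGBM model or to MIMIC-III.

```python
def partial_dependence(predict, rows, feature_index, grid):
    """Average prediction when one feature is forced to each grid value."""
    curve = []
    for grid_value in grid:
        total = 0.0
        for row in rows:
            modified = list(row)              # copy so the input rows stay unchanged
            modified[feature_index] = grid_value
            total += predict(modified)
        curve.append(total / len(rows))
    return curve


def toy_risk_model(row):
    # Hypothetical score: rises as feature 0 drops below 10,
    # plus a small additive effect from feature 1.
    return max(0.0, 10.0 - row[0]) * 0.1 + 0.05 * row[1]


# Illustrative rows only (not patient data).
rows = [[8.0, 1.0], [12.0, 0.0], [9.0, 2.0]]
grid = [6.0, 8.0, 10.0]
pd_curve = partial_dependence(toy_risk_model, rows, 0, grid)
```

Plotting `pd_curve` against `grid` gives the partial dependence plot for feature 0. Here the curve falls as the feature increases, which matches how the toy model was written.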

A Study on the Population Estimation of Small Areas using Explainable Machine Learning: Focused on the Busan Metropolitan City

  • 김유현;김동현
    • 한국지리정보학회지 / Vol. 26, No. 4 / pp. 97-115 / 2023
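  • With the population structure changing rapidly through low birth rates and aging, and with the unevenness of population distribution widening, population estimation methods need to change, and more accurate estimates are required at the small-area level. To respond to this need, this study applies an interpretable machine learning method to Busan Metropolitan City to estimate the 2040 population at a 500 m grid level. Comparing grid-level estimates from the interpretable machine learning method and from the cohort-component method showed that the machine learning method can reflect combinations of multiple variables likely to influence changes in population structure and therefore yields lower errors, confirming that it is well suited to areas with large population fluctuations, such as small areas. In an era of population decline, overestimated population figures are likely to cause problems in urban planning, such as inefficient investment and declining quality in other sectors due to overinvestment in particular sectors, while underestimated figures accelerate urban shrinkage and lower the quality of life; appropriate population estimation methods and alternatives therefore need to be prepared.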

Analysis of Regional Fertility Gap Factors Using Explainable Artificial Intelligence

  • 이동우;김미경;윤정윤;류동원;송재욱
    • 산업경영시스템학회지 / Vol. 47, No. 1 / pp. 41-50 / 2024
  • Korea is facing a significant problem with historically low fertility rates, which is becoming a major social issue affecting the economy, labor force, and national security. This study analyzes the factors contributing to the regional gap in fertility rates and derives policy implications. The government and local authorities are implementing a range of policies to address the issue of low fertility. To establish an effective strategy, it is essential to identify the primary factors that contribute to regional disparities. This study identifies these factors and explores policy implications through machine learning and explainable artificial intelligence. The study also examines the influence of media and public opinion on childbirth in Korea by incorporating news and online community sentiment, as well as sentiment fear indices, as independent variables. To establish the relationship between regional fertility rates and factors, the study employs four machine learning models: multiple linear regression, XGBoost, Random Forest, and Support Vector Regression. Support Vector Regression, XGBoost, and Random Forest significantly outperform linear regression, highlighting the importance of machine learning models in explaining non-linear relationships with numerous variables. A factor analysis using SHAP is then conducted. The unemployment rate, Regional Gross Domestic Product per Capita, Women's Participation in Economic Activities, Number of Crimes Committed, Average Age of First Marriage, and Private Education Expenses significantly impact regional fertility rates. However, the degree of impact of the factors affecting fertility may vary by region, suggesting the need for policies tailored to the characteristics of each region, not just an overall ranking of factors.

Trend of Utilization of Machine Learning Technology for Digital Healthcare Data Analysis

  • 우영춘;이성엽;최완;안창원;백옥기
    • 전자통신동향분석 / Vol. 34, No. 1 / pp. 98-110 / 2019
  • Machine learning has been applied to medical imaging and has shown an excellent recognition rate. Recently, there has been much interest in preventive medicine. If data are accessible, machine learning packages can be used easily in digital healthcare fields. However, it is necessary to prepare the data in advance, and model evaluation and tuning are required to construct a reliable model. On average, these processes take more than 80% of the total effort required. In this study, we describe the basic concepts of machine learning, pre-processing and visualization of datasets, feature engineering for reliable models, model evaluation and tuning, and the latest trends in popular machine learning frameworks. Finally, we survey an explainable machine learning analysis tool and discuss the future direction of machine learning.