• Title/Abstract/Keywords: Model interpretability

Search results: 53 items (processing time: 0.02 s)

A Study on Explainable Artificial Intelligence-based Sentimental Analysis System Model

  • Song, Mi-Hwa
    • International Journal of Internet, Broadcasting and Communication / Vol. 14, No. 1 / pp. 142-151 / 2022
  • In this paper, a model combined with explainable artificial intelligence (XAI) models is presented to secure the reliability of machine learning-based sentiment analysis and prediction. The applicability of the proposed model was tested and described using the IMDB dataset. This approach has the advantage that it can explain, from various perspectives, how the data affect the model's predictions. In applications of sentiment analysis such as recommendation systems, emotion analysis through facial expression recognition, and opinion analysis, the system can gain users' trust by presenting more specific, evidence-based analysis results.

A Research on Yield Prediction of Mixed Pastures in Korea via Model Construction in Stages

  • 오승민;김문주;팽경룬;이배훈;김지융;김병완;조무환;성경일
    • 한국초지조사료학회지 / Vol. 37, No. 1 / pp. 80-91 / 2017
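  • This study aimed to select a model with high explanatory power by starting from a yield prediction model for mixed pastures based on climatic factors and adding fertilization, seeding, and years-after-establishment factors in stages. The model was built in the following order: data collection (forage and weather data), processing, analysis, and model construction. Six yield prediction models were constructed from the climate, fertilization, seeding, and years-after-establishment factors, and the optimal model was chosen by reviewing explanatory power and consistency with forage production theory. As a result, Model VI, which considers climate, fertilization, seeding, and years after establishment (grouped), was selected (explanatory power = 53.8%). By factor, the explanatory power in Model VI was largest for climate (24.5%), followed by fertilization (17.8%), seeding (10.7%), and years after establishment (0.8%). However, the positive (+) correlation observed between dry matter yield and the number of summer depression days needs to be examined from the perspectives of region and accumulated variables. In addition, because fertilization and seeding rates are concentrated around particular values, further study of their optimal levels using a quadratic term is required.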
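As a quick arithmetic check on the figures reported in the abstract above, the four factor-wise values for Model VI add up to the stated overall explanatory power of 53.8%. The sketch below only re-adds those reported percentages and converts them to shares of the total; the dictionary keys are illustrative names, not identifiers from the paper.

```python
# Factor-wise explanatory power (%) for Model VI, as reported in the abstract.
# The key names are illustrative labels chosen for this sketch.
contribution = {
    "climate": 24.5,
    "fertilization": 17.8,
    "seeding": 10.7,
    "years_after_establishment": 0.8,
}

# Overall explanatory power is the sum of the factor-wise values.
total = round(sum(contribution.values()), 1)

# Share of the explained variation attributed to each factor (% of total).
shares = {name: round(100 * value / total, 1) for name, value in contribution.items()}

for name, share in shares.items():
    print(f"{name}: {contribution[name]}% of variation, {share}% of the explained part")
print(f"total explanatory power: {total}%")
```

Read this way, climate accounts for a little under half of what the model explains, while years after establishment contributes very little.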

Generalized Partially Linear Additive Models for Credit Scoring

  • Shim, Ju-Hyun;Lee, Young-K.
    • 응용통계연구 / Vol. 24, No. 4 / pp. 587-595 / 2011
  • Credit scoring is an objective and automatic system to assess the credit risk of each customer. The logistic regression model is one of the popular methods of credit scoring to predict the default probability; however, despite its advantages of interpretability and low computational cost, it may not detect possible nonlinear features of the predictors. In this paper, we propose to use a generalized partially linear model as an alternative to logistic regression. We also introduce modern ensemble techniques such as bagging, boosting and random forests. We compare these methods via a simulation study and illustrate them using a German credit dataset.
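The interpretability that the abstract attributes to logistic regression can be illustrated with a minimal sketch: the default probability is the logistic function of a linear score, and each coefficient exponentiates to an odds ratio. The intercept, coefficients, and feature values below are hypothetical and are not taken from the paper or from the German credit dataset.

```python
import math

def default_probability(features, coefficients, intercept):
    """Logistic model: P(default) = 1 / (1 + exp(-(b0 + sum(b_j * x_j))))."""
    score = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-score))

# Hypothetical fitted values for two standardized predictors (assumptions).
intercept = -2.0
coefficients = [0.8, -0.5]

# Predicted default probability for one hypothetical customer.
p = default_probability([1.0, 0.0], coefficients, intercept)

# exp(b_j) is the multiplicative change in the odds of default
# for a one-unit increase in predictor j, holding the others fixed.
odds_ratios = [math.exp(b) for b in coefficients]

print(f"P(default) = {p:.3f}")
print("odds ratios:", [round(r, 3) for r in odds_ratios])
```

Because the score is linear in each predictor, a model of this form cannot represent a nonlinear effect unless it is added by hand, which is the limitation the partially linear alternative targets.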

Regression Models for Haplotype-Based Association Studies

  • Oh, So-Hee;NamKung, Jung-Hyun;Park, Tae-Sung
    • Genomics & Informatics / Vol. 5, No. 1 / pp. 19-23 / 2007
  • In this paper, we provide an overview of statistical models for haplotype-based association studies and summarize their features based on the design matrix. We classify design matrices into two types: direct and indirect. For these two types of matrices, we present and compare their characteristics using a simple hypothetical example and a real data set. The motivation behind this study was to give practitioners a better understanding, to facilitate the informed selection of an appropriate haplotype-based model, and to improve the interpretability of the models.

Analysis of Korean GDP by unobserved components model

  • 성병찬;이승경
    • Journal of the Korean Data and Information Science Society / Vol. 22, No. 5 / pp. 829-837 / 2011
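  • In this paper, we analyze Korean gross domestic product (GDP) time series data using an unobserved components model. Taking advantage of the model's ability to encompass both stochastic and deterministic components, we model the time series in a wider variety of forms and compare its forecasting performance with that of exponential smoothing and the Box-Jenkins ARIMA model. In two-year-ahead forecasts of the GDP data, the unobserved components model is shown to be superior.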
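For readers unfamiliar with the exponential smoothing baseline named in the abstract above, here is a minimal sketch of simple exponential smoothing, the most basic member of that family. The abstract does not say which variant the authors used, so this is an assumption made for illustration; the input series and the smoothing constant `alpha` are hypothetical, not data from the paper.

```python
def simple_exponential_smoothing(series, alpha):
    """Return the smoothed levels; the last level is the one-step-ahead forecast."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    level = series[0]  # initialise the level with the first observation
    levels = [level]
    for value in series[1:]:
        # New level = weighted average of the new observation and the old level.
        level = alpha * value + (1.0 - alpha) * level
        levels.append(level)
    return levels

# Hypothetical values, not data from the paper.
levels = simple_exponential_smoothing([10.0, 12.0, 14.0], alpha=0.5)
forecast = levels[-1]
print(levels, forecast)
```

Simple exponential smoothing produces the same forecast for every future horizon, so it carries no trend or seasonal structure; models with explicit trend and seasonal components are one way to relax that.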

Prediction models of rock quality designation during TBM tunnel construction using machine learning algorithms

  • Byeonghyun Hwang;Hangseok Choi;Kibeom Kwon;Young Jin Shin;Minkyu Kang
    • Geomechanics and Engineering / Vol. 38, No. 5 / pp. 507-515 / 2024
  • An accurate estimation of the geotechnical parameters in front of tunnel faces is crucial for the safe construction of underground infrastructure using tunnel boring machines (TBMs). This study was aimed at developing a data-driven model for predicting the rock quality designation (RQD) of the ground formation ahead of tunnel faces. The dataset used for the machine learning (ML) model comprises seven geological and mechanical features and 564 RQD values, obtained from an earth pressure balance (EPB) shield TBM tunneling project beneath the Han River in the Republic of Korea. Four ML algorithms were employed in developing the RQD prediction model: k-nearest neighbor (KNN), support vector regression (SVR), random forest (RF), and extreme gradient boosting (XGB). The grid search and five-fold cross-validation techniques were applied to optimize the prediction performance of the developed model by identifying the optimal hyperparameter combinations. The prediction results revealed that the RF algorithm-based model exhibited superior performance, achieving a root mean square error of 7.38% and coefficient of determination of 0.81. In addition, the Shapley additive explanations (SHAP) approach was adopted to determine the most relevant features, thereby enhancing the interpretability and reliability of the developed model with the RF algorithm. It was concluded that the developed model can successfully predict the RQD of the ground formation ahead of tunnel faces, contributing to safe and efficient tunnel excavation.
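The tuning procedure described in the abstract above (grid search combined with five-fold cross-validation, scored by RMSE and the coefficient of determination) can be sketched in a few lines. The sketch uses a hand-written k-nearest-neighbor regressor, one of the four algorithm families the abstract names, on synthetic data; the data, the grid of `k` values, and all function names are assumptions for illustration and do not reproduce the authors' dataset or implementation.

```python
import math
import random

def knn_predict(train_X, train_y, x, k):
    """Average the targets of the k training points closest to x."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))
    return sum(train_y[i] for i in order[:k]) / k

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def cross_validated_rmse(X, y, k, n_folds=5):
    """Mean validation RMSE over n_folds interleaved folds."""
    n = len(X)
    fold_scores = []
    for fold in range(n_folds):
        val = [i for i in range(n) if i % n_folds == fold]
        train = [i for i in range(n) if i % n_folds != fold]
        tX, ty = [X[i] for i in train], [y[i] for i in train]
        preds = [knn_predict(tX, ty, X[i], k) for i in val]
        fold_scores.append(rmse([y[i] for i in val], preds))
    return sum(fold_scores) / n_folds

# Synthetic data (an assumption): two features and a noisy linear target.
random.seed(0)
X = [[random.random(), random.random()] for _ in range(150)]
y = [3 * a + 2 * b + random.gauss(0, 0.1) for a, b in X]
X_train, y_train, X_test, y_test = X[:120], y[:120], X[120:], y[120:]

# Grid search: keep the k with the lowest cross-validated RMSE on the training set.
grid = [1, 3, 5, 9, 15]
cv_scores = {k: cross_validated_rmse(X_train, y_train, k) for k in grid}
best_k = min(cv_scores, key=cv_scores.get)

# Final check on data that played no part in choosing k.
test_pred = [knn_predict(X_train, y_train, x, best_k) for x in X_test]
print(best_k, rmse(y_test, test_pred), r_squared(y_test, test_pred))
```

Holding the last 30 points out of the grid search keeps the reported test scores from being flattered by the hyperparameter choice.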

Enhancing prediction accuracy of concrete compressive strength using stacking ensemble machine learning

  • Yunpeng Zhao;Dimitrios Goulias;Setare Saremi
    • Computers and Concrete / Vol. 32, No. 3 / pp. 233-246 / 2023
  • Accurate prediction of concrete compressive strength can minimize the need for extensive, time-consuming, and costly mixture optimization testing and analysis. This study attempts to enhance the prediction accuracy of compressive strength using stacking ensemble machine learning (ML) with feature engineering techniques. Seven alternative ML models of increasing complexity were implemented and compared: linear regression, SVM, decision tree, multilayer perceptron, random forest, XGBoost and AdaBoost. To further improve the prediction accuracy, an ML pipeline was proposed in which the feature engineering technique was implemented and a two-layer stacked model was developed. The k-fold cross-validation approach was employed to optimize model parameters and train the stacked model. The stacked model showed superior performance in predicting concrete compressive strength, with a coefficient of determination (R²) of 0.985. Feature (i.e., variable) importance was determined to demonstrate how useful the synthetic features are in prediction and to provide better interpretability of the data and the model. The methodology in this study promotes a more thorough assessment of alternative ML algorithms rather than a focus on any single ML model type for concrete compressive strength prediction.
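A two-layer stacked model trained with k-fold cross-validation, as the abstract above describes, can be sketched as follows. The first layer produces out-of-fold predictions from each base learner, and the second layer fits weights over those predictions. This is a deliberately small sketch: the two base learners (a least-squares line and a k-nearest-neighbor average), the no-intercept linear meta-learner, and the synthetic data are all assumptions, not the seven models or the concrete dataset used in the paper.

```python
import random

def fit_line(xs, ys):
    """Least-squares line; returns a predictor function."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def fit_knn(xs, ys, k=5):
    """k-nearest-neighbor average; returns a predictor function."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

BASE_LEARNERS = [fit_line, fit_knn]

def out_of_fold_predictions(xs, ys, n_folds=5):
    """Layer 1: each sample is predicted by models that never saw it."""
    n = len(xs)
    oof = [[0.0] * len(BASE_LEARNERS) for _ in range(n)]
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        tx, ty = [xs[i] for i in train], [ys[i] for i in train]
        models = [fit(tx, ty) for fit in BASE_LEARNERS]
        for i in range(fold, n, n_folds):
            oof[i] = [m(xs[i]) for m in models]
    return oof

def fit_meta_weights(oof, ys):
    """Layer 2: least-squares weights (no intercept) over two base predictions."""
    a11 = sum(p[0] * p[0] for p in oof)
    a12 = sum(p[0] * p[1] for p in oof)
    a22 = sum(p[1] * p[1] for p in oof)
    b1 = sum(p[0] * y for p, y in zip(oof, ys))
    b2 = sum(p[1] * y for p, y in zip(oof, ys))
    det = a11 * a22 - a12 * a12  # solve the 2x2 normal equations by Cramer's rule
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det

# Synthetic data (an assumption): a noisy quadratic in one feature.
random.seed(0)
xs = [random.uniform(0, 3) for _ in range(120)]
ys = [x ** 2 + random.gauss(0, 0.2) for x in xs]

oof = out_of_fold_predictions(xs, ys)
w_line, w_knn = fit_meta_weights(oof, ys)

# Refit the base learners on all data for use at prediction time.
final_models = [fit(xs, ys) for fit in BASE_LEARNERS]

def stacked_predict(x):
    p_line, p_knn = (m(x) for m in final_models)
    return w_line * p_line + w_knn * p_knn

print(w_line, w_knn, stacked_predict(1.5))
```

Fitting the meta-learner on out-of-fold predictions, rather than on in-sample predictions, keeps it from rewarding a base learner simply for memorizing the training data.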

Structural Design of Radial Basis Function-based Polynomial Neural Networks by Using Multiobjective Particle Swarm Optimization

  • 김욱동;오성권
    • 전기학회논문지 / Vol. 61, No. 1 / pp. 135-142 / 2012
  • In this paper, we propose a new architecture, a radial basis function-based polynomial neural network classifier, that consists of heterogeneous neural networks, namely radial basis function neural networks and polynomial neural networks. The underlying architecture of the proposed model is that of polynomial neural networks (PNNs), but the polynomial neurons in the PNNs are composed of fuzzy c-means-based radial basis function neural networks (FCM-based RBFNNs) instead of the conventional polynomial function. We use PNNs to find the optimal local models and RBFNNs to handle high-dimensionality problems. In the hidden layer of the RBFNNs, the FCM algorithm is used to produce clusters based on the similarity of the given dataset. The proposed model depends on parameters such as the number of input variables in the PNNs, the number of clusters and the fuzzification coefficient in FCM, and the polynomial type in the RBFNNs. Multiobjective particle swarm optimization using crowding distance (MoPSO-CD) is used to carry out both structural and parametric optimization of the proposed networks. MoPSO is introduced to address not only model performance but also complexity and interpretability. The usefulness of the proposed model as a classifier is evaluated on benchmark datasets such as iris and liver.

Aeroengine performance degradation prediction method considering operating conditions

  • Bangcheng Zhang;Shuo Gao;Zhong Zheng;Guanyu Hu
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 9 / pp. 2314-2333 / 2023
  • Predicting the performance degradation of complex electromechanical systems is important. Among existing performance degradation prediction models, the belief rule base (BRB) is a model that deals with quantitative data and qualitative information under uncertainty. However, when analyzing dynamic systems whose observable indicators change frequently with time and working conditions, as in the prediction of aeroengine performance degradation under varying working conditions, the traditional BRB cannot adapt to those frequent changes. To address this problem, this paper puts forward a new hidden belief rule base (HBRB) prediction method, in which the performance of the aeroengine is regarded as hidden behavior and the operating conditions are used as observable indicators of the HBRB model to describe that hidden behavior, so that performance degradation can be predicted at different times and under different operating conditions. A case study of performance degradation prediction on turbofan aeroengine simulation experiments demonstrates the advantages of the HBRB model, and the results confirm the effectiveness and practicability of the method. Furthermore, the method is compared with other advanced forecasting methods. The results show that the model generates better predictions in terms of accuracy and interpretability.

K-Means-Based Polynomial-Radial Basis Function Neural Network Using Space Search Algorithm: Design and Comparative Studies

  • 김욱동;오성권
    • 제어로봇시스템학회논문지 / Vol. 17, No. 8 / pp. 731-738 / 2011
  • In this paper, we introduce an advanced architecture of K-Means clustering-based polynomial Radial Basis Function Neural Networks (p-RBFNNs) designed with the aid of SSOA (Space Search Optimization Algorithm) and develop a comprehensive design methodology supporting their construction. To design the optimized p-RBFNNs, the center value of each receptive field is first determined by running the K-Means clustering algorithm, and then the center value and the width of the corresponding receptive field are optimized through SSOA. The connections (weights) of the proposed p-RBFNNs are of functional character and are realized by considering three types of polynomials. In addition, WLSE (Weighted Least Square Estimation) is used to estimate the polynomial coefficients (serving as functional connections of the network) of each node, starting from the output node. As a result, the local learning capability and the interpretability of the proposed model are improved. The proposed model is illustrated using a nonlinear function and the NOx Machine Learning dataset. A comparative analysis reveals that the proposed model exhibits higher accuracy and better predictive capability than some previous models available in the literature.