• Title/Summary/Keyword: Ensemble models


Research on Insurance Claim Prediction Using Ensemble Learning-Based Dynamic Weighted Allocation Model (앙상블 러닝 기반 동적 가중치 할당 모델을 통한 보험금 예측 인공지능 연구)

  • Jong-Seok Choi
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.17 no.4 / pp.221-228 / 2024
  • Predicting insurance claims is a key task for insurance companies to manage risks and maintain financial stability. Accurate insurance claim predictions enable insurers to set appropriate premiums, reduce unexpected losses, and improve the quality of customer service. This study aims to enhance the performance of insurance claim prediction models by applying ensemble learning techniques. The predictive performance of Random Forest, Gradient Boosting Machine (GBM), XGBoost, Stacking, and the proposed Dynamic Weighted Ensemble (DWE) model was compared and analyzed. Model performance was evaluated using Mean Absolute Error (MAE), Mean Squared Error (MSE), and the Coefficient of Determination (R²). Experimental results showed that the DWE model outperformed the other models on these evaluation metrics, achieving the best predictive performance by combining the prediction results of Random Forest, XGBoost, LR, and LightGBM. This study demonstrates that ensemble learning techniques are effective in improving the accuracy of insurance claim predictions and suggests the potential use of AI-based predictive models in the insurance industry.
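The abstract does not say how the DWE model assigns its weights. As a minimal sketch of one common weighting scheme (an assumption, not the paper's procedure), each base model can be weighted in proportion to the inverse of its validation MAE. The model names and all numbers below are hypothetical.

```python
def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def inverse_error_weights(y_val, val_preds, eps=1e-9):
    """Weight each base model in proportion to 1 / (validation MAE)."""
    inv = {name: 1.0 / (mean_absolute_error(y_val, preds) + eps)
           for name, preds in val_preds.items()}
    total = sum(inv.values())
    return {name: w / total for name, w in inv.items()}


def weighted_predict(weights, test_preds):
    """Combine per-model predictions as a weighted average."""
    n = len(next(iter(test_preds.values())))
    return [sum(weights[m] * test_preds[m][i] for m in weights)
            for i in range(n)]


# Hypothetical validation targets and base-model predictions.
y_val = [100.0, 200.0, 300.0]
val_preds = {
    "model_a": [110.0, 190.0, 310.0],  # validation MAE = 10
    "model_b": [140.0, 240.0, 260.0],  # validation MAE = 40
}
weights = inverse_error_weights(y_val, val_preds)

# Hypothetical predictions for one unseen claim.
test_preds = {"model_a": [150.0], "model_b": [200.0]}
ensemble = weighted_predict(weights, test_preds)
print(weights, ensemble)
```

Recomputing the weights whenever new validation data arrives is what would make such a scheme "dynamic"; the more accurate model then dominates the combined prediction.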

Bankruptcy prediction using ensemble SVM model (앙상블 SVM 모형을 이용한 기업 부도 예측)

  • Choi, Ha Na;Lim, Dong Hoon
    • Journal of the Korean Data and Information Science Society / v.24 no.6 / pp.1113-1125 / 2013
  • Corporate bankruptcy prediction has long been an important topic in accounting and finance. Several data mining techniques have been used for bankruptcy prediction. However, a single model has many limitations when applied to real classification problems. This study proposes an ensemble SVM (support vector machine) model that combines SVM models with different kernel functions. The ensemble model is built and evaluated using a v-fold cross-validation approach. The k top-performing models are recruited into the ensemble, and classification is then carried out by majority vote of the ensemble. In this paper, we investigate the performance of the ensemble SVM classifier in terms of accuracy, error rate, sensitivity, specificity, ROC curve, and AUC, and compare it with single SVM classifiers on a financial ratios dataset and a simulation dataset. The results confirm the advantages of our method: it is robust while providing good performance.
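The selection-and-voting step described above can be sketched as follows. This is an illustrative sketch only: the kernel names, cross-validation scores, and predicted labels are hypothetical, and training the SVMs themselves is omitted.

```python
from collections import Counter


def select_top_k(cv_scores, k):
    """Return the names of the k models with the highest CV score."""
    ranked = sorted(cv_scores, key=cv_scores.get, reverse=True)
    return ranked[:k]


def majority_vote(predictions, members):
    """Per-sample majority vote over the selected members' class labels."""
    n = len(predictions[members[0]])
    voted = []
    for i in range(n):
        counts = Counter(predictions[m][i] for m in members)
        voted.append(counts.most_common(1)[0][0])
    return voted


# Hypothetical cross-validation accuracy of SVMs with different kernels.
cv_scores = {"linear": 0.81, "rbf": 0.88, "poly": 0.84, "sigmoid": 0.70}

# Hypothetical class labels (1 = bankrupt, 0 = healthy) for four firms.
predictions = {
    "linear":  [1, 0, 1, 0],
    "rbf":     [1, 1, 0, 0],
    "poly":    [1, 1, 1, 0],
    "sigmoid": [0, 0, 0, 1],
}

members = select_top_k(cv_scores, k=3)
labels = majority_vote(predictions, members)
print(members, labels)
```

Choosing an odd k avoids ties in a two-class problem such as this one.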

Performance-based drift prediction of reinforced concrete shear wall using bagging ensemble method

  • Bu-Seog Ju;Shinyoung Kwag;Sangwoo Lee
    • Nuclear Engineering and Technology / v.55 no.8 / pp.2747-2756 / 2023
  • Reinforced concrete (RC) shear walls are among the civil structures in nuclear power plants that resist lateral loads such as earthquake and wind loads. Risk-informed, performance-based regulation in the nuclear industry requires considering possible accidents and determining the desired performance of structures. As a result, rather than predicting only the ultimate capacity of structures, predicting structural performance under different damage states or accident scenarios is increasingly needed. This study aims to develop machine-learning models that predict the drifts of RC shear walls according to damage limit states. The damage limit states are divided into four categories: the onset of cracking, yielding of rebars, crushing of concrete, and structural failure. Data on the drift of shear walls at each damage state are collected from existing studies, and four regression machine-learning models are trained on the datasets. In addition, the bagging ensemble method is applied to improve the accuracy of the individual machine-learning models. The developed models are intended to predict, in advance, the drifts of shear walls with various cross-sections at designated damage limit states and to help determine repair methods according to the damage level of the shear walls.
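Bagging (bootstrap aggregating) trains each base model on a resample of the training data drawn with replacement and averages their predictions. The sketch below illustrates the idea with a one-feature least-squares line as the base regressor; the base learner, the seed, and the synthetic data are assumptions for illustration and are not the four regression models or the drift data used in the study.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a * x + b for a single feature."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def bagging_fit(xs, ys, n_estimators=25, seed=0):
    """Fit one line per bootstrap resample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    while len(models) < n_estimators:
        idx = [rng.randrange(n) for _ in range(n)]
        if len({xs[i] for i in idx}) < 2:
            continue  # a line needs at least two distinct x values; redraw
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Average the predictions of all fitted base models."""
    return sum(a * x + b for a, b in models) / len(models)


# Synthetic, noise-free data on the line y = 2x + 1 (hypothetical).
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0 * x + 1.0 for x in xs]

models = bagging_fit(xs, ys)
pred = bagging_predict(models, 6.0)
print(len(models), pred)
```

On noisy data the individual fits differ from resample to resample, and averaging them reduces the variance of the final prediction.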

Wood Species Classification Utilizing Ensembles of Convolutional Neural Networks Established by Near-Infrared Spectra and Images Acquired from Korean Softwood Lumber

  • Yang, Sang-Yun;Lee, Hyung Gu;Park, Yonggun;Chung, Hyunwoo;Kim, Hyunbin;Park, Se-Yeong;Choi, In-Gyu;Kwon, Ohkyung;Yeo, Hwanmyeong
    • Journal of the Korean Wood Science and Technology / v.47 no.4 / pp.385-392 / 2019
  • In our previous study, we investigated the use of ensemble models based on LeNet and MiniVGGNet to classify images of the transverse and longitudinal surfaces of five Korean softwoods (cedar, cypress, Korean pine, Korean red pine, and larch). Although that approach achieved an average F1 score of more than 98%, the classification performance for longitudinal surface images was still lower than that for transverse surface images. In this study, ensemble methods combining two different convolutional neural network models (LeNet3 for smartphone camera images and NIRNet for NIR spectra) were applied to lumber species classification. Experimentally, the best classification performance was obtained by the averaging ensemble of LeNet3 and NIRNet. The average F1 scores of the individual LeNet3 and NIRNet models were 91.98% and 85.94%, respectively. With the averaging ensemble of LeNet3 and NIRNet, the average F1 score increased to 95.31%.
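An averaging ensemble of two classifiers can be sketched as the element-wise mean of their class-probability vectors followed by an argmax. The probability values below are hypothetical and merely stand in for the outputs of an image model and an NIR-spectrum model.

```python
def average_ensemble(prob_a, prob_b):
    """Average two class-probability vectors; return (best index, averages)."""
    avg = [(a + b) / 2.0 for a, b in zip(prob_a, prob_b)]
    best = max(range(len(avg)), key=avg.__getitem__)
    return best, avg


species = ["cedar", "cypress", "Korean pine", "Korean red pine", "larch"]

# Hypothetical softmax outputs for one lumber sample.
image_probs = [0.10, 0.40, 0.35, 0.10, 0.05]  # image model favors cypress
nir_probs = [0.05, 0.15, 0.60, 0.15, 0.05]    # NIR model favors Korean pine

best, avg = average_ensemble(image_probs, nir_probs)
predicted = species[best]
print(predicted, avg)
```

In this example the image model alone would pick cypress, but the confident NIR prediction shifts the averaged result to Korean pine.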

Two Stage Deep Learning Based Stacked Ensemble Model for Web Application Security

  • Sevri, Mehmet;Karacan, Hacer
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.2 / pp.632-657 / 2022
  • Detecting web attacks is a major challenge, and simple models are observed to suffer from low sensitivity or high false-positive rates. In this study, we aim to develop a robust two-stage, deep learning based stacked ensemble web application firewall (WAF). Normal/abnormal classification is carried out in the first stage of the proposed WAF model. Classification of the types of abnormal traffic is deferred to the second stage and carried out using an integrated stacked ensemble model. In this way, clients' requests can be served without delay, and attack types can be detected with high sensitivity. In addition to the high accuracy of the proposed model, high generalization of the ensemble model is achieved through statistical similarity and diversity analyses. Within the study, a comprehensive, up-to-date, and robust multi-class web anomaly dataset named GAZI-HTTP is created to reflect real-world conditions. The performance of the proposed WAF model is compared with state-of-the-art deep learning models and previous studies using the benchmark dataset. The proposed two-stage model achieved multi-class detection rates of 97.43% and 94.77% for GAZI-HTTP and ECML-PKDD, respectively.

Accuracy Assessment of Land-Use Land-Cover Classification Using Semantic Segmentation-Based Deep Learning Model and RapidEye Imagery (RapidEye 위성영상과 Semantic Segmentation 기반 딥러닝 모델을 이용한 토지피복분류의 정확도 평가)

  • Woodam Sim;Jong Su Yim;Jung-Soo Lee
    • Korean Journal of Remote Sensing / v.39 no.3 / pp.269-282 / 2023
  • The purpose of this study was to construct land cover maps using deep learning models and to select the optimal model for land cover classification by adjusting dataset properties such as input image size and stride. Two deep learning models with an encoder-decoder network, U-net and DeeplabV3+, were used, together with an Ensemble model that combines the two. RapidEye satellite images served as input images, and raster label images based on the six land-use categories of the Intergovernmental Panel on Climate Change served as ground truth. This study focused on improving dataset quality to enhance the accuracy of the deep learning models and constructed twelve land cover maps from the combinations of three deep learning models (U-net, DeeplabV3+, and Ensemble), two input image sizes (64 × 64 pixels and 256 × 256 pixels), and two stride rates (50% and 100%). An accuracy evaluation of the deep learning-based land cover maps against the label images showed that the U-net and DeeplabV3+ models had high accuracy, with overall accuracy values of approximately 87.9% and 89.8%, respectively, and kappa coefficients of over 72%. In addition, applying the Ensemble and stride to the deep learning models yielded a maximum accuracy increase of approximately 3% and reduced boundary inconsistency, a known problem of semantic segmentation-based deep learning models.
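Overall accuracy and the kappa coefficient, the two metrics reported above, are both computed from a confusion matrix: overall accuracy is the observed agreement p_o, and Cohen's kappa is (p_o − p_e) / (1 − p_e), where p_e is the agreement expected by chance. The following sketch uses a hypothetical two-class confusion matrix, not data from the study.

```python
def overall_accuracy_and_kappa(confusion):
    """Compute overall accuracy and Cohen's kappa from a square matrix.

    confusion[i][j] = number of pixels with true class i predicted as j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: share of pixels on the diagonal.
    po = sum(confusion[i][i] for i in range(k)) / total
    row_tot = [sum(row) for row in confusion]
    col_tot = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement from the row and column marginals.
    pe = sum(row_tot[i] * col_tot[i] for i in range(k)) / (total * total)
    return po, (po - pe) / (1.0 - pe)


# Hypothetical confusion matrix for two land-cover classes.
confusion = [
    [40, 10],
    [10, 40],
]
accuracy, kappa = overall_accuracy_and_kappa(confusion)
print(accuracy, kappa)
```

Because kappa discounts chance agreement, it is lower than overall accuracy whenever the classifier is imperfect, which is consistent with the gap between the accuracy and kappa values in the abstract.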

Corporate Innovation and Business Performance Prediction Using Ensemble Learning (앙상블 학습을 이용한 기업혁신과 경영성과 예측)

  • An, Kyung Min;Lee, Young Chan
    • The Journal of Information Systems / v.30 no.4 / pp.247-275 / 2021
  • Purpose: This study attempted to predict corporate innovation and business performance using ensemble learning. Design/methodology/approach: Ensemble techniques combine several weak models into a more robust model with improved performance. Among ensemble techniques, XGBoost, LightGBM, and CatBoost were used in this study and were compared with traditional machine learning methods. Findings: The research results are summarized as follows. First, the type of innovation is expanding from technical innovation to non-technical areas. Second, LightGBM performed best for radical innovation prediction, and XGBoost performed best for incremental innovation prediction. Third, CatBoost performed best for firm performance prediction. Although there was no significant difference in predictive power among the ensemble techniques, comparative analysis was found to be necessary to identify the better-performing model.

Sparsity Increases Uncertainty Estimation in Deep Ensemble

  • Dorjsembe, Uyanga;Lee, Ju Hong;Choi, Bumghi;Song, Jae Won
    • Proceedings of the Korea Information Processing Society Conference / 2021.05a / pp.373-376 / 2021
  • Deep neural networks have achieved almost human-level results in various tasks and have become popular across artificial intelligence domains. Uncertainty estimation is in demand because of the black-box, point-estimate behavior of deep learning. The deep ensemble provides increased accuracy and estimated uncertainty; however, because its size grows linearly with the number of members, the deep ensemble is infeasible for memory-intensive tasks. To address this problem, we combined model pruning and quantization with a deep ensemble and analyzed the effect in terms of uncertainty metrics. We empirically showed that the ensemble members' disagreement increases with pruning, which makes models sparser by zeroing irrelevant parameters. Increased disagreement implies increased uncertainty, which helps in making more robust predictions. Accordingly, an energy-efficient compressed deep ensemble is appropriate for memory-intensive and uncertainty-aware tasks.
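The two ingredients mentioned above, pruning and member disagreement, can be sketched in a few lines. The abstract does not specify the pruning criterion or the disagreement metric, so this sketch assumes magnitude pruning and a simple pairwise label-disagreement rate; the weights and labels are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero the fraction `sparsity` of weights with the smallest magnitude."""
    n_prune = int(len(weights) * sparsity)
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def disagreement(member_labels):
    """Share of (member pair, sample) combinations whose labels differ."""
    m = len(member_labels)
    n = len(member_labels[0])
    differing = 0
    comparisons = 0
    for a in range(m):
        for b in range(a + 1, m):
            comparisons += n
            differing += sum(
                1 for x, y in zip(member_labels[a], member_labels[b]) if x != y
            )
    return differing / comparisons


# Hypothetical weight vector of one ensemble member.
pruned = magnitude_prune([0.5, -0.01, 0.3, 0.02], sparsity=0.5)

# Hypothetical predicted labels of three members on three samples.
rate = disagreement([[0, 1, 1], [0, 1, 0], [1, 1, 0]])
print(pruned, rate)
```

Comparing the disagreement rate of an ensemble before and after pruning its members is one way to observe the effect the abstract reports.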

Time Series Forecasting Based on Modified Ensemble Algorithm (시계열 예측의 변형된 ENSEMBLE ALGORITHM)

  • Kim Yon Hyong;Kim Jae Hoon
    • The Korean Journal of Applied Statistics / v.18 no.1 / pp.137-146 / 2005
  • Neural networks are among the most notable forecasting techniques, and they usually provide more powerful forecasting models than traditional time series techniques. When the Ensemble technique is employed in a forecasting model, an initial distribution must be provided. Usually the uniform distribution is assumed so that the initialization is noninformative. However, a sequential, informative initialization based on the data would be expected to reduce forecasting error further than the uniform initialization. In this note, a modified Ensemble algorithm using a sequential initial probability is developed. The sequential distribution is designed to place more weight on recent data.

A New Ensemble Machine Learning Technique with Multiple Stacking (다중 스태킹을 가진 새로운 앙상블 학습 기법)

  • Lee, Su-eun;Kim, Han-joon
    • The Journal of Society for e-Business Studies / v.25 no.3 / pp.1-13 / 2020
  • Machine learning refers to a model generation technique that solves specific problems by generalizing from given data. To generate a high-performance model, high-quality training data and learning algorithms for the generalization process must be prepared. As one way of improving the performance of the learned model, the Ensemble technique generates multiple models rather than a single model; it includes the bagging, boosting, and stacking learning techniques. This paper proposes a new Ensemble technique with multiple stacking that outperforms the conventional stacking technique. The learning structure of the multiple stacking ensemble technique is similar to that of deep learning: each layer is composed of a combination of stacking models, and the number of layers is increased so as to minimize the misclassification rate of each layer. Experiments on four types of datasets show that the proposed method outperforms the existing ones.