• Title/Summary/Keyword: Ensemble Learning

Search Result 369, Processing Time 0.026 seconds

Optimal Selection of Classifier Ensemble Using Genetic Algorithms (유전자 알고리즘을 이용한 분류자 앙상블의 최적 선택)

  • Kim, Myung-Jong
    • Journal of Intelligence and Information Systems
    • /
    • v.16 no.4
    • /
    • pp.99-112
    • /
    • 2010
  • Ensemble learning is a method for improving the performance of classification and prediction algorithms. It is a method for finding a highly accurateclassifier on the training set by constructing and combining an ensemble of weak classifiers, each of which needs only to be moderately accurate on the training set. Ensemble learning has received considerable attention from machine learning and artificial intelligence fields because of its remarkable performance improvement and flexible integration with the traditional learning algorithms such as decision tree (DT), neural networks (NN), and SVM, etc. In those researches, all of DT ensemble studies have demonstrated impressive improvements in the generalization behavior of DT, while NN and SVM ensemble studies have not shown remarkable performance as shown in DT ensembles. Recently, several works have reported that the performance of ensemble can be degraded where multiple classifiers of an ensemble are highly correlated with, and thereby result in multicollinearity problem, which leads to performance degradation of the ensemble. They have also proposed the differentiated learning strategies to cope with performance degradation problem. Hansen and Salamon (1990) insisted that it is necessary and sufficient for the performance enhancement of an ensemble that the ensemble should contain diverse classifiers. Breiman (1996) explored that ensemble learning can increase the performance of unstable learning algorithms, but does not show remarkable performance improvement on stable learning algorithms. Unstable learning algorithms such as decision tree learners are sensitive to the change of the training data, and thus small changes in the training data can yield large changes in the generated classifiers. Therefore, ensemble with unstable learning algorithms can guarantee some diversity among the classifiers. To the contrary, stable learning algorithms such as NN and SVM generate similar classifiers in spite of small changes of the training data, and thus the correlation among the resulting classifiers is very high. This high correlation results in multicollinearity problem, which leads to performance degradation of the ensemble. Kim,s work (2009) showedthe performance comparison in bankruptcy prediction on Korea firms using tradition prediction algorithms such as NN, DT, and SVM. It reports that stable learning algorithms such as NN and SVM have higher predictability than the unstable DT. Meanwhile, with respect to their ensemble learning, DT ensemble shows the more improved performance than NN and SVM ensemble. Further analysis with variance inflation factor (VIF) analysis empirically proves that performance degradation of ensemble is due to multicollinearity problem. It also proposes that optimization of ensemble is needed to cope with such a problem. This paper proposes a hybrid system for coverage optimization of NN ensemble (CO-NN) in order to improve the performance of NN ensemble. Coverage optimization is a technique of choosing a sub-ensemble from an original ensemble to guarantee the diversity of classifiers in coverage optimization process. CO-NN uses GA which has been widely used for various optimization problems to deal with the coverage optimization problem. The GA chromosomes for the coverage optimization are encoded into binary strings, each bit of which indicates individual classifier. The fitness function is defined as maximization of error reduction and a constraint of variance inflation factor (VIF), which is one of the generally used methods to measure multicollinearity, is added to insure the diversity of classifiers by removing high correlation among the classifiers. We use Microsoft Excel and the GAs software package called Evolver. Experiments on company failure prediction have shown that CO-NN is effectively applied in the stable performance enhancement of NNensembles through the choice of classifiers by considering the correlations of the ensemble. The classifiers which have the potential multicollinearity problem are removed by the coverage optimization process of CO-NN and thereby CO-NN has shown higher performance than a single NN classifier and NN ensemble at 1% significance level, and DT ensemble at 5% significance level. However, there remain further research issues. First, decision optimization process to find optimal combination function should be considered in further research. Secondly, various learning strategies to deal with data noise should be introduced in more advanced further researches in the future.

Corporate Innovation and Business Performance Prediction Using Ensemble Learning (앙상블 학습을 이용한 기업혁신과 경영성과 예측)

  • An, Kyung Min;Lee, Young Chan
    • The Journal of Information Systems
    • /
    • v.30 no.4
    • /
    • pp.247-275
    • /
    • 2021
  • Purpose This study attempted to predict corporate innovation and business performance using ensemble learning. Design/methodology/approach The ensemble techniques uses weak learning to create robust learning, which combines several weak models to derive improved performance. In this study, XGboost, LightGBM, and Catboost were used among ensemble techniques. It was compared and evaluated with traditional machine learning methods. Findings The summary of the research results is as follows. First, the type of innovation is expanding from technical innovation to non-technical areas. Second, it was confirmed that LightGBM performed best for radical innovation prediction, and XGboost performed best for incremental innovation prediction. Third, Catboost performed best for firm performance prediction. Although there was no significant difference in predictive power between ensemble techniques, we found that comparative analysis was necessary to confirm better prediction performance.

A Study on Classification Performance Analysis of Convolutional Neural Network using Ensemble Learning Algorithm (앙상블 학습 알고리즘을 이용한 컨벌루션 신경망의 분류 성능 분석에 관한 연구)

  • Park, Sung-Wook;Kim, Jong-Chan;Kim, Do-Yeon
    • Journal of Korea Multimedia Society
    • /
    • v.22 no.6
    • /
    • pp.665-675
    • /
    • 2019
  • In this paper, we compare and analyze the classification performance of deep learning algorithm Convolutional Neural Network(CNN) ac cording to ensemble generation and combining techniques. We used several CNN models(VGG16, VGG19, DenseNet121, DenseNet169, DenseNet201, ResNet18, ResNet34, ResNet50, ResNet101, ResNet152, GoogLeNet) to create 10 ensemble generation combinations and applied 6 combine techniques(average, weighted average, maximum, minimum, median, product) to the optimal combination. Experimental results, DenseNet169-VGG16-GoogLeNet combination in ensemble generation, and the product rule in ensemble combination showed the best performance. Based on this, it was concluded that ensemble in different models of high benchmarking scores is another way to get good results.

Prediction of electricity consumption in A hotel using ensemble learning with temperature (앙상블 학습과 온도 변수를 이용한 A 호텔의 전력소모량 예측)

  • Kim, Jaehwi;Kim, Jaehee
    • The Korean Journal of Applied Statistics
    • /
    • v.32 no.2
    • /
    • pp.319-330
    • /
    • 2019
  • Forecasting the electricity consumption through analyzing the past electricity consumption a advantageous for energy planing and policy. Machine learning is widely used as a method to predict electricity consumption. Among them, ensemble learning is a method to avoid the overfitting of models and reduce variance to improve prediction accuracy. However, ensemble learning applied to daily data shows the disadvantages of predicting a center value without showing a peak due to the characteristics of ensemble learning. In this study, we overcome the shortcomings of ensemble learning by considering the temperature trend. We compare nine models and propose a model using random forest with the linear trend of temperature.

An ensemble learning based Bayesian model updating approach for structural damage identification

  • Guangwei Lin;Yi Zhang;Enjian Cai;Taisen Zhao;Zhaoyan Li
    • Smart Structures and Systems
    • /
    • v.32 no.1
    • /
    • pp.61-81
    • /
    • 2023
  • This study presents an ensemble learning based Bayesian model updating approach for structural damage diagnosis. In the developed framework, the structure is initially decomposed into a set of substructures. The autoregressive moving average (ARMAX) model is established first for structural damage localization based structural motion equation. The wavelet packet decomposition is utilized to extract the damage-sensitive node energy in different frequency bands for constructing structural surrogate models. Four methods, including Kriging predictor (KRG), radial basis function neural network (RBFNN), support vector regression (SVR), and multivariate adaptive regression splines (MARS), are selected as candidate structural surrogate models. These models are then resampled by bootstrapping and combined to obtain an ensemble model by probabilistic ensemble. Meanwhile, the maximum entropy principal is adopted to search for new design points for sample space updating, yielding a more robust ensemble model. Through the iterations, a framework of surrogate ensemble learning based model updating with high model construction efficiency and accuracy is proposed. The specificities of the method are discussed and investigated in a case study.

Transfer Learning-Based Feature Fusion Model for Classification of Maneuver Weapon Systems

  • Jinyong Hwang;You-Rak Choi;Tae-Jin Park;Ji-Hoon Bae
    • Journal of Information Processing Systems
    • /
    • v.19 no.5
    • /
    • pp.673-687
    • /
    • 2023
  • Convolutional neural network-based deep learning technology is the most commonly used in image identification, but it requires large-scale data for training. Therefore, application in specific fields in which data acquisition is limited, such as in the military, may be challenging. In particular, the identification of ground weapon systems is a very important mission, and high identification accuracy is required. Accordingly, various studies have been conducted to achieve high performance using small-scale data. Among them, the ensemble method, which achieves excellent performance through the prediction average of the pre-trained models, is the most representative method; however, it requires considerable time and effort to find the optimal combination of ensemble models. In addition, there is a performance limitation in the prediction results obtained by using an ensemble method. Furthermore, it is difficult to obtain the ensemble effect using models with imbalanced classification accuracies. In this paper, we propose a transfer learning-based feature fusion technique for heterogeneous models that extracts and fuses features of pre-trained heterogeneous models and finally, fine-tunes hyperparameters of the fully connected layer to improve the classification accuracy. The experimental results of this study indicate that it is possible to overcome the limitations of the existing ensemble methods by improving the classification accuracy through feature fusion between heterogeneous models based on transfer learning.

A New Ensemble Machine Learning Technique with Multiple Stacking (다중 스태킹을 가진 새로운 앙상블 학습 기법)

  • Lee, Su-eun;Kim, Han-joon
    • The Journal of Society for e-Business Studies
    • /
    • v.25 no.3
    • /
    • pp.1-13
    • /
    • 2020
  • Machine learning refers to a model generation technique that can solve specific problems from the generalization process for given data. In order to generate a high performance model, high quality training data and learning algorithms for generalization process should be prepared. As one way of improving the performance of model to be learned, the Ensemble technique generates multiple models rather than a single model, which includes bagging, boosting, and stacking learning techniques. This paper proposes a new Ensemble technique with multiple stacking that outperforms the conventional stacking technique. The learning structure of multiple stacking ensemble technique is similar to the structure of deep learning, in which each layer is composed of a combination of stacking models, and the number of layers get increased so as to minimize the misclassification rate of each layer. Through experiments using four types of datasets, we have showed that the proposed method outperforms the exiting ones.

Extreme Learning Machine Ensemble Using Bagging for Facial Expression Recognition

  • Ghimire, Deepak;Lee, Joonwhoan
    • Journal of Information Processing Systems
    • /
    • v.10 no.3
    • /
    • pp.443-458
    • /
    • 2014
  • An extreme learning machine (ELM) is a recently proposed learning algorithm for a single-layer feed forward neural network. In this paper we studied the ensemble of ELM by using a bagging algorithm for facial expression recognition (FER). Facial expression analysis is widely used in the behavior interpretation of emotions, for cognitive science, and social interactions. This paper presents a method for FER based on the histogram of orientation gradient (HOG) features using an ELM ensemble. First, the HOG features were extracted from the face image by dividing it into a number of small cells. A bagging algorithm was then used to construct many different bags of training data and each of them was trained by using separate ELMs. To recognize the expression of the input face image, HOG features were fed to each trained ELM and the results were combined by using a majority voting scheme. The ELM ensemble using bagging improves the generalized capability of the network significantly. The two available datasets (JAFFE and CK+) of facial expressions were used to evaluate the performance of the proposed classification system. Even the performance of individual ELM was smaller and the ELM ensemble using a bagging algorithm improved the recognition performance significantly.

Ensemble Design of Machine Learning Technigues: Experimental Verification by Prediction of Drifter Trajectory (앙상블을 이용한 기계학습 기법의 설계: 뜰개 이동경로 예측을 통한 실험적 검증)

  • Lee, Chan-Jae;Kim, Yong-Hyuk
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.8 no.3
    • /
    • pp.57-67
    • /
    • 2018
  • The ensemble is a unified approach used for getting better performance by using multiple algorithms in machine learning. In this paper, we introduce boosting and bagging, which have been widely used in ensemble techniques, and design a method using support vector regression, radial basis function network, Gaussian process, and multilayer perceptron. In addition, our experiment was performed by adding a recurrent neural network and MOHID numerical model. The drifter data used for our experimental verification consist of 683 observations in seven regions. The performance of our ensemble technique is verified by comparison with four algorithms each. As verification, mean absolute error was adapted. The presented methods are based on ensemble models using bagging, boosting, and machine learning. The error rate was calculated by assigning the equal weight value and different weight value to each unit model in ensemble. The ensemble model using machine learning showed 61.7% improvement compared to the average of four machine learning technique.

On successive machine learning process for predicting strength and displacement of rectangular reinforced concrete columns subjected to cyclic loading

  • Bu-seog Ju;Shinyoung Kwag;Sangwoo Lee
    • Computers and Concrete
    • /
    • v.32 no.5
    • /
    • pp.513-525
    • /
    • 2023
  • Recently, research on predicting the behavior of reinforced concrete (RC) columns using machine learning methods has been actively conducted. However, most studies have focused on predicting the ultimate strength of RC columns using a regression algorithm. Therefore, this study develops a successive machine learning process for predicting multiple nonlinear behaviors of rectangular RC columns. This process consists of three stages: single machine learning, bagging ensemble, and stacking ensemble. In the case of strength prediction, sufficient prediction accuracy is confirmed even in the first stage. In the case of displacement, although sufficient accuracy is not achieved in the first and second stages, the stacking ensemble model in the third stage performs better than the machine learning models in the first and second stages. In addition, the performance of the final prediction models is verified by comparing the backbone curves and hysteresis loops obtained from predicted outputs with actual experimental data.