• Title/Abstract/Keywords: Ensemble approach

175 search results, processing time 0.023 seconds

Building an Optimal Multi-Model Ensemble of GCMs Using a Hierarchical Bayesian Model (Optimal Multi-Model Ensemble Model Development Using Hierarchical Bayesian Model Based)

  • 권현한;민영미
    • Korea Water Resources Association: Conference Proceedings / Korea Water Resources Association 2009 Conference, Book of Abstracts / pp.1147-1151 / 2009
  • In this study, we address the problem of producing probability forecasts of summer seasonal rainfall on the basis of hindcast experiments from an ensemble of GCMs (cwb, gcps, gdaps, metri, msc_gem, msc_gm2, msc_gm3, msc_sef and ncep). An advanced Hierarchical Bayesian weighting scheme is developed and used to combine the seasonal hindcast ensembles of the nine GCMs. The hindcast period is 23 years, from 1981 to 2003. The simplest approach for combining GCM forecasts is to weight each model equally; this approach is referred to as a pooled ensemble. This study proposes a more complex approach that weights the models spatially and seasonally based on past model performance for rainfall. The Bayesian approach to multi-model combination of GCMs determines the relative weights of each GCM with climatology as the prior. The weights are chosen to maximize the likelihood score of the posterior probabilities. The individual GCM ensembles, simple poolings of three and six models, and the optimally combined multimodel ensemble are compared.
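
The idea of weighting models against a climatological baseline so as to maximize a likelihood score can be illustrated with a small sketch. This is not the authors' hierarchical Bayesian scheme: it assumes tercile-category forecasts, treats climatology as one more source in the mixture, and uses a coarse grid search. The function names and toy hindcast data are hypothetical.

```python
import math
from itertools import product

CLIMATOLOGY = (1 / 3, 1 / 3, 1 / 3)  # equal-odds tercile forecast (assumed baseline)


def combine(weights, model_probs):
    """Mix climatology (weights[0]) with each model's tercile probabilities."""
    sources = [CLIMATOLOGY] + list(model_probs)
    return [sum(w * p[c] for w, p in zip(weights, sources)) for c in range(3)]


def log_likelihood(weights, hindcasts, observed):
    """Sum of log probabilities that the mixture gave to the observed category."""
    total = 0.0
    for model_probs, obs in zip(hindcasts, observed):
        prob = combine(weights, model_probs)[obs]
        total += math.log(max(prob, 1e-12))  # guard against log(0)
    return total


def best_weights(hindcasts, observed, n_models, steps=10):
    """Coarse grid search over non-negative weights that sum to one."""
    best_ll, best_w = -math.inf, None
    for combo in product(range(steps + 1), repeat=n_models):
        if sum(combo) > steps:
            continue
        weights = [(steps - sum(combo)) / steps] + [c / steps for c in combo]
        ll = log_likelihood(weights, hindcasts, observed)
        if ll > best_ll:
            best_ll, best_w = ll, weights
    return best_w


def skilful(obs):
    """Hypothetical model that puts 0.8 on the category that verified."""
    return tuple(0.8 if c == obs else 0.1 for c in range(3))


def poor(obs):
    """Hypothetical model that puts 0.8 on a wrong category."""
    return tuple(0.8 if c == (obs + 1) % 3 else 0.1 for c in range(3))


observed = [0, 2, 1, 2]  # toy verifying categories for four hindcast years
hindcasts = [[skilful(o), poor(o)] for o in observed]
weights = best_weights(hindcasts, observed, n_models=2)
# weights[0] is the climatology weight, weights[1:] are the model weights
```

On this toy data the search gives the largest weight to the first model, since it consistently favours the category that verified.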

Remaining Useful Life Estimation based on Noise Injection and a Kalman Filter Ensemble of modified Bagging Predictors

  • Hung-Cuong Trinh;Van-Huy Pham;Anh H. Vo
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 12 / pp.3242-3265 / 2023
  • Ensuring the reliability of a machinery system involves predicting its remaining useful life (RUL). In most RUL prediction approaches, noise is treated as something to be removed. Nevertheless, noise can be properly utilized to enhance prediction capability. In this paper, we propose a novel RUL prediction approach based on noise injection and a Kalman filter ensemble of modified bagging predictors. Firstly, we propose a new method, named GN-DAFC, to insert Gaussian noise into both the observation and feature spaces of an original training dataset. Secondly, we develop a modified bagging method based on Kalman filter averaging, named KBAG. Then, we develop a new ensemble method, named DKBAG, which is a Kalman filter ensemble of KBAGs. Finally, we propose a novel RUL prediction approach, GN-DAFC-DKBAG, in which the optimal noise-injected training dataset is determined by a GN-DAFC-based search strategy and then fed into a DKBAG model. Our approach is validated on the NASA C-MAPSS dataset of aero-engines. Experimental results show that our approach achieves significantly better performance than a traditional Kalman filter ensemble of single learning models (KESLM) and the original DKBAG approaches. We also found that the optimal noise-injected data could improve the prediction performance of both KESLM and DKBAG. We further compare our approach with two advanced ensemble approaches, and the results indicate that our approach also outperforms both of them. Thus, our approach of combining optimal noise injection and DKBAG provides an effective solution for RUL estimation of machinery systems.

Ensemble approach for improving prediction in kernel regression and classification

  • Han, Sunwoo;Hwang, Seongyun;Lee, Seokho
    • Communications for Statistical Applications and Methods / Vol. 23, No. 4 / pp.355-362 / 2016
  • Ensemble methods often help increase prediction ability in various predictive models by combining multiple weak learners and reducing the variability of the final predictive model. In this work, we demonstrate that ensemble methods also enhance the accuracy of prediction under kernel ridge regression and kernel logistic regression classification. Here we apply bagging and random forests to two kernel-based predictive models and present the procedure by which bagging and random forests can be embedded in kernel-based predictive models. Our proposals are tested on numerous synthetic and real datasets; subsequently, they are compared with plain kernel-based predictive models and their subsampling approach. Numerical studies demonstrate that the ensemble approach outperforms plain kernel-based predictive models.
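
As a rough illustration of embedding bagging in a kernel-based model, the sketch below averages kernel ridge regression fits over bootstrap resamples. It is a minimal NumPy version under assumed settings (RBF kernel, fixed `lam` and `gamma`, toy sine data) and does not reproduce the paper's procedure or its random-forest variant.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and b."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dist)


def fit_krr(x, y, lam=0.1, gamma=1.0):
    """Kernel ridge regression: solve (K + lam * I) alpha = y."""
    k = rbf_kernel(x, x, gamma)
    alpha = np.linalg.solve(k + lam * np.eye(len(x)), y)
    return x, alpha


def predict_krr(model, x_new, gamma=1.0):
    """Predict with a fitted (x_train, alpha) pair."""
    x_train, alpha = model
    return rbf_kernel(x_new, x_train, gamma) @ alpha


def bagged_krr_predict(x, y, x_new, n_estimators=25, lam=0.1, gamma=1.0, seed=0):
    """Average KRR predictions over bootstrap resamples of the training set."""
    rng = np.random.default_rng(seed)
    preds = []
    for _ in range(n_estimators):
        idx = rng.integers(0, len(x), size=len(x))  # sample with replacement
        model = fit_krr(x[idx], y[idx], lam, gamma)
        preds.append(predict_krr(model, x_new, gamma))
    return np.mean(preds, axis=0)


# Toy data: noisy sine curve (hypothetical, for illustration only)
data_rng = np.random.default_rng(42)
x_train = data_rng.uniform(-3, 3, size=(80, 1))
y_train = np.sin(x_train[:, 0]) + data_rng.normal(0, 0.2, size=80)
x_test = np.linspace(-2.5, 2.5, 20).reshape(-1, 1)
y_bagged = bagged_krr_predict(x_train, y_train, x_test)
```

The ridge term `lam` keeps each linear system solvable even when a bootstrap sample contains duplicated points.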

Enhancing Heart Disease Prediction Accuracy through Soft Voting Ensemble Techniques

  • Byung-Joo Kim
    • International Journal of Internet, Broadcasting and Communication / Vol. 16, No. 3 / pp.290-297 / 2024
  • We investigate the efficacy of ensemble learning methods, specifically the soft voting technique, for enhancing heart disease prediction accuracy. Our study uniquely combines Logistic Regression, SVM with RBF Kernel, and Random Forest models in a soft voting ensemble to improve predictive performance. We demonstrate that this approach outperforms individual models in diagnosing heart disease. Our research contributes to the field by applying a well-curated dataset with normalization and optimization techniques, conducting a comprehensive comparative analysis of different machine learning models, and showcasing the superior performance of the soft voting ensemble in medical diagnosis. This multifaceted approach allows us to provide a thorough evaluation of the soft voting ensemble's effectiveness in the context of heart disease prediction. We evaluate our models based on accuracy, precision, recall, F1 score, and Area Under the ROC Curve (AUC). Our results indicate that the soft voting ensemble technique achieves higher accuracy and robustness in heart disease prediction compared to individual classifiers. This study advances the application of machine learning in medical diagnostics, offering a novel approach to improve heart disease prediction. Our findings have significant implications for early detection and management of heart disease, potentially contributing to better patient outcomes and more efficient healthcare resource allocation.
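
Soft voting averages the class probabilities produced by each classifier and predicts the class with the highest average. The sketch below shows only that combination step; the three probability vectors are hypothetical outputs, not results from the paper.

```python
def soft_vote(probabilities, weights=None):
    """Average per-class probabilities from several classifiers.

    Returns (predicted_class_index, averaged_probabilities).
    """
    if weights is None:
        weights = [1.0] * len(probabilities)
    total = sum(weights)
    n_classes = len(probabilities[0])
    averaged = [
        sum(w * p[c] for w, p in zip(weights, probabilities)) / total
        for c in range(n_classes)
    ]
    predicted = max(range(n_classes), key=averaged.__getitem__)
    return predicted, averaged


# Hypothetical outputs for one patient: [P(no disease), P(disease)]
logistic_regression = [0.55, 0.45]
svm_rbf = [0.55, 0.45]
random_forest = [0.10, 0.90]

label, averaged = soft_vote([logistic_regression, svm_rbf, random_forest])
```

In this example two of the three models lean slightly towards "no disease", so a hard majority vote would pick class 0, while soft voting picks class 1 because the third model is far more confident.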

A Comprehensive Approach for Tamil Handwritten Character Recognition with Feature Selection and Ensemble Learning

  • Manoj K;Iyapparaja M
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 18, No. 6 / pp.1540-1561 / 2024
  • This research proposes a novel approach for Tamil Handwritten Character Recognition (THCR) that combines feature selection and ensemble learning techniques. The Tamil script is complex and highly variable, requiring a robust and accurate recognition system. Feature selection is used to reduce dimensionality while preserving discriminative features, improving classification performance and reducing computational complexity. Several feature selection methods are compared, and individual classifiers (support vector machines, neural networks, and decision trees) are evaluated through extensive experiments. Ensemble learning techniques such as bagging and boosting are employed to leverage the strengths of multiple classifiers and enhance recognition accuracy. The proposed approach is evaluated on the HP Labs Dataset, achieving 95.56% accuracy using an ensemble learning framework based on support vector machines. The dataset consists of 82,928 samples with 247 distinct classes, contributed by 500 participants from Tamil Nadu. It includes 40,000 characters with 500 user variations. The results surpass or rival existing methods, demonstrating the effectiveness of the approach. The research also offers insights for developing advanced recognition systems for other complex scripts. Future investigations could explore the integration of deep learning techniques and the extension of the proposed approach to other Indic scripts and languages, advancing the field of handwritten character recognition.

Leave-one-out Bayesian model averaging for probabilistic ensemble forecasting

  • Kim, Yongdai;Kim, Woosung;Ohn, Ilsang;Kim, Young-Oh
    • Communications for Statistical Applications and Methods / Vol. 24, No. 1 / pp.67-80 / 2017
  • Over the last few decades, ensemble forecasts based on global climate models have become an important part of climate forecasting due to their ability to reduce uncertainty in prediction. Moreover, in ensemble forecasting, assessing the prediction uncertainty is as important as estimating the optimal weights, and this is achieved through a probabilistic forecast based on the predictive distribution of future climate. Bayesian model averaging has received much attention as a tool for probabilistic forecasting due to its simplicity and superior prediction. In this paper, we propose a new Bayesian model averaging method for probabilistic ensemble forecasting. The proposed method combines a deterministic ensemble forecast based on a multivariate regression approach with Bayesian model averaging. We demonstrate that the proposed method predicts better than the standard Bayesian model averaging approach by analyzing monthly average precipitation and temperature for ten cities in Korea.
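
For reference, the standard Bayesian model averaging predictive density that such methods build on is usually written as a weighted mixture. The notation here is an assumption, not taken from the paper: $f_k$ is the forecast of ensemble member $k$, $g_k$ is the conditional density attached to it, and $w_k$ is its weight. The leave-one-out variant proposed in the paper is not shown.

```latex
p(y \mid f_1, \dots, f_K) = \sum_{k=1}^{K} w_k \, g_k(y \mid f_k),
\qquad w_k \ge 0, \quad \sum_{k=1}^{K} w_k = 1
```

The spread of this mixture is what provides the uncertainty assessment, while its mean gives a weighted deterministic forecast.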

Anomaly-Based Network Intrusion Detection: An Approach Using Ensemble-Based Machine Learning Algorithm

  • Kashif Gul Chachar;Syed Nadeem Ahsan
    • International Journal of Computer Science & Network Security / Vol. 24, No. 1 / pp.107-118 / 2024
  • With the steady growth of technology, network usage requirements are expanding day by day. The majority of electronic devices are capable of communication, which requires a secure and reliable network. Network-based intrusion detection systems (NIDS) are a means of detecting attacks on computers and networks and raising alerts. Machine learning is an emerging field that provides a variety of ways to implement effective NIDS. Bagging and boosting are two ensemble ML techniques renowned for better performance in the learning and classification process. This paper provides a detailed literature review of past work and proposes a novel ensemble approach to developing a NIDS based on the voting method, using bagging and boosting ensemble techniques. The test results demonstrate that the ensemble of bagging and boosting through voting exhibits the highest classification accuracy of 99.98% and a minimum false positive rate (FPR) on both datasets. The model building time, however, is only average, a cost that can be offset by processor speed.

Gaussian noise addition approaches for ensemble optimal interpolation implementation in a distributed hydrological model

  • Manoj Khaniya;Yasuto Tachikawa;Kodai Yamamoto;Takahiro Sayama;Sunmin Kim
    • Korea Water Resources Association: Conference Proceedings / Korea Water Resources Association 2023 Conference / pp.25-25 / 2023
  • The ensemble optimal interpolation (EnOI) scheme is a sub-optimal alternative to the ensemble Kalman filter (EnKF) with a reduced computational demand, making it potentially more suitable for operational applications. Since only one model is integrated forward instead of an ensemble of model realizations, online estimation of the background error covariance matrix is not possible in the EnOI scheme. In this study, we investigate two Gaussian noise based ensemble generation strategies to produce dynamic covariance matrices for assimilation of water level observations into a distributed hydrological model. In the first approach, spatially correlated noise, sampled from a normal distribution with a fixed fractional error parameter (which controls its standard deviation), is added to the model forecast state vector to prepare the ensembles. In the second method, we use an adaptive error estimation technique based on the innovation diagnostics to estimate this error parameter within the assimilation framework. The results from a real experiment and a set of synthetic experiments indicate that the EnOI scheme can provide better results when an optimal EnKF is not identified, but performs worse than the ensemble filter when the true error characteristics are known. Furthermore, while the adaptive approach is able to reduce the sensitivity to the fractional error parameter affecting the first (non-adaptive) approach, the adaptive approach usually gives worse results at ungauged locations.
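
The first (fixed-parameter) strategy can be sketched as a single analysis step: perturb the one forecast state with spatially correlated Gaussian noise, estimate a covariance from the perturbed copies, and update only the forecast itself. This is a simplified 1-D illustration, not the study's implementation. The exponential correlation model, the noise scale `frac_error * |state|`, and all parameter values are assumptions.

```python
import numpy as np


def enoi_update(x_forecast, obs, obs_index, obs_error_std,
                frac_error=0.1, corr_length=3.0, n_members=50, seed=0):
    """One EnOI analysis step using a noise-generated ensemble (illustrative)."""
    rng = np.random.default_rng(seed)
    n = len(x_forecast)

    # Assumed spatial correlation: exponential decay with cell separation.
    cells = np.arange(n)
    corr = np.exp(-np.abs(cells[:, None] - cells[None, :]) / corr_length)
    chol = np.linalg.cholesky(corr)

    # Noise standard deviation is a fixed fraction of the forecast state.
    sigma = frac_error * np.abs(x_forecast)
    noise = (chol @ rng.standard_normal((n, n_members))) * sigma[:, None]
    ensemble = x_forecast[:, None] + noise

    # Background error covariance estimated from the ensemble anomalies.
    anomalies = ensemble - ensemble.mean(axis=1, keepdims=True)
    p = anomalies @ anomalies.T / (n_members - 1)

    # Observation operator selects the gauged cells.
    h = np.zeros((len(obs_index), n))
    h[np.arange(len(obs_index)), obs_index] = 1.0
    r = np.eye(len(obs_index)) * obs_error_std ** 2

    gain = p @ h.T @ np.linalg.inv(h @ p @ h.T + r)
    # Only the single forecast state is updated, unlike in the EnKF.
    return x_forecast + gain @ (obs - h @ x_forecast)


# Hypothetical water levels (m) along six cells, with one gauge at cell 2
x_forecast = np.array([2.0, 2.2, 2.5, 2.4, 2.1, 1.9])
x_analysis = enoi_update(x_forecast, obs=np.array([3.0]),
                         obs_index=[2], obs_error_std=0.1)
```

The analysis at the gauged cell moves from the forecast towards the observation, and the correlated noise lets neighbouring ungauged cells be adjusted as well.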

An ensemble learning based Bayesian model updating approach for structural damage identification

  • Guangwei Lin;Yi Zhang;Enjian Cai;Taisen Zhao;Zhaoyan Li
    • Smart Structures and Systems / Vol. 32, No. 1 / pp.61-81 / 2023
  • This study presents an ensemble learning based Bayesian model updating approach for structural damage diagnosis. In the developed framework, the structure is initially decomposed into a set of substructures. The autoregressive moving average with exogenous inputs (ARMAX) model is established first for structural damage localization based on the structural equation of motion. Wavelet packet decomposition is utilized to extract the damage-sensitive node energy in different frequency bands for constructing structural surrogate models. Four methods, including the Kriging predictor (KRG), radial basis function neural network (RBFNN), support vector regression (SVR), and multivariate adaptive regression splines (MARS), are selected as candidate structural surrogate models. These models are then resampled by bootstrapping and combined into an ensemble model by probabilistic ensemble. Meanwhile, the maximum entropy principle is adopted to search for new design points for sample space updating, yielding a more robust ensemble model. Through the iterations, a framework of surrogate ensemble learning based model updating with high model construction efficiency and accuracy is proposed. The specificities of the method are discussed and investigated in a case study.

A Feature Selection-based Ensemble Method for Arrhythmia Classification

  • Namsrai, Erdenetuya;Munkhdalai, Tsendsuren;Li, Meijing;Shin, Jung-Hoon;Namsrai, Oyun-Erdene;Ryu, Keun Ho
    • Journal of Information Processing Systems / Vol. 9, No. 1 / pp.31-40 / 2013
  • In this paper, a novel method is proposed to build an ensemble of classifiers by using a feature selection schema. The feature selection schema identifies the best feature sets that affect the arrhythmia classification. Firstly, a number of feature subsets are extracted by applying the feature selection schema to the original dataset. Then classification models are built using each feature subset. Finally, we combine the classification models by adopting a voting approach to form a classification ensemble. The voting approach in our method involves both the classification error rate and the feature selection rate to calculate the score of each classifier in the ensemble. In our method, the feature selection rate depends on the order in which the feature subsets are extracted. In the experiment, we applied our method to an arrhythmia dataset and generated the top three disjoint feature sets. We then built three classifiers based on the top three feature subsets and formed the classifier ensemble by using the voting approach. Our method can improve the classification accuracy on high-dimensional datasets. The performance of each classifier and the performance of their ensemble were higher than the performance of the classifier based on the whole feature space of the dataset. The classification performance was improved, and a more stable classification model could be constructed with the proposed approach.