• Title/Summary/Keyword: Ensemble Approach

Projecting the spatial-temporal trends of extreme climatology in South Korea based on optimal multi-model ensemble members

  • Mirza Junaid Ahmad;Kyung-sook Choi
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2023.05a
    • /
    • pp.314-314
    • /
    • 2023
  • Extreme climate events can have a large impact on human life by hampering social, environmental, and economic development. Global circulation models (GCMs) are widely used numerical models for understanding anticipated future climate change. However, different GCMs can project different future climates because of structural differences, varying initial boundary conditions, and assumptions about the physical phenomena. The multi-model ensemble (MME) approach can reduce the uncertainties associated with the different GCM outcomes. In this study, a comprehensive rating metric was used to select the best-performing GCMs out of 11 CMIP5 and 13 CMIP6 GCMs, according to their skill, measured by four temporal and five spatial performance indices, in replicating 21 extreme climate indices during the baseline period (1975-2017) in South Korea. The MME data were derived by averaging the simulations from all selected GCMs and from the three top-ranked GCMs. The random forest (RF) algorithm was also used to derive the MME data from the three top-ranked GCMs. The RF-derived MME data of the three top-ranked GCMs showed the highest performance in simulating the baseline extreme climate and were subsequently used to project the future extreme climate indices under both the representative concentration pathway (RCP) and the shared socioeconomic pathway (SSP) scenarios. The extreme cold and warming indices had declining and increasing trends, respectively, and most extreme precipitation indices had increasing trends over the period 2031-2100. Compared with the other scenarios, RCP8.5 showed drastic changes in future extreme climate indices. The coasts in the east, south, and west had stronger warming than the rest of the country, while mountain areas in the north experienced more extreme cold. While extreme cold climatology gradually declined from north to south, extreme warming climatology continuously grew from coastal to inland and northern mountainous regions. The results showed that the socially, environmentally, and agriculturally important regions of South Korea were at increased risk of facing the detrimental impacts of extreme climatology.
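
The simple-average variant of the MME described in this abstract can be sketched as follows. This is a minimal illustration only: the model names, skill scores, and yearly index values are invented, the single score per model stands in for the paper's comprehensive rating metric, and the RF-derived ensemble is not reproduced here.

```python
from statistics import mean

def rank_models(scores):
    """Return model names sorted from best to worst (higher score = better)."""
    return sorted(scores, key=scores.get, reverse=True)

def ensemble_mean(simulations, members):
    """Average the selected members' series time step by time step."""
    series = [simulations[m] for m in members]
    return [mean(values) for values in zip(*series)]

# Hypothetical skill scores and yearly index values for four made-up models.
scores = {"gcm_a": 0.82, "gcm_b": 0.64, "gcm_c": 0.91, "gcm_d": 0.77}
simulations = {
    "gcm_a": [10.0, 12.0, 14.0],
    "gcm_b": [20.0, 22.0, 24.0],
    "gcm_c": [11.0, 13.0, 15.0],
    "gcm_d": [12.0, 14.0, 16.0],
}

top3 = rank_models(scores)[:3]                           # three top-ranked members
mme_top3 = ensemble_mean(simulations, top3)              # average of the top three
mme_all = ensemble_mean(simulations, list(simulations))  # average of all members
```

In this toy data the poorly ranked gcm_b pulls the all-member average upward, which is the kind of effect that member selection is meant to limit.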

Infrastructure Anomaly Analysis for Data-center Failure Prevention: Based on RRCF and Prophet Ensemble Analysis (데이터센터 장애 예방을 위한 인프라 이상징후 분석: RRCF와 Prophet Ensemble 분석 기반)

  • Hyun-Jong Kim;Sung-Keun Kim;Byoung-Whan Chun;Kyong-Bog, Jin;Seung-Jeong Yang
    • The Journal of Bigdata
    • /
    • v.7 no.1
    • /
    • pp.113-124
    • /
    • 2022
  • Various methods using machine learning and big data have been applied to prevent failures in data centers. However, approaches that rely on the performance indicators of individual equipment, or that do not consider the infrastructure operating environment, have many limitations in practical use. In this study, the performance indicators of individual infrastructure equipment are monitored in an integrated manner, and the indicators of the various equipment are segmented and graded to produce a single numerical value. Data pre-processing was based on experience in infrastructure operation. An ensemble of RRCF (Robust Random Cut Forest) analysis and a Prophet analysis model led to reliable results in detecting anomalies. A failure analysis system was implemented for easy use by data center operators. It can support a preemptive response to data center failures and indicate an appropriate tuning time.

A generalized explainable approach to predict the hardened properties of self-compacting geopolymer concrete using machine learning techniques

  • Endow Ayar Mazumder;Sanjog Chhetri Sapkota;Sourav Das;Prasenjit Saha;Pijush Samui
    • Computers and Concrete
    • /
    • v.34 no.3
    • /
    • pp.279-296
    • /
    • 2024
  • In this study, ensemble machine learning (ML) models are employed to estimate the hardened properties of Self-Compacting Geopolymer Concrete (SCGC). The input variables affecting model development include the content of the SCGC, such as the binder material, the age of the specimen, and the ratio of alkaline solution. The output parameters examined include compressive strength, flexural strength, and split tensile strength. The ensemble machine learning models are trained and validated using a database comprising 396 records compiled from 132 unique mix trials performed in the laboratory. Diverse machine learning techniques, notably K-nearest neighbours (KNN), Random Forest, and Extreme Gradient Boosting (XGBoost), are used to construct the models, coupled with Bayesian optimisation (BO) for hyperparameter tuning. Furthermore, nested cross-validation is applied to mitigate the risk of overfitting. The findings of this study reveal that the BO-XGBoost hybrid model achieves better predictive accuracy than the other models. The R² values for compressive strength, flexural strength, and split tensile strength are 0.9974, 0.9978, and 0.9937, respectively. Additionally, the BO-XGBoost hybrid model exhibits the lowest RMSE values of 0.8712, 0.0773, and 0.0799 for compressive strength, flexural strength, and split tensile strength, respectively. Furthermore, a SHAP dependency analysis was conducted to ascertain the significance of each parameter. The study shows that GGBS, fly ash, and the age of the specimens have a substantial influence on the predicted strengths of geopolymers.
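
For readers unfamiliar with the two scores quoted in this abstract, the sketch below shows how R² and RMSE are conventionally computed. The measured and predicted strengths are made-up values for illustration, not data from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error: typical size of the prediction error."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 minus residual over total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical compressive strengths in MPa (illustrative only).
measured = [30.0, 40.0, 50.0, 60.0]
predicted = [31.0, 39.0, 51.0, 59.0]

error = rmse(measured, predicted)
fit = r_squared(measured, predicted)
```

RMSE is expressed in the units of the target, so values for different strength types are not directly comparable, whereas R² is unitless.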

An Ensemble Classifier Based Method to Select Optimal Image Features for License Plate Recognition (차량 번호판 인식을 위한 앙상블 학습기 기반의 최적 특징 선택 방법)

  • Jo, Jae-Ho;Kang, Dong-Joong
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.65 no.1
    • /
    • pp.142-149
    • /
    • 2016
  • This paper proposes a method to detect the license plates (LPs) of vehicles in indoor and outdoor parking lots. Many conventional methods exist for detecting LPs in restricted environments. However, it is difficult to detect LPs in natural and complex scenes with background clutter, because complicated backgrounds always contain several patterns similar to text or LPs. To verify the performance of LP text detection in natural images, we apply the MB-LGP feature in combination with an ensemble machine learning algorithm in order to select a small number of optimal features from a huge pool. The feature selection is performed by an adaptive boosting algorithm, which shows strong performance in minimizing the false positive detection ratio and the computing time when combined with a cascade approach. MSER is used to provide the initial text regions of the vehicle LP. In experiments using real images, the proposed method robustly extracts LPs in natural scenes as well as in controlled environments.

Outlier detection of main engine data of a ship using ensemble method (앙상블 기법을 이용한 선박 메인엔진 빅데이터의 이상치 탐지)

  • KIM, Dong-Hyun;LEE, Ji-Hwan;LEE, Sang-Bong;JUNG, Bong-Kyu
    • Journal of the Korean Society of Fisheries and Ocean Technology
    • /
    • v.56 no.4
    • /
    • pp.384-394
    • /
    • 2020
  • This paper proposes a machine learning based outlier detection model that can diagnose the presence or absence of abnormalities in major engine parts through unsupervised learning analysis of a ship's main engine big data. The engine data were collected for more than seven months, and expert knowledge and correlation analysis were used to select features that are closely related to the operation of the main engine. For the unsupervised learning analysis, an ensemble model, in which many predictive models are strategically combined to increase performance, is used for anomaly detection. As a result, the proposed model successfully distinguished anomalous engine status from normal status. To validate our approach, clustering analysis was conducted on the anomalous points to find the different patterns of anomalies. By examining the distribution of each cluster, we could successfully identify these patterns.

Investigating Regions Vulnerable to Recurring Landslide Damage Using Time Series-Based Susceptibility Analysis: Case Study for Jeolla Region, Republic of Korea

  • Ho Gul Kim
    • Journal of Forest and Environmental Science
    • /
    • v.39 no.4
    • /
    • pp.213-224
    • /
    • 2023
  • As abnormal weather events due to climate change continue to rise, landslide damage is also increasing. Given the substantial time and financial resources required for post-landslide recovery, it becomes imperative to formulate a proactive response plan. In this regard, landslide susceptibility analysis has emerged as a valuable tool for establishing preemptive measures against landslides. Accordingly, this study conducted an annual landslide susceptibility analysis using the history of landslides that occurred over many years in the Jeolla region and identified areas with a high potential for landslides. The analysis employed an ensemble model that amalgamated 10 data-based models, aiming to mitigate the uncertainties associated with a single-model approach. Furthermore, based on the cumulative data on landslide-susceptible areas, this research identified regions vulnerable to recurring landslide damage in the Jeolla region and proposed specific strategies for using this information at various levels, including local government initiatives, adaptation plan development, and development approval processes. In particular, this study outlined approaches for local government utilization, the determination of adaptation plan types, and considerations for development permits. It is anticipated that this research will serve as a valuable opportunity to underscore the significance of information concerning regions vulnerable to recurring landslide damage.

Preemptive Failure Detection using Contamination-Based Stacking Ensemble in Missiles

  • Seong-Mok Kim;Ye-Eun Jeong;Yong Soo Kim;Youn-Ho Lee;Seung Young Lee
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.5
    • /
    • pp.1301-1316
    • /
    • 2024
  • In modern warfare, missiles play a pivotal role but typically spend the majority of their lifecycle in long-term storage or standby mode, making it difficult to detect failures. Preemptive detection of missiles that will fail is crucial to preventing severe consequences, including safety hazards and mission failures. This study proposes a contamination-based stacking ensemble model, employing the local outlier factor (LOF), to detect such missiles. The proposed model creates multiple base LOF models with different contamination values and combines their anomaly scores to achieve robust anomaly detection. A comparative performance analysis was conducted between the proposed model and the traditional single LOF model, using production-related inspection data from missiles deployed in the military. The experimental results showed that, with the contamination parameter set to 0.1, the proposed model exhibited an increase of approximately 22 percentage points in accuracy and 71 percentage points in F1-score compared to the single LOF model. This approach enables the preemptive identification of potential failures that are undetectable through traditional statistical quality control methods. Consequently, it contributes to lower missile failure rates in real battlefield scenarios, leading to significant time and cost savings in the military industry.
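
The idea of combining LOF detectors that differ only in their contamination value can be sketched in plain Python as below. This is an assumption-laden illustration, not the paper's model: the abstract does not say how the base scores are stacked, so a simple vote fraction is used here, and the points, `k`, and contamination values are invented.

```python
import math

def lof_scores(points, k):
    """Local outlier factor per point; values well above 1 suggest outliers.
    Assumes no duplicate points and ignores distance ties for brevity."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    neighbours = [
        sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])[:k]
        for i in range(n)
    ]
    k_dist = [dist[i][neighbours[i][-1]] for i in range(n)]

    def local_density(i):
        # Inverse of the mean reachability distance to the k nearest neighbours.
        reach = [max(k_dist[j], dist[i][j]) for j in neighbours[i]]
        return len(reach) / sum(reach)

    density = [local_density(i) for i in range(n)]
    return [sum(density[j] for j in neighbours[i]) / (k * density[i]) for i in range(n)]

def flag_top_fraction(scores, contamination):
    """Flag roughly the `contamination` share of points with the highest scores."""
    n_flag = max(1, round(contamination * len(scores)))
    threshold = sorted(scores, reverse=True)[n_flag - 1]
    return [s >= threshold for s in scores]

def contamination_ensemble(points, k, contaminations):
    """Fraction of base detectors (one per contamination value) flagging each point."""
    scores = lof_scores(points, k)
    votes = [flag_top_fraction(scores, c) for c in contaminations]
    return [sum(column) / len(contaminations) for column in zip(*votes)]

# Hypothetical 2-D inspection measurements; the last point lies far from the rest.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5),
          (0.2, 0.8), (0.8, 0.3), (0.4, 0.6), (0.6, 0.1), (5.0, 5.0)]
fractions = contamination_ensemble(points, k=3, contaminations=[0.1, 0.2, 0.3])
```

A point flagged at every contamination level (fraction 1.0) is a stronger candidate than one flagged only under the loosest setting, which is one way an ensemble can be less sensitive to the choice of a single contamination value.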

Streamflow Forecast Model on Nakdong River Basin (낙동강유역 하천유량 예측모형 구축)

  • Lee, Byong-Ju;Bae, Deg-Hyo
    • Journal of Korea Water Resources Association
    • /
    • v.44 no.11
    • /
    • pp.853-861
    • /
    • 2011
  • The objective of this study is to assess the Sejong University River Forecast (SURF) model, which consists of a continuous rainfall-runoff model and assimilation of measured streamflow using the ensemble Kalman filter technique, for streamflow forecasting in the Nakdong River basin. The study area is divided into 43 subbasins. The forecasted streamflows are evaluated at 12 measurement sites during the flood seasons from 2006 to 2007. The forecasts are improved by assimilating the measured streamflows. In terms of effectiveness indices for forecast lead times of 1 to 5 h, the accuracy of the forecasted streamflows with the assimilation approach is improved by 46.2% to 30.1% compared with that using only the rainfall-runoff model. When 50% of the measured rainfall is used as input, the mean normalized absolute error of the forecasted peak flow shows that accuracy with data assimilation is about 40% better than without it. These results indicate that the SURF model can be used as a real-time river forecast model.
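
The core of ensemble Kalman filter assimilation can be illustrated with a single directly observed scalar state. The sketch below is a generic stochastic EnKF analysis step under that simplifying assumption; the SURF model's actual state vector, observation operator, and error settings are not given in the abstract, and all numbers are invented.

```python
import random
from statistics import mean, variance

def enkf_update(ensemble, observation, obs_variance, rng):
    """Stochastic EnKF analysis step for a directly observed scalar state."""
    forecast_variance = variance(ensemble)  # sample spread of the forecast ensemble
    gain = forecast_variance / (forecast_variance + obs_variance)  # Kalman gain
    obs_std = obs_variance ** 0.5
    # Each member assimilates its own perturbed copy of the observation.
    return [
        x + gain * (observation + rng.gauss(0.0, obs_std) - x)
        for x in ensemble
    ]

rng = random.Random(42)
# Hypothetical forecast streamflows in m^3/s from a rainfall-runoff model.
forecast = [rng.gauss(100.0, 10.0) for _ in range(200)]
analysis = enkf_update(forecast, observation=120.0, obs_variance=4.0, rng=rng)

forecast_mean, analysis_mean = mean(forecast), mean(analysis)
```

Because the forecast spread here is much larger than the observation error, the gain is close to 1 and the analysis ensemble moves most of the way toward the measured flow while its spread shrinks.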

Credit Card Bad Debt Prediction Model based on Support Vector Machine (신용카드 대손회원 예측을 위한 SVM 모형)

  • Kim, Jin Woo;Jhee, Won Chul
    • Journal of Information Technology Services
    • /
    • v.11 no.4
    • /
    • pp.233-250
    • /
    • 2012
  • In this paper, credit card delinquency means the possibility that bad debt will occur in the near future in normal accounts that currently have no debt, and the problem is to predict, on a monthly basis, the occurrence of delinquency 3 months in advance. This prediction is a typical binary classification problem but suffers from data imbalance, meaning that instances of the target class are very few. For effective prediction of bad debt occurrence, a Support Vector Machine (SVM) with the kernel trick is adopted, using credit card usage and payment patterns as its inputs. SVM is widely accepted in the data mining community because of its prediction accuracy and low risk of overfitting. However, SVM is known to be limited in its ability to process large-scale data. To resolve the difficulties in applying SVM to bad debt occurrence prediction, two-stage clustering is suggested as an effective data reduction method, and ensembles of SVM models are adopted to mitigate the difficulty caused by the data imbalance intrinsic to the target problem. In experiments with real-world data from one of the major domestic credit card companies, the suggested approach shows prediction accuracy superior to traditional data mining approaches that use neural networks, decision trees, or logistic regression. The SVM ensemble model learned from the T2 training set shows the best prediction results among the alternatives considered, and it is noteworthy that the performance of neural networks with T2 is better than that of SVM with T1. These results indicate that the suggested approach is very effective both for SVM training and for classification under data imbalance.
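
One common way to build an ensemble for imbalanced data is to train each member on a balanced subset. The sketch below shows only that subset construction; it is an assumption for illustration, since the abstract does not describe how the SVM ensemble or the two-stage clustering is actually set up, and the account IDs are invented.

```python
import random

def balanced_subsets(majority, minority, rng):
    """Split the majority class into chunks about the size of the minority class
    and pair each chunk with every minority instance."""
    shuffled = majority[:]
    rng.shuffle(shuffled)
    size = len(minority)
    chunks = [shuffled[i:i + size] for i in range(0, len(shuffled), size)]
    return [(chunk, minority) for chunk in chunks]

rng = random.Random(0)
normal_accounts = list(range(100))            # hypothetical majority-class IDs
bad_debt_accounts = list(range(1000, 1010))   # hypothetical minority-class IDs

subsets = balanced_subsets(normal_accounts, bad_debt_accounts, rng)
```

Each pair in `subsets` would train one base classifier, and the members' predictions would then be combined, for example by voting. Every majority instance is used exactly once while each member still sees a 1:1 class ratio.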

Modeling and Selecting Optimal Features for Machine Learning Based Detections of Android Malwares (머신러닝 기반 안드로이드 모바일 악성 앱의 최적 특징점 선정 및 모델링 방안 제안)

  • Lee, Kye Woong;Oh, Seung Taek;Yoon, Young
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.8 no.11
    • /
    • pp.427-432
    • /
    • 2019
  • In this paper, we propose three approaches to modeling Android malware. The first approach involves human security experts meticulously selecting feature sets. In the second approach, we choose the 300 features with the highest importance among the top 99% of features in terms of occurrence rate. The third approach is to combine multiple models and identify malware through weighted voting. In addition, we applied a novel method of eliminating permission information, which has been regarded as a critical factor for distinguishing malware. With our carefully generated feature sets and the weighted voting by the ensemble algorithm, we were able to reach the highest malware detection accuracy of 97.8%. We also verified that discarding the permission information led to improvements in the false positive and false negative rates.
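
The weighted voting mentioned in the third approach can be sketched as follows. The model names, weights, and decision rule (more than half of the total weight) are assumptions for illustration; the abstract does not state how the weights were chosen.

```python
def weighted_vote(predictions, weights):
    """Return 1 (malware) if models voting 'malware' hold more than half the weight."""
    total = sum(weights.values())
    malware_weight = sum(weights[name] for name, label in predictions.items() if label == 1)
    return 1 if malware_weight > total / 2 else 0

# Hypothetical weights, e.g. derived from each model's validation performance.
weights = {"model_a": 0.95, "model_b": 0.4, "model_c": 0.4}

# One strong model can outvote two weaker ones, unlike plain majority voting.
strong_only = weighted_vote({"model_a": 1, "model_b": 0, "model_c": 0}, weights)
weak_pair = weighted_vote({"model_a": 0, "model_b": 1, "model_c": 1}, weights)
```

With equal weights the rule reduces to ordinary majority voting, so the weights are what allow a more reliable model to dominate the decision.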