• Title/Summary/Keyword: Model Ensemble

Applying Ensemble Model for Identifying Uncertainty in the Species Distribution Models (종분포모형의 불확실성 확인을 위한 앙상블모형 적용)

  • Kwon, Hyuk Soo
    • Journal of Korean Society for Geospatial Information Science / v.22 no.4 / pp.47-52 / 2014
  • Species distribution models (SDMs) have been widely applied to assess biodiversity, design reserves, manage habitats, and predict the effects of climate change. However, their use in the public and policy sectors has been limited owing to model uncertainty. Recent studies have increasingly turned to ensemble and consensus models to reduce this uncertainty. This paper builds single models and a multi-model ensemble for Corylopsis coreana and compares the two approaches. First, the models were evaluated using AUC, kappa, and TSS. TSS was the most effective measure because it made it easy to compare several models and to convert predictions into binary maps. Second, both the single models and the ensemble model performed well; RF, Maxent, and GBM were rated higher, while GAM and SRE were rated relatively lower. Third, the ensemble model tended to overestimate relative to the single models. This problem can be addressed through suitable model selection and weighting, in collaboration between field experts and modelers. Finally, we should identify the causes and magnitude of model uncertainty and improve data quality and modeling methods in order to apply SDMs to special decision-making support systems and conservation planning; when making policy decisions based on SDMs, the uncertainty and risk should be recognized.
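
As a minimal sketch (not the paper's code), two of the evaluation measures named in this abstract, TSS and Cohen's kappa, can be computed from a binary confusion matrix as shown below. The `binarize` step illustrates how a continuous suitability score is turned into a presence/absence map. All function names and the threshold are illustrative assumptions.

```python
def binarize(scores, threshold):
    """Convert continuous suitability scores to presence (1) / absence (0)."""
    return [1 if s >= threshold else 0 for s in scores]


def confusion_counts(observed, predicted):
    """Return (tp, fp, fn, tn) for binary observed and predicted labels."""
    pairs = list(zip(observed, predicted))
    tp = sum(1 for o, p in pairs if o == 1 and p == 1)
    fp = sum(1 for o, p in pairs if o == 0 and p == 1)
    fn = sum(1 for o, p in pairs if o == 1 and p == 0)
    tn = sum(1 for o, p in pairs if o == 0 and p == 0)
    return tp, fp, fn, tn


def tss(tp, fp, fn, tn):
    """True Skill Statistic: sensitivity + specificity - 1."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1


def cohen_kappa(tp, fp, fn, tn):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed_agreement = (tp + tn) / n
    expected_agreement = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    return (observed_agreement - expected_agreement) / (1 - expected_agreement)
```

TSS ranges from -1 to 1, so values from different models can be compared on one scale once each model's output has been thresholded.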

Scoring models to detect foreign exchange money laundering (외국환 거래의 자금세탁 혐의도 점수모형 개발에 관한 연구)

  • Hong, Seong-Ik;Moon, Tae-Hee;Sohn, So-Young
    • IE interfaces / v.18 no.3 / pp.268-276 / 2005
  • In recent years, money laundering crimes carried out through foreign exchange transactions have been increasing. Our study proposes four scoring models that provide early warning of laundering in foreign exchange transactions, for both inward and outward remittances: a logistic regression model, a decision tree, a neural network, and an ensemble model that combines the three. In terms of accuracy on the test data, the decision tree model is selected for inward remittances and the ensemble model for outward remittances. Our results show that the accumulated number of transactions is the most important predictor variable. The proposed scoring models operate at the transaction level and are expected to help bank tellers detect laundering-related transactions at an early stage.
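
The abstract does not say how the three models are combined. One common option, shown here purely as an assumption, is to average the suspicion scores of the individual models and flag a transaction when the average crosses a threshold. The model names, scores, and the 0.5 threshold are hypothetical.

```python
from statistics import mean


def ensemble_score(model_scores):
    """Average the per-model suspicion scores (each assumed to lie in [0, 1])."""
    return mean(model_scores.values())


def flag_transaction(model_scores, threshold=0.5):
    """Flag a transaction when the averaged score reaches the threshold."""
    return ensemble_score(model_scores) >= threshold


# Hypothetical scores for one remittance from the three base models.
scores = {"logistic_regression": 0.2, "decision_tree": 0.9, "neural_network": 0.7}
flagged = flag_transaction(scores)
```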

A Study on Korean Sentiment Analysis Rate Using Neural Network and Ensemble Combination

  • Sim, YuJeong;Moon, Seok-Jae;Lee, Jong-Youg
    • International Journal of Advanced Culture Technology / v.9 no.4 / pp.268-273 / 2021
  • In this paper, we propose a sentiment analysis model that improves performance on small-scale data and verify it through experiments. To this end, we propose Bagging-Bi-GRU, which combines Bi-GRU with bagging, one of the ensemble learning methods. Bi-GRU trains a GRU in both directions; the GRU is a variant of LSTM (Long Short-Term Memory), which performs well on sequential data. To verify the performance of the proposed model, it is applied to both small-scale and large-scale data. A comparison with an existing machine learning algorithm, Bi-GRU, shows that the proposed model improves performance not only on small data but also on large data.
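
The bagging step can be sketched independently of the neural network. In the example below, a toy one-dimensional threshold classifier stands in for the Bi-GRU base learner (an assumption made only to keep the sketch self-contained); each base learner is trained on a bootstrap resample, and predictions are combined by majority vote.

```python
import random
from collections import Counter


def bootstrap_sample(data, rng):
    """Draw len(data) examples with replacement."""
    return [rng.choice(data) for _ in data]


def train_stump(sample):
    """Toy base learner: threshold halfway between the two class means.

    Assumes the positive class has the larger feature values.
    """
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    if not pos or not neg:
        # The resample contains a single class: always predict that class.
        only = 1 if pos else 0
        return lambda x: only
    threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x > threshold else 0


def train_bagging(data, n_models, seed=0):
    """Train n_models base learners, each on its own bootstrap resample."""
    rng = random.Random(seed)
    return [train_stump(bootstrap_sample(data, rng)) for _ in range(n_models)]


def predict_bagging(models, x):
    """Combine the base learners by majority vote."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]
```

Using an odd number of base learners avoids tied votes in the binary case.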

Intrusion Detection using Attribute Subset Selector Bagging (ASUB) to Handle Imbalance and Noise

  • Priya, A.Sagaya;Kumar, S.Britto Ramesh
    • International Journal of Computer Science & Network Security / v.22 no.5 / pp.97-102 / 2022
  • Network intrusion detection is becoming increasingly necessary for organizations and individuals alike. Detecting intrusions is a major component of preventing information compromise. Automated systems have been adopted because of the large volume of data in this domain. The major challenge for automated models is the noise and data imbalance present in network transactions. This work proposes an ensemble model, Attribute Subset Selector Bagging (ASUB), that can effectively handle noise and data imbalance. The proposed model creates bags based on attribute subsets, which reduces the influence of noise. The constructed bagging model is heterogeneous, which leads to effective imbalance handling. Experiments were conducted on the standard intrusion detection datasets KDD CUP 99, Kyoto 2006, and NSL KDD. The results show that the model performs well on these datasets.
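
The abstract does not describe how ASUB selects attributes, so the following is only a generic sketch of attribute-subset bag creation: each bag pairs a bootstrap resample of the rows with a subset of the attribute columns. Choosing the subset at random is an assumption, and the function name is hypothetical.

```python
import random


def make_attribute_bags(rows, labels, n_bags, subset_size, rng):
    """Build bags that each see a row resample and a subset of attributes.

    Returns a list of (attribute_indices, bag_rows, bag_labels) tuples.
    """
    n_attributes = len(rows[0])
    bags = []
    for _ in range(n_bags):
        # Pick distinct attribute columns for this bag (random choice is an assumption).
        attrs = sorted(rng.sample(range(n_attributes), subset_size))
        # Resample row indices with replacement.
        idx = [rng.randrange(len(rows)) for _ in rows]
        bag_rows = [[rows[i][a] for a in attrs] for i in idx]
        bag_labels = [labels[i] for i in idx]
        bags.append((attrs, bag_rows, bag_labels))
    return bags
```

A separate classifier would then be trained on each bag, and at prediction time each classifier receives only the columns listed in its own `attrs`.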

Development of Multisite Spatio-Temporal Downscaling Model for Rainfall Using GCM Multi Model Ensemble (다중 기상모델 앙상블을 활용한 다지점 강우시나리오 상세화 기법 개발)

  • Kim, Tae-Jeong;Kim, Ki-Young;Kwon, Hyun-Han
    • KSCE Journal of Civil and Environmental Engineering Research / v.35 no.2 / pp.327-340 / 2015
  • General Circulation Models (GCMs) are the basic tool used for modelling climate. However, because of the spatio-temporal discrepancy between GCM output and observed values, the model output generally requires calibration before it can be used in applied studies. This is generally done with a Multi-Model Ensemble (MME) approach. Stochastic downscaling methods have been used extensively to generate long-term weather sequences from finite observed records. The primary objective of this study is to develop a forecasting scheme that can make use of an MME of different GCMs. This study employed a Nonstationary Hidden Markov Chain Model (NHMM) as the main tool for downscaling seasonal ensemble forecasts over a 3-month period into daily forecasts. Our results showed that the proposed downscaling scheme can provide skillful forecasts as inputs for hydrologic modeling, which in turn may improve water resources management. An application to the Nakdong watershed in South Korea illustrates how the proposed approach can lead to potentially reliable information for water resources management.

Deep Neural Network Based Prediction of Daily Spectators for Korean Baseball League : Focused on Gwangju-KIA Champions Field (Deep Neural Network 기반 프로야구 일일 관중 수 예측 : 광주-기아 챔피언스 필드를 중심으로)

  • Park, Dong Ju;Kim, Byeong Woo;Jeong, Young-Seon;Ahn, Chang Wook
    • Smart Media Journal / v.7 no.1 / pp.16-23 / 2018
  • In this paper, we used a Deep Neural Network (DNN) to predict the number of daily spectators at Gwangju-KIA Champions Field, in order to provide marketing data for the team and related businesses and to support inventory management for the stadium's facilities. The DNN model, which is based on an artificial neural network (ANN), was used, and four kinds of DNN models were designed, with dropout and batch normalization applied to prevent overfitting. Each of the four models consists of 10 DNNs, and we added further models that apply an ensemble. Each model was evaluated by Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE). For training, 80% of the data collected from 2008 to 2017 was randomly selected, and the remaining 20% was used as test data. From 100 repetitions of data selection, model configuration, training, and prediction, we concluded that the DNN model with the ensemble has the best predictive power, with RMSE and MAPE performance 15.17% and 14.34% higher, respectively, than that of the multiple linear regression model.
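
For reference, the two error measures used in this abstract follow their standard definitions, sketched below. The function names and the attendance figures in the usage lines are illustrative only, not data from the paper.

```python
import math


def rmse(actual, predicted):
    """Root Mean Square Error: square root of the mean squared difference."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent.

    Undefined when any actual value is zero.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)


# Illustrative attendance counts, not taken from the paper.
actual = [10000, 20000]
predicted = [11000, 18000]
errors = (rmse(actual, predicted), mape(actual, predicted))
```

Lower values are better for both measures; RMSE is in the units of the target (spectators), while MAPE is scale-free.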

Ensemble Downscaling of Soil Moisture Data Using BMA and ATPRK

  • Youn, Youjeong;Kim, Kwangjin;Chung, Chu-Yong;Park, No-Wook;Lee, Yangwon
    • Korean Journal of Remote Sensing / v.36 no.4 / pp.587-607 / 2020
  • Soil moisture is essential information for meteorological and hydrological analyses. To date, many efforts have been made toward two goals for soil moisture data, improved accuracy and improved resolution, which are very challenging to achieve. We present an ensemble downscaling method that improves the quality of gridded soil moisture data in terms of both accuracy and spatial resolution by integrating BMA (Bayesian model averaging) and ATPRK (area-to-point regression kriging). In the experiments, the BMA ensemble showed 22% better accuracy in terms of RMSE (root mean square error) than the data sets from ESA CCI (European Space Agency-Climate Change Initiative), ERA5 (ECMWF Reanalysis 5), and GLDAS (Global Land Data Assimilation System). In addition, the ATPRK downscaling enhanced the spatial resolution from 0.25° to 0.05° while preserving the improved accuracy and the spatial pattern of the BMA ensemble, without under- or over-estimation. The quality-improved data sets can contribute to a variety of local and regional applications related to soil moisture, such as agriculture, forestry, hydrology, and meteorology. Because the ensemble downscaling method can be applied to other land surface variables such as temperature, humidity, precipitation, and evapotranspiration, it can be a viable option for improving the accuracy and spatial resolution of satellite images and numerical models.

An Efficient Deep Learning Ensemble Using a Distribution of Label Embedding

  • Park, Saerom
    • Journal of the Korea Society of Computer and Information / v.26 no.1 / pp.27-35 / 2021
  • In this paper, we propose a new stacking ensemble framework for deep learning models that reflects the distribution of label embeddings. Our ensemble framework consists of two phases: training the baseline deep learning classifier, and training sub-classifiers based on the clustering results of the label embeddings. The framework aims to divide a multi-class classification problem into small sub-problems based on the clustering results. The clustering is conducted on the label embeddings obtained from the weights of the last layer of the baseline classifier. After clustering, sub-classifiers are constructed to classify the sub-classes in each cluster. From the experimental results, we found that the label embeddings reflect the relationships between classification labels well, and that our ensemble framework can improve classification performance on the CIFAR-100 dataset.

Developing efficient model updating approaches for different structural complexity - an ensemble learning and uncertainty quantifications

  • Lin, Guangwei;Zhang, Yi;Liao, Qinzhuo
    • Smart Structures and Systems / v.29 no.2 / pp.321-336 / 2022
  • Model uncertainty is a key factor that can influence the accuracy and reliability of numerical model-based analysis. An appropriate updating approach is needed to search for and determine realistic model parameter values from measurements. In this paper, Bayesian model updating theory, combined with the transitional Markov chain Monte Carlo (TMCMC) method and K-means cluster analysis, is used to update structural model parameters. Kriging and polynomial chaos expansion (PCE) are employed to generate surrogate models that reduce the computational burden of TMCMC. The selected updating approaches are applied to three structural examples of different complexity: a two-storey frame, a ten-storey frame, and the national stadium model. These stand for a low-dimensional linear model, a high-dimensional linear model, and a nonlinear model, respectively. Updating performance in the three models is assessed in terms of prediction uncertainty, computational effort, and prior information. This study also investigates updating scenarios using the analytical approach and the surrogate models. The uncertainty quantification in the Bayesian approach is further discussed to verify the validity and accuracy of the surrogate models. Finally, the advantages and limitations of the surrogate model-based updating approaches are discussed for structures of different complexity. The possibility of using a boosting algorithm as an ensemble learning method to improve the surrogate models is also presented.

Abnormal Detection for Industrial Control Systems Using Ensemble Recurrent Neural Networks Model (산업제어시스템에서 앙상블 순환신경망 모델을 이용한 비정상 탐지)

  • Kim, HyoSeok;Kim, Yong-Min
    • Journal of the Korea Institute of Information Security & Cryptology / v.31 no.3 / pp.401-410 / 2021
  • Recently, as cyber attacks targeting industrial control systems have increased, various studies have been conducted on detecting abnormalities in industrial processes. Considering that industrial processes are deterministic and regular, it is appropriate to determine abnormality by comparing the actual value with the value predicted by a detection model trained on normal data. In this paper, HAI Datasets 20.07 and 21.03 are used. An ensemble model is created by combining Gated Recurrent Unit models that use different time steps. The detection performance of the single models and the ensemble recurrent neural network model was then compared using various performance evaluation analyses, and it was confirmed that the proposed model is more suitable for anomaly detection in industrial control systems.
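
The detection idea in this abstract (predict the next value, compare it with the actual value, and combine predictors that look back over different numbers of time steps) can be sketched without a neural network. Below, simple window-mean predictors stand in for the GRU models, and the ensemble is their average; both choices, along with the threshold and all names, are assumptions for illustration only.

```python
def window_mean_predictor(steps):
    """Stand-in for a trained model that looks back over `steps` time steps."""
    def predict(history):
        window = history[-steps:]
        return sum(window) / len(window)
    return predict


def ensemble_predict(predictors, history):
    """Average the next-value predictions of all ensemble members."""
    predictions = [predict(history) for predict in predictors]
    return sum(predictions) / len(predictions)


def detect_anomalies(series, predictors, max_steps, threshold):
    """Return indices whose actual value deviates from the prediction by more than threshold."""
    anomalies = []
    for t in range(max_steps, len(series)):
        predicted = ensemble_predict(predictors, series[:t])
        if abs(series[t] - predicted) > threshold:
            anomalies.append(t)
    return anomalies


# Toy signal: constant process value with one injected spike at index 7.
signal = [1.0] * 10
signal[7] = 5.0
members = [window_mean_predictor(s) for s in (2, 3, 4)]
spikes = detect_anomalies(signal, members, max_steps=4, threshold=2.0)
```

In this toy run only the spike itself exceeds the threshold; the following points deviate less because averaging over several look-back lengths dampens the effect of the spike on the prediction.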