• Title/Summary/Keyword: Ensemble Modeling

Modeling and Selecting Optimal Features for Machine Learning Based Detections of Android Malwares (머신러닝 기반 안드로이드 모바일 악성 앱의 최적 특징점 선정 및 모델링 방안 제안)

  • Lee, Kye Woong;Oh, Seung Taek;Yoon, Young
    • KIPS Transactions on Software and Data Engineering / v.8 no.11 / pp.427-432 / 2019
  • In this paper, we propose three approaches to modeling Android malware. The first relies on human security experts to meticulously select feature sets. In the second, we choose the 300 features with the highest importance among the top 99% of features in terms of occurrence rate. The third combines multiple models and identifies malware through weighted voting. In addition, we applied a novel method of eliminating permission information, which was previously regarded as a critical factor for distinguishing malware. With our carefully generated feature sets and weighted voting by the ensemble algorithm, we reached a highest malware detection accuracy of 97.8%. We also verified that discarding the permission information led to improvements in the false positive and false negative rates.
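The weighted voting mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not say how the weights are chosen or how ties are broken, so the weights, the tie rule, and the function name `weighted_vote` below are all assumptions made for illustration.

```python
def weighted_vote(predictions, weights):
    """Combine binary predictions (1 = malware, 0 = benign) by weighted voting.

    predictions: one 0/1 label per model.
    weights: one non-negative weight per model (hypothetical values,
             e.g. each model's validation accuracy).
    """
    if len(predictions) != len(weights):
        raise ValueError("one weight per model is required")
    malware_score = sum(w for p, w in zip(predictions, weights) if p == 1)
    benign_score = sum(w for p, w in zip(predictions, weights) if p == 0)
    # Ties are resolved toward "malware" here; that choice is an assumption.
    return 1 if malware_score >= benign_score else 0


# Three hypothetical models: the single high-weight model outvotes the other two.
label = weighted_vote([1, 0, 0], [0.9, 0.4, 0.4])
```

With equal weights this reduces to plain majority voting; unequal weights let a more reliable model override a weaker majority.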

Energy Efficient Design of a Jet Pump by Ensemble of Surrogates and Evolutionary Approach

  • Husain, Afzal;Sonawat, Arihant;Mohan, Sarath;Samad, Abdus
    • International Journal of Fluid Machinery and Systems / v.9 no.3 / pp.265-276 / 2016
  • Energy systems working in different conditions may not have a specific design that provides optimal performance, and a system working for a longer period at lower efficiency implies higher energy consumption. In this work, a methodology that combines numerical fluid-dynamics modeling with an evolutionary optimization algorithm is demonstrated on the design of a jet pump and shows a reduction in computational cost. The jet pump inherently has a low efficiency because of improper mixing of the primary and secondary fluids and the multiple momentum and energy transfer phenomena associated with it. High-fidelity solutions were obtained from a validated numerical model and used to construct an approximate function through surrogate analysis. Pareto-optimal solutions for two objective functions, i.e., secondary fluid pressure head and primary fluid pressure drop, were generated through a multi-objective genetic algorithm. For the jet pump geometry, a design space of several design variables was discretized using the Latin hypercube sampling method for the optimization. The performance analysis of the surrogate models shows that the combined surrogates perform better than a single surrogate and that the optimized jet pump achieves higher performance. The approach can be implemented in other energy systems to find a better design.
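Latin hypercube sampling, which this abstract uses to discretize the design space, can be sketched in a few lines. This is a generic illustration and not the authors' setup: the function name `latin_hypercube`, the two design variables, and their bounds are made up.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points so that each variable's range is split into
    n_samples equal strata and every stratum is hit exactly once."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        strata = list(range(n_samples))
        rng.shuffle(strata)  # random pairing of strata across variables
        width = (high - low) / n_samples
        # one uniformly random point inside each stratum
        columns.append([low + (s + rng.random()) * width for s in strata])
    # transpose: one tuple of design-variable values per sample
    return list(zip(*columns))


# Hypothetical design space: two geometric variables with made-up bounds.
design_points = latin_hypercube(5, [(0.0, 1.0), (10.0, 20.0)])
```

Each returned point would then be evaluated with the high-fidelity model, and the resulting pairs used to fit the surrogates.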

Uncertainty quantification for structural health monitoring applications

  • Nasr, Dana E.;Slika, Wael G.;Saad, George A.
    • Smart Structures and Systems / v.22 no.4 / pp.399-411 / 2018
  • The difficulty in modeling complex nonlinear structures lies in the presence of significant sources of uncertainty, mainly attributed to sudden changes in the structure's behavior caused by regular aging factors or extreme events. Quantifying these uncertainties and accurately representing them within the complex mathematical framework of Structural Health Monitoring (SHM) are essential for system identification and damage detection purposes. This study highlights the importance of uncertainty quantification in SHM frameworks and presents a comparative analysis between intrusive and non-intrusive techniques for quantifying uncertainties for SHM purposes through two different variations of the Kalman Filter (KF) method: the Ensemble Kalman Filter (EnKF) and the Polynomial Chaos Kalman Filter (PCKF). The comparative analysis is based on a numerical example that consists of a four-degree-of-freedom (DOF) system exhibiting Bouc-Wen hysteretic behavior and subjected to El-Centro earthquake excitation. The comparison is based on the ability of each technique to quantify the different sources of uncertainty for SHM purposes and to accurately approximate the system state and parameters, when compared to the true state, with the least computational burden. While the results show that both filters are able to locate the damage in space and time and to accurately estimate the system responses and unknown parameters, the computational cost of PCKF is shown to be less than that of EnKF for a similar level of numerical accuracy.
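For readers unfamiliar with the EnKF named in this abstract, the analysis (update) step can be sketched for the simplest possible case. This is a textbook stochastic EnKF with perturbed observations, reduced to a scalar state that is observed directly; it is an assumption-laden illustration, not the four-DOF Bouc-Wen setup of the paper, and the synthetic ensemble and observation values are made up.

```python
import random
import statistics


def enkf_update(ensemble, observation, obs_var, seed=0):
    """Stochastic EnKF analysis step for a scalar state observed directly (H = 1)."""
    rng = random.Random(seed)
    prior_var = statistics.variance(ensemble)   # sample forecast variance
    gain = prior_var / (prior_var + obs_var)    # scalar Kalman gain
    obs_sd = obs_var ** 0.5
    # Each member assimilates its own perturbed copy of the observation.
    return [x + gain * (observation + rng.gauss(0.0, obs_sd) - x) for x in ensemble]


rng = random.Random(1)
prior = [rng.gauss(0.0, 1.0) for _ in range(500)]  # synthetic forecast ensemble
posterior = enkf_update(prior, observation=2.0, obs_var=1.0)
```

The updated ensemble is pulled toward the observation and its spread shrinks, which is how the filter represents reduced uncertainty after assimilating a measurement.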

Improving streamflow prediction with assimilating the SMAP soil moisture data in WRF-Hydro

  • Kim, Yeri;Kim, Yeonjoo
    • Proceedings of the Korea Water Resources Association Conference / 2021.06a / pp.205-205 / 2021
  • Surface soil moisture, which governs the partitioning of precipitation into infiltration and runoff, plays an important role in the hydrological cycle. The assimilation of satellite soil moisture retrievals into a land surface model or hydrological model has been shown to improve the predictive skill of hydrological variables. This study aims to improve streamflow prediction with the Weather Research and Forecasting model-Hydrological modeling system (WRF-Hydro) by assimilating Soil Moisture Active and Passive (SMAP) data at 3 km and to analyze its impacts on hydrological components. We applied a Cumulative Distribution Function (CDF) technique to remove the bias of the SMAP data and assimilated the SMAP data (April to July, 2015-2019) into WRF-Hydro using an Ensemble Kalman Filter (EnKF) with a total of 12 ensembles. Daily inflow and soil moisture estimates for major dams of South Korea (Soyanggang, Chungju, and Sumjin dams) were evaluated. We investigated whether hydrologic variables such as runoff, evaporation, and soil moisture were better simulated with data assimilation than without it. The results show that the correlation coefficient of topsoil moisture can be improved; however, the change in dam inflow was not pronounced. This may be attributed to the fact that soil moisture memory and the respective memory of runoff act on different time scales. These findings demonstrate that the assimilation of satellite soil moisture retrievals can improve the predictive skill of hydrological variables for a better understanding of the water cycle.
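The abstract does not specify which CDF technique was used for bias removal. A common choice for satellite soil moisture is CDF matching (quantile mapping), and the sketch below assumes that variant with a simple nearest-rank quantile. The function name `cdf_match` and all soil moisture values are made up for illustration.

```python
from bisect import bisect_right


def cdf_match(values, source_sample, reference_sample):
    """Map each value through the empirical CDF of source_sample and back
    through the empirical quantiles of reference_sample."""
    src = sorted(source_sample)
    ref = sorted(reference_sample)
    matched = []
    for v in values:
        # empirical non-exceedance probability of v in the source climatology
        p = bisect_right(src, v) / len(src)
        # nearest-rank quantile of the reference climatology at probability p
        idx = min(len(ref) - 1, max(0, round(p * len(ref)) - 1))
        matched.append(ref[idx])
    return matched


# Made-up soil moisture values (m3/m3): the "satellite" series is biased dry.
satellite = [0.10, 0.12, 0.15, 0.18, 0.20]
model = [0.20, 0.23, 0.25, 0.29, 0.32]
corrected = cdf_match(satellite, satellite, model)
```

After matching, the satellite series has the same empirical distribution as the model climatology, so the filter is not asked to correct a systematic offset.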

Assessment of modal parameters considering measurement and modeling errors

  • Huang, Qindan;Gardoni, Paolo;Hurlebaus, Stefan
    • Smart Structures and Systems / v.15 no.3 / pp.717-733 / 2015
  • Modal parameters of a structure are commonly used quantities for system identification and damage detection. Because only a limited number of studies address the statistical assessment of modal parameters, this paper presents procedures to properly account for the uncertainties present in the process of extracting them. In particular, this paper focuses on how to deal with the measurement error in an ambient vibration test and the modeling error resulting from a modal parameter extraction process. A bootstrap approach is adopted when an ensemble of a limited number of noisy time-history response recordings is available. To estimate the modeling error associated with the extraction process, a model prediction expansion approach is adopted in which the modeling error is considered as an "adjustment" to the prediction obtained from the extraction process. The proposed procedures can be further incorporated into the probabilistic analysis of applications where the modal parameters are used. This study considers the effects of the measurement and modeling errors and can provide guidance in allocating resources to improve the estimation accuracy of the modal data. As an illustration, the proposed procedures are applied to extract the modal data of a damaged beam, and the extracted modal data are used to detect potential damage locations using a damage detection method. It is shown that the variability in the modal parameters due to the measurement and modeling errors can be considered quite low; however, this low variability has a significant impact on the damage detection results for the studied beam.
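The bootstrap idea in this abstract can be sketched generically: resample a small set of measurements with replacement and look at how much a statistic varies across resamples. The paper resamples time-history recordings and re-extracts modal parameters; the sketch below simplifies this to resampling already-extracted scalar values, and the frequency numbers are made up.

```python
import random
import statistics


def bootstrap_std_error(samples, statistic, n_resamples=2000, seed=0):
    """Estimate the standard error of `statistic` by resampling with replacement."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_resamples):
        resample = rng.choices(samples, k=len(samples))  # same size, with replacement
        replicates.append(statistic(resample))
    return statistics.stdev(replicates)


# Made-up first natural frequencies (Hz) identified from eight noisy recordings.
frequencies = [4.98, 5.03, 5.01, 4.95, 5.06, 5.00, 4.97, 5.04]
se_mean = bootstrap_std_error(frequencies, statistics.mean)
```

The resulting standard error gives a data-driven measure of the measurement-related variability without assuming a particular noise distribution.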

Enhancing Medium-Range Forecast Accuracy of Temperature and Relative Humidity over South Korea using Minimum Continuous Ranked Probability Score (CRPS) Statistical Correction Technique (연속 순위 확률 점수를 활용한 통합 앙상블 모델에 대한 기온 및 습도 후처리 모델 개발)

  • Hyejeong Bok;Junsu Kim;Yeon-Hee Kim;Eunju Cho;Seungbum Kim
    • Atmosphere / v.34 no.1 / pp.23-34 / 2024
  • The Korea Meteorological Administration has improved medium-range weather forecasts by implementing post-processing methods to minimize numerical model errors. In this study, we employ a statistical correction technique known as the minimum continuous ranked probability score (CRPS) to refine medium-range forecast guidance. This technique quantifies the similarity between the predicted values and the observed cumulative distribution function of the Unified Model Ensemble Prediction System for Global (UM EPSG). We evaluated the performance of the medium-range forecast guidance for surface air temperature and relative humidity, noting significant enhancements in seasonal bias and root mean squared error compared to observations. Notably, compared to the existing medium-range forecast guidance, temperature forecasts exhibit a 17.5% improvement in summer and a 21.5% improvement in winter. Humidity forecasts also show a 12% improvement in summer and a 23% improvement in winter. The results indicate that utilizing the minimum CRPS for medium-range forecast guidance provides more reliable and improved performance than UM EPSG.
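The score underlying this technique can be sketched with the standard empirical CRPS estimator for an ensemble, mean|x_i - y| - 0.5 * mean|x_i - x_j|. The sketch only computes the score; the paper's correction step, which fits parameters by minimizing it, is not reproduced here, and the ensemble values are made up.

```python
def crps_ensemble(members, observation):
    """Empirical CRPS of an ensemble forecast against one observation:
    mean|x_i - y| - 0.5 * mean|x_i - x_j| (lower is better)."""
    n = len(members)
    accuracy = sum(abs(x - observation) for x in members) / n
    spread = sum(abs(a - b) for a in members for b in members) / (n * n)
    return accuracy - 0.5 * spread


# Made-up two-member temperature ensemble (deg C) and verifying observation.
score = crps_ensemble([0.0, 2.0], 1.0)
```

For a single-member ensemble the score reduces to the absolute error, which is why CRPS is often described as a probabilistic generalization of the mean absolute error.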

Modeling of Convolutional Neural Network-based Recommendation System

  • Kim, Tae-Yeun
    • Journal of Integrative Natural Science / v.14 no.4 / pp.183-188 / 2021
  • Collaborative filtering is one of the most commonly used methods in web recommendation systems, and numerous studies on collaborative filtering have proposed a number of measures for enhancing its accuracy. This study proposes a movie recommendation system that applies Word2Vec and ensemble convolutional neural networks. First, user sentences and movie sentences are constructed from the user, movie, and rating information. Then, the user sentences and movie sentences are input into Word2Vec to obtain the user vector and movie vector. The user vector is input to the user convolutional model, while the movie vector is input to the movie convolutional model. These user and movie convolutional models are connected to a fully-connected neural network model. Ultimately, the output layer of the fully-connected neural network model outputs the forecasts for user, movie, and rating. The test results showed that the proposed system achieved higher accuracy than the conventional collaborative filtering system and the Word2Vec and deep neural network-based systems suggested in similar studies. The Word2Vec and deep neural network-based recommendation system is expected to help enhance user satisfaction while taking the characteristics of users into account.

Study of protein loop conformational changes by free energy estimation using colony energy

  • Kang, Beom Chang;Lee, Gyu Rie;Seok, Chaok
    • Proceeding of EDISON Challenge / 2014.03a / pp.63-74 / 2014
  • Predicting protein loop structures is an important modeling problem because protein loops are often involved in diverse biological functions by participating in enzyme active sites, ligand binding sites, etc. However, loop structure prediction is difficult even when structures of homologous proteins are known, owing to the large sequence and structure variability among loops of homologous proteins. Therefore, an ab initio approach is necessary to solve loop modeling problems. One of the difficulties in developing an ab initio loop modeling method is deriving an accurate scoring function that closely approximates the true free energy function. In particular, the entropic as well as the energetic contribution has to be considered adequately for loops because loops tend to be flexible compared to other parts of a protein. In this study, the contribution of conformational entropy is considered in scoring loop conformations by employing the "colony energy", which was previously proposed to estimate the free energy for an ensemble of conformations. Loop conformations were generated using two EDISON_Chem programs, GalaxyFill and GalaxySC, and the colony energy was designed for this sampling by tuning relevant parameters. On a test set of 40 loops, the accuracy of the predicted loop structures improved on average when scoring with the colony energy compared to scoring by energy alone. In addition, the high correlation between colony energy and deviation from the native structure suggested that more extensive sampling can further improve the prediction accuracy. In another test on 6 ligand-binding loops that show conformational changes upon ligand binding, both ligand-free and ligand-bound states could be identified using the colony energy when no information on the ligand-bound conformation was used.

AutoFe-Sel: A Meta-learning based methodology for Recommending Feature Subset Selection Algorithms

  • Irfan Khan;Xianchao Zhang;Ramesh Kumar Ayyasam;Rahman Ali
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.7 / pp.1773-1793 / 2023
  • Automated machine learning, often referred to as "AutoML," is the process of automating the time-consuming and iterative procedures that are associated with the building of machine learning models. There have been significant contributions in this area across a number of different stages of accomplishing a data-mining task, including model selection, hyper-parameter optimization, and preprocessing method selection. Among them, preprocessing method selection is a relatively new and fast-growing research area. The current work is focused on the recommendation of preprocessing methods, i.e., feature subset selection (FSS) algorithms. One limitation of the existing studies on FSS algorithm recommendation is the use of a single learner for meta-modeling, which restricts the meta-modeling capability. Moreover, the meta-modeling in the existing studies is typically based on a single group of data characterization measures (DCMs). Nonetheless, there are a number of complementary DCM groups, and combining them makes it possible to leverage their diversity, resulting in improved meta-modeling. This study aims to address these limitations by proposing an architecture for preprocessing method selection that uses ensemble learning for meta-modeling, namely AutoFE-Sel. To evaluate the proposed method, we performed an extensive experimental evaluation involving 8 FSS algorithms, 3 groups of DCMs, and 125 datasets. Results show that the proposed method achieves better performance compared to three baseline methods. The proposed architecture can also be easily extended to other preprocessing method selections, e.g., noise-filter selection and imbalance handling method selection.

Molecular dynamics simulation of bulk silicon under strain

  • Zhao, H.;Aluru, N.R.
    • Interaction and multiscale mechanics / v.1 no.2 / pp.303-315 / 2008
  • In this paper, thermodynamic properties of crystalline silicon under strain are calculated using classical molecular dynamics (MD) simulations based on the Tersoff interatomic potential. The Helmholtz free energy of the silicon crystal under strain is calculated using the ensemble method developed by Frenkel and Ladd (1984). To account for quantum corrections under strain in the classical MD simulations, we propose an approach in which the quantum corrections to the internal energy and the Helmholtz free energy are obtained from the corresponding energy deviation between the classical and quantum harmonic oscillators. We calculate the variation of the thermodynamic properties with temperature and strain and compare them with results obtained using the quasi-harmonic model in reciprocal space.
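The deviation between classical and quantum harmonic oscillators that this abstract refers to can be illustrated for a single mode using the textbook expressions. The abstract does not give the authors' exact formulation or how modes are summed over the crystal, so the per-mode, dimensionless form below is an assumption; x = ħω / (k_B T) and both results are in units of k_B T.

```python
import math


def quantum_corrections(x):
    """Per-mode quantum-minus-classical deviations for a harmonic oscillator,
    in units of k_B*T, where x = hbar*omega / (k_B*T)."""
    # internal energy: quantum x/2 + x/(e^x - 1) versus classical 1
    delta_energy = x / 2 + x / math.expm1(x) - 1.0
    # Helmholtz free energy: quantum x/2 + ln(1 - e^-x) versus classical ln(x)
    delta_free_energy = x / 2 + math.log(-math.expm1(-x) / x)
    return delta_energy, delta_free_energy


high_T = quantum_corrections(0.01)  # k_B*T >> hbar*omega: corrections vanish
low_T = quantum_corrections(5.0)    # k_B*T << hbar*omega: corrections are large
```

Both deviations go to zero in the high-temperature limit, where the classical MD result is already accurate, and grow as the temperature drops below the characteristic vibrational temperature of the mode.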