• Title/Summary/Keyword: Random Forest (RF)

Machine Learning-Based Detection of Cache Side Channel Attack Using Performance Counter Monitor of CPU (Performance Counter Monitor를 이용한 머신 러닝 기반 캐시 부채널 공격 탐지)

  • Hwang, Jongbae;Bae, Daehyeon;Ha, Jaecheol
    • Journal of the Korea Institute of Information Security & Cryptology / v.30 no.6 / pp.1237-1246 / 2020
  • Recently, several cache side-channel attacks have been proposed that extract secret information by exploiting design flaws in the microarchitecture. The Flush+Reload attack, one such cache side-channel attack, can be used in malicious applications because of its high resolution and low noise. In this paper, we propose a detection system that detects cache-based attacks using the PCM (Performance Counter Monitor) to monitor CPU cache activity. In particular, we observed how each PCM counter value varied under two kinds of attacks: a Spectre attack and a secret-recovery attack during AES encryption. As a result, we found that four hardware counters were sensitive to cache side-channel attacks. Our machine learning-based detector, using SVM (Support Vector Machine), RF (Random Forest), and MLP (Multi-Layer Perceptron), can detect cache side-channel attacks with high accuracy.

Intelligent System for the Prediction of Heart Diseases Using Machine Learning Algorithms with a New Mixed Feature Creation (MFC) Technique

  • Rawia Elarabi;Abdelrahman Elsharif Karrar;Murtada El-mukashfi El-taher
    • International Journal of Computer Science & Network Security / v.23 no.5 / pp.148-162 / 2023
  • Classification systems can significantly assist the medical sector by enabling precise and quick diagnosis of diseases, saving time for both doctors and patients. One possible way to identify risk variables is to use machine learning algorithms. Non-surgical technologies such as machine learning are trustworthy and effective in distinguishing healthy patients from heart-disease patients, and they save time and effort. The goal of this study is to create a medical intelligent decision support system based on machine learning for the diagnosis of heart disease. We used a mixed feature creation (MFC) technique to generate new features from the UCI Cleveland Cardiology dataset. We selected the most suitable features using the Least Absolute Shrinkage and Selection Operator (LASSO), Recursive Feature Elimination with Random Forest feature selection (RFE-RF), and the best features of both LASSO and RFE-RF (BLR). Cross-validation and grid search were used to optimize the parameters of the estimators used with these algorithms. Classifier performance metrics, including classification accuracy, specificity, sensitivity, precision, and F1-score, are reported for each classification model along with execution time and RMSE, and the results are presented independently for comparison. Our proposed work finds the best potential outcome across all available prediction models and improves the system's performance, allowing physicians to diagnose heart patients more accurately.
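
The assessment metrics named in this abstract can all be derived from the four counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions. The function name and the example counts are hypothetical and are not taken from the paper.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return common binary-classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # recall, or true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for illustration only (not results from the study).
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are reported separately because a diagnostic model can score well on overall accuracy while still missing a large share of true disease cases.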

Performance Comparison of Machine Learning Models for Grid-Based Flood Risk Mapping - Focusing on the Case of Typhoon Chaba in 2016 - (격자 기반 침수위험지도 작성을 위한 기계학습 모델별 성능 비교 연구 - 2016 태풍 차바 사례를 중심으로 -)

  • Jihye Han;Changjae Kwak;Kuyoon Kim;Miran Lee
    • Korean Journal of Remote Sensing / v.39 no.5_2 / pp.771-783 / 2023
  • This study compares the performance of machine learning models for preparing a grid-based flood risk map of Jung-gu, Ulsan, for Typhoon Chaba, which occurred in 2016. Dynamic data such as rainfall and river height, and static data such as building, population, and land cover data, were used to conduct a flood risk analysis. The data were constructed as 10 m grid data based on the national point number, and a sample dataset was built using the risk value calculated for each grid cell as the dependent variable and the values of five influencing factors as independent variables. The dataset comprises 15,910 samples, which were randomly divided into training, validation, and test sets at a 6:2:2 ratio to build the machine learning models. Random forest (RF), support vector machine (SVM), and k-nearest neighbor (KNN) techniques were used, and prediction accuracy ranked from highest to lowest as SVM (91.05%), RF (83.08%), and KNN (76.52%). Ranking the influencing factors with the RF model confirmed that rainfall and river water level had the greatest influence on risk.
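
The 6:2:2 random split described in this abstract can be sketched as below. This is an illustration only. The function name and the seed are assumptions, and the abstract does not state how rounding or shuffling was actually handled.

```python
import random


def split_622(n_samples, seed=0):
    """Randomly split sample indices into train/validation/test at a 6:2:2 ratio."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed (an assumption) for repeatability
    n_train = n_samples * 6 // 10         # integer arithmetic avoids float rounding
    n_val = n_samples * 2 // 10
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]      # remainder goes to the test set
    return train, val, test


# 15,910 is the sample count reported in the abstract.
train_idx, val_idx, test_idx = split_622(15910)
print(len(train_idx), len(val_idx), len(test_idx))
```

With 15,910 samples this yields 9,546 training, 3,182 validation, and 3,182 test indices, with no index shared between the three sets.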

Seismic Vulnerability Assessment and Mapping for 9.12 Gyeongju Earthquake Based on Machine Learning (기계학습을 이용한 지진 취약성 평가 및 매핑: 9.12 경주지진을 대상으로)

  • Han, Jihye;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.36 no.6_1 / pp.1367-1377 / 2020
  • The purpose of this study is to assess the seismic vulnerability of buildings in Gyeongju, based on the earthquake that occurred in the city on September 12, 2016, and to produce a seismic vulnerability map. Eleven influence factors related to geotechnical, physical, and structural indicators were selected to assess seismic vulnerability and were applied as independent variables. Location data for the buildings that were actually damaged in the 9.12 Gyeongju Earthquake were used as the dependent variable. The assessment models were constructed using two machine learning methods, random forest (RF) and support vector machine (SVM), and the training and test datasets were randomly selected at a ratio of 70:30. For accuracy verification, the receiver operating characteristic (ROC) curve was used to select the optimum model; the accuracy was 1.000 for RF and 0.998 for SVM. The prediction accuracy was 0.947 for RF and 0.926 for SVM. Prediction values for all buildings in Gyeongju were derived from the RF model, then graded and used to produce the seismic vulnerability map. A review of the distribution of building classes by administrative unit showed that Hwangnam, Wolseong, Seondo, and Naenam were highly vulnerable regions, while Yangbuk, Gangdong, Yangnam, and Gampo were relatively safer regions.
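
ROC-based accuracy is commonly summarized as the area under the ROC curve (AUC). The abstract does not say that AUC is the exact statistic reported, so this is an assumption. A minimal sketch computes AUC as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one. The function name, labels, and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = damaged building) and model scores, for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small illustration. Production code would normally use a sorted-rank formulation instead.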

Prediction and Analysis of PM2.5 Concentration in Seoul Using Ensemble-based Model (앙상블 기반 모델을 이용한 서울시 PM2.5 농도 예측 및 분석)

  • Ryu, Minji;Son, Sanghun;Kim, Jinsoo
    • Korean Journal of Remote Sensing / v.38 no.6_1 / pp.1191-1205 / 2022
  • Particulate matter (PM), an air pollutant with complex and widespread causes, is classified according to particle size. Of these classes, PM2.5 is very small and can cause respiratory or cardiovascular disease when inhaled. To prepare for these risks, state-led management and preventive monitoring and forecasting are important. This study predicted PM2.5 in Seoul, where high concentrations of fine dust occur frequently, using two ensemble models, random forest (RF) and extreme gradient boosting (XGB), with 15 local data assimilation and prediction system (LDAPS) weather-related factors, aerosol optical depth (AOD), and 4 chemical factors as independent variables. Performance evaluation and factor importance evaluation were performed for both models, along with a seasonal model analysis. RF achieved R2 = 0.85 and XGB achieved R2 = 0.91, confirming that XGB was the more suitable model for PM2.5 prediction. In the seasonal model analysis, the predictions compared well with the observed high concentrations in spring. In this study, PM2.5 in Seoul was predicted using various factors, and an ensemble-based PM2.5 prediction model with good performance was constructed.
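
Several abstracts in this list report R2 and RMSE. The sketch below shows one common definition of each, with R2 taken as the coefficient of determination. The abstracts do not state which R2 variant was used, so that choice is an assumption, and the observed and predicted values are hypothetical.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between observations and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical concentrations for illustration only (not data from the study).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
err = rmse(observed, predicted)
r2 = r_squared(observed, predicted)
print(f"RMSE = {err:.3f}, R2 = {r2:.3f}")
```

RMSE is expressed in the units of the target variable, while R2 is unitless, which is why the two are usually reported together.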

A Comparative Study of Reservoir Surface Area Detection Algorithm Using SAR Image (SAR 영상을 활용한 저수지 수표면적 탐지 알고리즘 비교 연구)

  • Jeong, Hagyu;Park, Jongsoo;Lee, Dalgeun;Lee, Junwoo
    • Korean Journal of Remote Sensing / v.38 no.6_3 / pp.1777-1788 / 2022
  • Reservoirs are a major water supply source in the domestic agricultural environment, and monitoring their water storage is important for the use and management of agricultural water resources. Remote sensing via satellite imagery can be an effective method for regularly monitoring widely distributed objects such as reservoirs. In this study, image classification and image segmentation algorithms are applied to Sentinel-1 Synthetic Aperture Radar (SAR) imagery to detect water bodies in 53 reservoirs in South Korea. Six algorithms are used: Neural Network (NN), Support Vector Machine (SVM), Random Forest (RF), Otsu, Watershed (WS), and Chan-Vese (CV). The water body detection results are evaluated against in-situ images taken by drones. The correlations between the in-situ water surface area and the water surface area detected by each algorithm are NN 0.9941, SVM 0.9942, RF 0.9940, Otsu 0.9922, WS 0.9709, and CV 0.9736, and the larger the reservoir, the higher the linear correlation. WS showed low recall because of undetected water bodies, while NN, SVM, and RF showed low precision because of over-detection. For water body detection with SAR imagery, we found that aquatic plants and artificial structures can be error factors that cause water bodies to go undetected.
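
Of the six algorithms listed, Otsu thresholding is the simplest to illustrate. It picks the intensity threshold that maximizes the between-class variance of the two resulting pixel groups. The sketch below assumes 8-bit integer intensities and treats the darker class as water, since calm water typically appears dark in SAR backscatter. The pixel values are synthetic, and the study's own preprocessing is not described in the abstract.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t maximizing between-class variance (class 0: value <= t)."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_threshold, best_variance = 0, -1.0
    weight_low, sum_low = 0, 0
    for threshold in range(levels):
        weight_low += hist[threshold]
        sum_low += threshold * hist[threshold]
        weight_high = total - weight_low
        if weight_low == 0 or weight_high == 0:
            continue  # one class is empty, so the variance is undefined
        mean_low = sum_low / weight_low
        mean_high = (sum_all - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_threshold, best_variance = threshold, variance
    return best_threshold


# Synthetic bimodal "image": dark pixels (assumed water) and bright pixels (assumed land).
pixels = [10] * 50 + [12] * 50 + [200] * 40 + [210] * 60
threshold = otsu_threshold(pixels)
water_mask = [p <= threshold for p in pixels]
print(threshold, sum(water_mask))
```

Because the threshold depends only on the histogram, the method needs no training data, in contrast to the NN, SVM, and RF classifiers compared in the study.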

The Estimation of Arctic Air Temperature in Summer Based on Machine Learning Approaches Using IABP Buoy and AMSR2 Satellite Data (기계학습 기반의 IABP 부이 자료와 AMSR2 위성영상을 이용한 여름철 북극 대기 온도 추정)

  • Han, Daehyeon;Kim, Young Jun;Im, Jungho;Lee, Sanggyun;Lee, Yeonsu;Kim, Hyun-cheol
    • Korean Journal of Remote Sensing / v.34 no.6_2 / pp.1261-1272 / 2018
  • It is important to measure the Arctic surface air temperature because it plays a key role in the exchange of energy between the ocean, sea ice, and the atmosphere. Although in-situ observations provide accurate measurements of air temperature, they are too spatially limited to show the distribution of Arctic surface air temperature. In this study, we propose machine learning-based models to estimate the Arctic surface air temperature in summer from buoy data and Advanced Microwave Scanning Radiometer 2 (AMSR2) satellite data. Two machine learning approaches, random forest (RF) and support vector machine (SVM), were used to estimate the air temperature twice a day according to the AMSR2 observation time. Both RF and SVM showed $R^2$ of 0.84-0.88 and RMSE of 1.31-1.53 $^{\circ}$C. The results were compared with the surface air temperature and spatial distribution of the ERA-Interim reanalysis data from the European Centre for Medium-Range Weather Forecasts (ECMWF). The models tended to underestimate temperatures over the Barents Sea, the Kara Sea, and the Baffin Bay region, where no IABP buoy observations exist. This study showed both the possibilities and the limitations of empirically estimating Arctic surface temperature using AMSR2 data.

Data Mining based Forest Fires Prediction Models using Meteorological Data (기상 데이터를 이용한 데이터 마이닝 기반의 산불 예측 모델)

  • Kim, Sam-Keun;Ahn, Jae-Geun
    • Journal of the Korea Academia-Industrial cooperation Society / v.21 no.8 / pp.521-529 / 2020
  • Forest fires are one of the most important environmental risks, with adverse effects on many aspects of life, such as the economy, the environment, and health. Early detection, quick prediction, and rapid response to forest fires can play an essential role in saving property and lives. One method for the rapid discovery of forest fires uses meteorological data obtained from local sensors installed in each area by the Meteorological Agency, since meteorological conditions (e.g., temperature, wind) influence forest fires. This study evaluated a Data Mining (DM) approach to predicting the burned area of forest fires. Five DM models, namely Stochastic Gradient Descent (SGD), Support Vector Machines (SVM), Decision Tree (DT), Random Forests (RF), and Deep Neural Network (DNN), and four feature selection setups (using spatial, temporal, and weather attributes) were tested on recent real-world data collected from the Gyeonggi-do area over the last five years. In the experiment, a DNN model using only meteorological data showed the best performance. The proposed model was more effective in predicting the burned area of small forest fires, which are more frequent. The knowledge derived from the proposed prediction model is particularly useful for improving firefighting resource management.

Prediction of Distillation Column Temperature Using Machine Learning and Data Preprocessing (머신 러닝과 데이터 전처리를 활용한 증류탑 온도 예측)

  • Lee, Yechan;Choi, Yeongryeol;Cho, Hyungtae;Kim, Junghwan
    • Korean Chemical Engineering Research / v.59 no.2 / pp.191-199 / 2021
  • A distillation column, a main facility in chemical processes, separates the desired product from a mixture by exploiting differences in boiling points. Because the distillation process consumes a large amount of energy, its operation needs to be optimized and predicted. The target process of this study is difficult to operate efficiently because the composition of the feed flow varies with the supplier. To deal with this problem, a data-driven model can be developed to predict operating conditions. However, data preprocessing is essential to improve the predictive performance of the model because the raw data contain outliers and noise. In this study, after optimizing predictive models based on long short-term memory (LSTM) and random forest (RF), we used a low-pass filter and a one-class support vector machine for data preprocessing and compared predictive performance according to the method and range of the preprocessing. The performance of the predictive models and the effect of the preprocessing were compared using R2 and RMSE. For LSTM, R2 increased by 23.5%, from 0.791 to 0.977, and RMSE decreased by 78.0%, from 0.132 to 0.029. For RF, R2 increased by 22.3%, from 0.767 to 0.938, and RMSE decreased by 64.3%, from 0.140 to 0.050.
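
The percentages in this abstract are relative changes with respect to the value before preprocessing. A short check of that arithmetic, using only the figures quoted in the abstract, can be sketched as follows. The function and dictionary names are hypothetical.

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (before, after) pairs quoted in the abstract.
reported = {
    "LSTM R2": (0.791, 0.977),
    "LSTM RMSE": (0.132, 0.029),
    "RF R2": (0.767, 0.938),
    "RF RMSE": (0.140, 0.050),
}

changes = {name: round(percent_change(b, a), 1) for name, (b, a) in reported.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

A negative value indicates a decrease, so the RMSE entries come out negative, with magnitudes matching the reductions stated in the abstract.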

Estimation of Surface fCO2 in the Southwest East Sea using Machine Learning Techniques (기계학습법을 이용한 동해 남서부해역의 표층 이산화탄소분압(fCO2) 추정)

  • HAHM, DOSHIK;PARK, SOYEONA;CHOI, SANG-HWA;KANG, DONG-JIN;RHO, TAEKEUN;LEE, TONGSUP
    • The Sea:JOURNAL OF THE KOREAN SOCIETY OF OCEANOGRAPHY / v.24 no.3 / pp.375-388 / 2019
  • Accurate evaluation of the sea-to-air $CO_2$ flux and its variability is crucial for understanding the global carbon cycle and predicting atmospheric $CO_2$ concentration. $fCO_2$ observations in the East Sea are sparse in space and time. In this study, we derived a high-resolution time series of surface $fCO_2$ values in the southwest East Sea by feeding sea surface temperature (SST), salinity (SSS), chlorophyll-a (CHL), and mixed layer depth (MLD) values, from either satellite observations or numerical model outputs, to three machine learning models. The root mean square error of the best-performing model, a Random Forest (RF) model, was $7.1\ \mu\mathrm{atm}$. The important parameters for predicting $fCO_2$ in the RF model were SST and SSS, along with time information; CHL and MLD were much less important than the other parameters. The net $CO_2$ flux in the southwest East Sea, calculated from the $fCO_2$ predicted by the RF model, was $-0.76 \pm 1.15\ \mathrm{mol\,m^{-2}\,yr^{-1}}$, close to the lower bound of the previous estimates, which range from $-0.66$ to $-2.47\ \mathrm{mol\,m^{-2}\,yr^{-1}}$. The time series of $fCO_2$ predicted by the RF model showed significant variation even over an interval as short as a week. For accurate evaluation of the $CO_2$ flux in the Ulleung Basin, high-resolution in situ observations are needed in spring, when $fCO_2$ changes rapidly.