• Title/Summary/Keyword: Permutation Feature Importance

A DDoS Attack Detection Technique through CNN Model in Software Define Network (소프트웨어-정의 네트워크에서 CNN 모델을 이용한 DDoS 공격 탐지 기술)

  • Ko, Kwang-Man
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.13 no.6 / pp.605-610 / 2020
  • Software Defined Networking (SDN) is setting the standard for network management because of its scalability, flexibility, and programmability. Distributed Denial of Service (DDoS) attacks are the most widely used means of attacking the SDN controller to bring down the network, and various methods have previously been used to detect them. In this paper, a dataset with 84 features is first obtained from Kaggle, and the 20 highest-ranked features are then selected using the Permutation Importance algorithm. A Convolutional Neural Network (CNN) classifier is then trained and tested on the resulting dataset. The proposed solution achieved the best results, which will allow critical systems that need stronger security to adopt and take full advantage of the SDN paradigm without compromising their security.
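
The feature-selection step described in this abstract (rank features by permutation importance, keep the top k) can be illustrated with a minimal standard-library sketch. This is not the authors' code: the function names, the toy data, and the threshold "model" are all assumptions made for illustration, and the paper itself keeps 20 of 84 features rather than 1 of 3.

```python
import random

def permutation_importance(predict, X, y, score, n_repeats=5, seed=0):
    """Mean drop in score when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, breaking its link to the target.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = score(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return importances

def top_k_features(importances, k):
    """Indices of the k features with the largest importance."""
    order = sorted(range(len(importances)), key=lambda j: importances[j], reverse=True)
    return order[:k]

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical toy data: the label depends only on feature 0; features 1 and 2 are noise.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random(), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]

def predict(row):
    # Stand-in for a trained classifier (assumption, not a CNN).
    return int(row[0] > 0.5)

importances = permutation_importance(predict, X, y, accuracy)
selected = top_k_features(importances, k=1)
```

Because the stand-in model ignores features 1 and 2, shuffling them leaves the accuracy unchanged, so only feature 0 receives a positive importance and is selected.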

Truck Weight Estimation using Operational Statistics at 3rd Party Logistics Environment (운영 데이터를 활용한 제3자 물류 환경에서의 배송 트럭 무게 예측)

  • Yu-jin Lee;Kyung Min Choi;Song-eun Kim;Kyungsu Park;Seung Hwan Jung
    • Journal of Korean Society of Industrial and Systems Engineering / v.45 no.4 / pp.127-133 / 2022
  • Many manufacturers that use third-party logistics (3PL) providers face challenges in increasing their logistics efficiency. This study estimates the weight of the delivery trucks supplied by 3PL providers, which allows the manufacturer to package and load products into trailers in advance and so reduce delivery time. Accurate weight estimation is particularly important because of the total weight regulation. The study uses not only company data but also general predictor variables such as weather, oil prices, and the population of destinations. In addition, operational statistics variables are developed to indicate the availability of trucks in a specific weight category for each 3PL provider. The prediction model, which uses an XGBoost regressor and the permutation feature importance method, achieves a MAPE of 2.785% and shows the effectiveness of the developed operational statistics variables.
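
For readers unfamiliar with the metric reported here, MAPE (mean absolute percentage error) is the average of |actual - predicted| / |actual|, expressed in percent. The sketch below shows the calculation; the truck weights are invented for illustration and are not data from the paper.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error in percent. Assumes no true value is zero."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)

# Hypothetical truck weights in kg (illustrative values only).
actual = [10000.0, 20000.0, 25000.0]
predicted = [10300.0, 19400.0, 25000.0]

# Per-truck errors are 3%, 3%, and 0%, so the MAPE is about 2%.
error = mape(actual, predicted)
```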

Predicting Forest Fires Using Machine Learning Considering Human Factors (인적요인을 고려한 머신러닝 활용 산림화재 예측)

  • Jin-Myeong Jang;Joo-Chan Kim;Hwa-Joong Kim;Kwang-Tae Kim
    • Journal of Korea Society of Industrial Information Systems / v.28 no.5 / pp.109-126 / 2023
  • Early detection of forest fires is essential to preventing large-scale fires. Forest fire prediction serves as a vital early detection method and has led to various related studies. However, many previous studies focused solely on climate and geographic factors and overlooked human factors, which contribute significantly to forest fires. This study aims to develop forest fire prediction models that take human, weather, and geographical factors into account. It compares four machine learning models with a logistic regression model, using forest fire data from Gangwon-do spanning 2003 to 2020. The results indicate that the XGBoost model performed best (AUC=0.925), closely followed by Random Forest (AUC=0.920), both of which are machine learning techniques. Lastly, the study analyzed the relative importance of the factors through permutation feature importance analysis to derive operational insights. While meteorological factors had a greater impact than human factors, various human factors were also found to be significant.

A study on fault diagnosis of marine engine using a neural network with dimension-reduced vibration signals (차원 축소 진동 신호를 이용한 신경망 기반 선박 엔진 고장진단에 관한 연구)

  • Sim, Kichan;Lee, Kangsu;Byun, Sung-Hoon
    • The Journal of the Acoustical Society of Korea / v.41 no.5 / pp.492-499 / 2022
  • This study experimentally investigates the effect of reducing the dimensionality of a vibration signal on fault diagnosis of a marine engine. Using principal component analysis, a 513-dimensional vibration signal is converted into a low-dimensional signal of 1 to 15 dimensions, and the variation in fault diagnosis accuracy with the number of dimensions is observed. The vibration signal is measured from a full-scale marine generator diesel engine, and the contribution of the dimension-reduced signal is quantitatively evaluated using two variable importance analysis algorithms: the integrated gradients method and the feature permutation method. The analysis of the experimental data shows that fault diagnosis accuracy improves as the number of dimensions increases, and that near-perfect fault classification accuracy is achieved as the dimension approaches 10. This shows that the dimension of the vibration signal can be reduced considerably without degrading fault diagnosis accuracy. In the variable importance analysis, the dimension-reduced principal components contribute more than the conventional statistical features, which supports the effectiveness of dimension-reduced signals for fault diagnosis.

Analysis of Occupational Injury and Feature Importance of Fall Accidents on the Construction Sites using Adaboost (에이다 부스트를 활용한 건설현장 추락재해의 강도 예측과 영향요인 분석)

  • Choi, Jaehyun;Ryu, HanGuk
    • Journal of the Architectural Institute of Korea Structure & Construction / v.35 no.11 / pp.155-162 / 2019
  • The construction industry accounts for 28.55% of all industrial accidents in Korea, the largest share of any industry. Falls are the most common accident type, making up 60.16% of accidents in the construction field. We therefore analyzed the major factors affecting fall accidents and derived feature importances while considering a range of variables. The proposed model was trained and evaluated on data collected from the Korea Occupational Safety & Health Agency (KOSHA). To predict the severity of occupational fall accidents, we used AdaBoost (short for Adaptive Boosting), a machine learning meta-algorithm that can be used in conjunction with many other types of learning algorithms to improve performance. In this model, decision trees were combined with AdaBoost to predict and classify the severity of occupational fall accidents. Hyperopt was used to optimize the hyperparameters, in combination with stratified k-fold cross-validation. We extracted and analyzed the feature importances of the factors affecting fall accidents using the permutation technique, and evaluated the predicted severity of fall accidents in terms of predictive accuracy. The machine learning model was also confirmed to be applicable to safety accident analysis on construction sites. In the future, if safety accident data are accumulated automatically and in real time on construction sites through a networked system using Internet of Things (IoT) technology, it will be possible to analyze the factors and types of accidents according to site conditions from the real-time data.
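
The idea of AdaBoost as a meta-algorithm wrapped around a weak learner can be sketched as follows. This is a generic textbook-style illustration using one-feature threshold stumps and labels of -1/+1, not the authors' model (which used decision trees tuned with Hyperopt on KOSHA data); all names and the toy data are assumptions.

```python
import math

def fit_stump(X, y, w):
    """Return (weighted_error, feature, threshold, sign) of the best threshold stump."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                pred = [sign if row[j] > thr else -sign for row in X]
                err = sum(wi for wi, p, t in zip(w, pred, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(stump, row):
    _, j, thr, sign = stump
    return sign if row[j] > thr else -sign

def adaboost_fit(X, y, n_rounds=3):
    n = len(X)
    w = [1.0 / n] * n                      # start with uniform sample weights
    ensemble = []
    for _ in range(n_rounds):
        stump = fit_stump(X, y, w)
        err = min(max(stump[0], 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)      # vote weight of this stump
        ensemble.append((alpha, stump))
        # Increase the weight of misclassified samples, decrease the rest.
        w = [wi * math.exp(-alpha * t * stump_predict(stump, row))
             for wi, row, t in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy problem: the positive class is an interval, which no single
# threshold stump can separate.
X = [[0.1], [0.2], [0.3], [0.4], [0.5], [0.6], [0.7], [0.8], [0.9]]
y = [-1, -1, -1, 1, 1, 1, -1, -1, -1]

ensemble = adaboost_fit(X, y, n_rounds=3)
predictions = [adaboost_predict(ensemble, row) for row in X]
```

On this toy problem the first stump alone misclassifies the three rightmost points, while the weighted vote of three stumps classifies all nine training points correctly, which is the sense in which boosting improves on its base learner.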

Comparison Analysis of Machine Learning for Concrete Crack Depths Prediction Using Thermal Image and Environmental Parameters (열화상 이미지와 환경변수를 이용한 콘크리트 균열 깊이 예측 머신 러닝 분석)

  • Kim, Jihyung;Jang, Arum;Park, Min Jae;Ju, Young K.
    • Journal of Korean Association for Spatial Structures / v.21 no.2 / pp.99-110 / 2021
  • This study estimates crack depth by analyzing temperatures extracted from thermal images together with environmental parameters such as air temperature, air humidity, and illumination. The statistics of all acquired features and the correlation coefficients among the thermal images and environmental parameters are presented. Concrete crack depths were predicted by four machine learning models: Multi-Layer Perceptron (MLP), Random Forest (RF), Gradient Boosting (GB), and AdaBoost (AB). The models were validated using the coefficient of determination, accuracy, and Mean Absolute Percentage Error (MAPE). The AB model performed well among the four models, owing to the non-linearity of the features and its aggregation of weak learners with weights on misclassified data. With a base-estimator maximum depth of 11, the AB model achieved 97.6% accuracy and a MAPE of 0.07%. Feature importances, permutation importance, and partial dependence were analyzed for the AB model. The results show that air humidity, crack depth, and crack temperature, in that order, have higher marginal effects than the other features.

Machine learning-based analysis and prediction model on the strengthening mechanism of biopolymer-based soil treatment

  • Haejin Lee;Jaemin Lee;Seunghwa Ryu;Ilhan Chang
    • Geomechanics and Engineering / v.36 no.4 / pp.381-390 / 2024
  • The introduction of bio-based materials has been recommended in geotechnical engineering to reduce environmental pollutants such as heavy metals and greenhouse gases. However, bio-treated soil methods face limitations in field application because of short research periods and insufficient verification of engineering performance, especially compared with conventional materials such as cement. This study therefore aimed to develop a machine learning model for predicting the unconfined compressive strength, a representative soil property, of biopolymer-based soil treatment (BPST). Four machine learning algorithms were compared to determine a suitable model: linear regression (LR), support vector regression (SVR), random forest (RF), and neural network (NN). All except LR (that is, SVR, RF, and NN) exhibited high predictive performance, with an R² value of 0.98 or higher. The permutation feature importance technique was used to identify the main factors affecting the strength enhancement of BPST. The results indicated that the unconfined compressive strength of BPST is most affected by mean particle size, followed by biopolymer content and water content. With a reliable prediction model, the proposed approach can provide guidelines prior to laboratory testing and field application, thereby saving a significant amount of time and money.