• Title/Abstract/Keywords: Gradient Boosting Algorithm

Search results: 73 items (processing time: 0.021 s)

Android Malware Detection Using Permission-Based Machine Learning Approach

  • 강성은;응웬부렁;정수환
    • 정보보호학회논문지 / Vol. 28, No. 3 / pp. 617-623 / 2018
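  • This study aims to detect malware using AndroidManifest permission features extracted through static analysis of Android applications. Because the features are based only on the permissions declared in AndroidManifest, the resources and time required for analysis are reduced. The malware detection model consists of SVM (support vector machine), NB (Naive Bayes), GBC (Gradient Boosting Classifier), and Logistic Regression models trained on 1,500 benign applications and 500 malware samples, and it achieved a detection rate of 98%. In addition, for malware family identification, a multi-classifier model was implemented using SVM, GPC (Gaussian Process Classifier), and GBC. The trained family identification model classified malware families with 92% accuracy.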
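The permission-based feature extraction described in this abstract can be illustrated with a minimal sketch. It assumes the manifest has already been decoded to plain-text XML (manifests inside an APK are stored in a binary format), and the sample manifest, the three-entry vocabulary, and the function names are hypothetical; the abstract does not specify the paper's actual feature set or tooling.

```python
import xml.etree.ElementTree as ET

# Namespace prefix for android:* attributes in a decoded AndroidManifest.xml.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"


def extract_permissions(manifest_xml: str) -> set[str]:
    """Return the permission names requested via <uses-permission> elements."""
    root = ET.fromstring(manifest_xml)
    return {
        elem.attrib[ANDROID_NS + "name"]
        for elem in root.iter("uses-permission")
        if ANDROID_NS + "name" in elem.attrib
    }


def to_feature_vector(permissions: set[str], vocabulary: list[str]) -> list[int]:
    """Encode a permission set as a 0/1 vector over a fixed vocabulary."""
    return [1 if name in permissions else 0 for name in vocabulary]


# Hypothetical manifest and vocabulary, for illustration only.
SAMPLE_MANIFEST = """<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.app">
  <uses-permission android:name="android.permission.INTERNET"/>
  <uses-permission android:name="android.permission.SEND_SMS"/>
</manifest>"""

VOCABULARY = [
    "android.permission.INTERNET",
    "android.permission.READ_CONTACTS",
    "android.permission.SEND_SMS",
]

features = to_feature_vector(extract_permissions(SAMPLE_MANIFEST), VOCABULARY)
print(features)
```

Vectors of this kind, one per application, could then be passed to any of the classifiers named in the abstract.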

Study on Fault Detection of a Gas Pressure Regulator Based on Machine Learning Algorithms

  • Seo, Chan-Yang;Suh, Young-Joo;Kim, Dong-Ju
    • 한국컴퓨터정보학회논문지 / Vol. 25, No. 4 / pp. 19-27 / 2020
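  • This paper proposes a machine learning method for diagnosing abnormal states of a gas pressure regulator. In general, implementing a machine learning model for detecting equipment anomalies involves installing relevant sensors and collecting data, but a pressure regulator is by nature highly sensitive to safety issues, which makes installing additional sensors very difficult. We therefore propose a machine learning model that identifies abnormal states of the regulator early using only the flow and pressure data that the regulator collects itself, without installing additional sensors. Because abnormal data from the regulator are scarce, over-sampling is applied during training so that the model learns all classes in a balanced manner. Classification models for determining the abnormal state of the regulator are implemented using machine learning algorithms including Gradient Boosting, 1D Convolutional Neural Networks, and LSTM (Long Short-Term Memory); experimental results show that the gradient boosting algorithm performs best, with an accuracy of 99.975%.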
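The over-sampling step mentioned in this abstract can be sketched as follows. The abstract does not say which over-sampling technique was used, so this sketch assumes plain random over-sampling with replacement; the function name and the (flow, pressure) readings are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate minority-class samples at random until every class
    has as many samples as the largest class."""
    rng = random.Random(seed)
    by_class = {}
    for x, y in zip(samples, labels):
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    out_samples, out_labels = [], []
    for y, xs in by_class.items():
        # Draw extra copies with replacement from the class's own samples.
        extra = [rng.choice(xs) for _ in range(target - len(xs))]
        for x in xs + extra:
            out_samples.append(x)
            out_labels.append(y)
    return out_samples, out_labels


# Hypothetical (flow, pressure) readings: 6 normal, 2 abnormal.
X = [(1.0, 2.0), (1.1, 2.1), (0.9, 1.9), (1.0, 2.2), (1.2, 2.0), (1.0, 1.8),
     (3.0, 0.5), (3.2, 0.4)]
y = ["normal"] * 6 + ["abnormal"] * 2

X_bal, y_bal = random_oversample(X, y)
print(Counter(y_bal))
```

Over-sampling is normally applied to the training split only, so that duplicated samples do not leak into the evaluation data.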

Automated Verification of Livestock Manure Transfer Management System Handover Document using Gradient Boosting

  • 황종휘;김화경;류재학;김태호;신용태
    • 한국IT서비스학회지 / Vol. 22, No. 4 / pp. 97-110 / 2023
  • In this study, we propose a technique to automatically generate transfer documents using sensor data from livestock manure transfer systems. The research involves analyzing sensor data and applying machine learning techniques to derive optimized outcomes for livestock manure transfer documents. By comparing and contrasting with existing documents, we present a method for automatic document generation. Specifically, we propose the utilization of Gradient Boosting, a machine learning algorithm. The objective of this research is to enhance the efficiency of livestock manure and liquid byproduct management. Currently, stakeholders including producers, transporters, and processors manually input data into the livestock manure transfer management system during the disposal of manure and liquid byproducts. This manual process consumes additional labor, leads to data inconsistency, and complicates the management of distribution and treatment. Therefore, the aim of this study is to leverage data to automatically generate transfer documents, thereby increasing the efficiency of livestock manure and liquid byproduct management. By utilizing sensor data from livestock manure and liquid byproduct transport vehicles and employing machine learning algorithms, we establish a system that automates the validation of transfer documents, reducing the burden on producers, transporters, and processors. This efficient management system is anticipated to create a transparent environment for the distribution and treatment of livestock manure and liquid byproducts.

Hybrid machine learning with moth-flame optimization methods for strength prediction of CFDST columns under compression

  • Quang-Viet Vu;Dai-Nhan Le;Thai-Hoan Pham;Wei Gao;Sawekchai Tangaramvong
    • Steel and Composite Structures / Vol. 51, No. 6 / pp. 679-695 / 2024
  • This paper presents a novel technique that combines machine learning (ML) with moth-flame optimization (MFO) methods to predict the axial compressive strength (ACS) of concrete filled double skin steel tube (CFDST) columns. The proposed model is trained and tested with a dataset containing 125 tests of CFDST columns subjected to compressive loading. Five ML models, including extreme gradient boosting (XGBoost), gradient tree boosting (GBT), categorical gradient boosting (CAT), support vector machines (SVM), and decision tree (DT) algorithms, are utilized in this work. The MFO algorithm is applied to find optimal hyperparameters of these ML models and to determine the most effective model in predicting the ACS of CFDST columns. Predictive results given by several performance metrics reveal that the MFO-CAT model provides superior accuracy compared to the other considered models. The accuracy of the MFO-CAT model is validated by comparing its predictive results with existing design codes and formulae. Moreover, the significance and contribution of each feature in the dataset are examined by employing the SHapley Additive exPlanations (SHAP) method. A comprehensive uncertainty quantification of the probabilistic characteristics of the ACS of CFDST columns is conducted for the first time to examine the models' responses to variations of input variables in stochastic environments. Finally, a web-based application is developed to predict the ACS of CFDST columns, enabling rapid practical use without requiring any programming or machine learning expertise.

Hybrid machine learning with HHO method for estimating ultimate shear strength of both rectangular and circular RC columns

  • Quang-Viet Vu;Van-Thanh Pham;Dai-Nhan Le;Zhengyi Kong;George Papazafeiropoulos;Viet-Ngoc Pham
    • Steel and Composite Structures / Vol. 52, No. 2 / pp. 145-163 / 2024
  • This paper presents six novel hybrid machine learning (ML) models that combine support vector machines (SVM), Decision Tree (DT), Random Forest (RF), Gradient Boosting (GB), extreme gradient boosting (XGB), and categorical gradient boosting (CGB) with the Harris Hawks Optimization (HHO) algorithm. These models, namely HHO-SVM, HHO-DT, HHO-RF, HHO-GB, HHO-XGB, and HHO-CGB, are designed to predict the ultimate shear strength of both rectangular and circular reinforced concrete (RC) columns. The prediction models are established using a comprehensive database consisting of 325 experimental data points for rectangular columns and 172 for circular columns. The ML model hyperparameters are optimized through a combination of cross-validation and HHO. The performance of the hybrid ML models is evaluated and compared using various metrics, ultimately identifying the HHO-CGB model as the top-performing model for predicting the ultimate shear strength of both rectangular and circular RC columns. The mean R-value and mean a20-index are relatively high, reaching 0.991 and 0.959, respectively, while the mean absolute error and root mean square error are low (10.302 kN and 27.954 kN, respectively). A further comparison with four existing formulas is conducted to validate the efficiency of the proposed HHO-CGB model. The SHapley Additive exPlanations (SHAP) method is applied to analyze the contribution of each variable to the output of the HHO-CGB model, providing insight into the local and global influence of the variables. The analysis reveals that the depth of the column, the length of the column, and the axial loading exert the most significant influence on the ultimate shear strength of RC columns. A user-friendly graphical interface tool is then developed based on HHO-CGB to facilitate practical and cost-effective usage.
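The error metrics quoted in this abstract (a20-index, mean absolute error, root mean square error) can be computed as in the sketch below. The a20-index is taken here as the fraction of samples whose predicted-to-actual ratio lies between 0.8 and 1.2; some papers define the ratio the other way round, so the direction is an assumption, as are the sample values.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def a20_index(actual, predicted):
    """Fraction of samples whose predicted/actual ratio lies in [0.8, 1.2].
    Assumes every actual value is non-zero."""
    hits = sum(1 for a, p in zip(actual, predicted) if 0.8 <= p / a <= 1.2)
    return hits / len(actual)


# Hypothetical shear strengths in kN, for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 150.0, 310.0, 390.0]

print(mae(actual, predicted), rmse(actual, predicted), a20_index(actual, predicted))
```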

Modeling with Thin Film Thickness using Machine Learning

  • Kim, Dong Hwan;Choi, Jeong Eun;Ha, Tae Min;Hong, Sang Jeen
    • 반도체디스플레이기술학회지 / Vol. 18, No. 2 / pp. 48-52 / 2019
  • Virtual metrology (VM), one of the APC techniques, is a method for predicting the characteristics of manufactured films using machine learning, saving time and resources. As the CD is reduced, photoresist can no longer serve as a mask material at high aspect ratios, so hard masks have been introduced to solve this problem. Among the many types of hard mask materials, the amorphous carbon layer (ACL) is widely investigated because of its higher etch selectivity than conventional photoresist, high optical transmittance, easy deposition process, and removability by oxygen plasma. In this study, VM using different machine learning algorithms is applied to predict the thickness of the ACL, and the trained models are evaluated to determine which shows the best prediction performance. ACL specimens are deposited by plasma enhanced chemical vapor deposition (PECVD) with four different process parameters (pressure, RF power, $C_3H_6$ gas flow, $N_2$ gas flow). The gradient boosting regression (GBR) algorithm, the random forest regression (RFR) algorithm, and a neural network (NN) are selected for modeling. The model using the gradient boosting algorithm shows the best performance, with a higher R-squared value than the other models. A model for predicting the thickness of the ACL film within the abovementioned conditions has been successfully constructed.

JAYA-GBRT model for predicting the shear strength of RC slender beams without stirrups

  • Tran, Viet-Linh;Kim, Jin-Kook
    • Steel and Composite Structures / Vol. 44, No. 5 / pp. 691-705 / 2022
  • Shear failure in reinforced concrete (RC) structures is very hazardous. This failure is rarely predicted and may occur without any prior signs. Accurate shear strength prediction of RC members is challenging, and traditional methods have difficulty solving it. This study develops a JAYA-GBRT model based on the JAYA algorithm and the gradient boosting regression tree (GBRT) to predict the shear strength of RC slender beams without stirrups. First, 484 tests are carefully collected and divided into training and test sets. Then, the hyperparameters of the GBRT model are determined using the JAYA algorithm and 10-fold cross-validation. The performance of the JAYA-GBRT model is compared with five well-known empirical models. The comparative results show that the JAYA-GBRT model (R² = 0.982, RMSE = 9.466 kN, MAE = 6.299 kN, µ = 1.018, and Cov = 0.116) outperforms the other models. Moreover, the predictions of the JAYA-GBRT model are globally and locally explained using the SHapley Additive exPlanations (SHAP) method. The SHAP analysis identifies the effective depth as the most crucial parameter influencing the shear strength. Finally, a Graphical User Interface (GUI) tool and a web application (WA) are developed to apply the JAYA-GBRT model for rapidly predicting the shear strength of RC slender beams without stirrups.
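The core idea of a gradient boosting regression tree can be shown with a deliberately small sketch: with squared-error loss, each round fits a weak learner to the current residuals and adds a shrunken copy of it to the ensemble. This is not the JAYA-GBRT model from the paper. It assumes a single input feature, depth-1 trees (stumps), fixed hyperparameters instead of JAYA tuning, and hypothetical data; all function names are illustrative.

```python
def fit_stump(xs, residuals):
    """Find the threshold split on a 1-D feature that minimizes squared error.
    Requires at least two distinct x values."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return threshold, left_mean, right_mean


def predict_stump(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gbrt(xs, ys, n_estimators=50, learning_rate=0.1):
    """Boost stumps on residuals (the negative gradient of squared error)."""
    base = sum(ys) / len(ys)  # initial prediction: the mean target
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_estimators):
        residuals = [y - p for y, p in zip(ys, preds)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        preds = [p + learning_rate * predict_stump(stump, x)
                 for p, x in zip(preds, xs)]
    return base, learning_rate, stumps


def predict_gbrt(model, x):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(predict_stump(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((y - predict_gbrt(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


# Hypothetical data with a step between x = 4 and x = 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [2.0, 2.1, 1.9, 2.2, 5.0, 5.1, 4.9, 5.2]

for n in (1, 10, 50):
    print(n, round(mse(fit_gbrt(xs, ys, n_estimators=n), xs, ys), 4))
```

The number of rounds and the learning rate used here are the kind of hyperparameters that the paper tunes with the JAYA algorithm and 10-fold cross-validation.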

Performance Analysis of Trading Strategy using Gradient Boosting Machine Learning and Genetic Algorithm

  • Jang, Phil-Sik
    • 한국컴퓨터정보학회논문지 / Vol. 27, No. 11 / pp. 147-155 / 2022
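  • In this study, we built a system that dynamically constructs a daily stock portfolio using gradient boosting machine learning and a genetic algorithm, and analyzed its performance through trading simulation. To this end, we collected a variety of data, including price data and per-investor-type trading information for stocks listed on the KOSPI and KOSDAQ markets, and generated the variables used for training and prediction through preprocessing and feature engineering. In the first experiment, the performance of gradient boosting methods (XGBoost, LightGBM, CatBoost) was compared using four metrics: prediction accuracy, precision, recall, and F1 score. In the second experiment, LightGBM, selected in the previous stage, and a genetic algorithm were applied to learn and predict whether listed stocks would yield a daily profit. Stocks were then selected based on the predicted probability of profit, a trading simulation was run, and performance was compared with the KOSPI and KOSDAQ indices in terms of CAGR, MDD, Sharpe ratio, and volatility. The analysis showed that all of the proposed strategies outperformed the market average on the four performance metrics, and that combining gradient boosting with a genetic algorithm can be used effectively for stock price prediction.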
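The four portfolio metrics named in this abstract (CAGR, MDD, Sharpe ratio, volatility) can be computed from a series of portfolio values as sketched below. The abstract does not give the paper's exact definitions, so the conventions here are assumptions: simple period returns, sample standard deviation, square-root-of-time annualization, and a default risk-free rate of zero. The value series is hypothetical.

```python
import math


def period_returns(values):
    """Simple returns between consecutive portfolio values."""
    return [b / a - 1 for a, b in zip(values, values[1:])]


def cagr(values, periods_per_year=252):
    """Compound annual growth rate over the whole series."""
    years = (len(values) - 1) / periods_per_year
    return (values[-1] / values[0]) ** (1 / years) - 1


def max_drawdown(values):
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, (peak - v) / peak)
    return worst


def annualized_volatility(values, periods_per_year=252):
    """Sample standard deviation of period returns, annualized."""
    returns = period_returns(values)
    mean = sum(returns) / len(returns)
    var = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return math.sqrt(var) * math.sqrt(periods_per_year)


def sharpe_ratio(values, risk_free_rate=0.0, periods_per_year=252):
    """Annualized mean excess return divided by its standard deviation.
    Assumes the returns are not all identical."""
    excess = [r - risk_free_rate / periods_per_year for r in period_returns(values)]
    mean = sum(excess) / len(excess)
    std = math.sqrt(sum((e - mean) ** 2 for e in excess) / (len(excess) - 1))
    return mean / std * math.sqrt(periods_per_year)


# Hypothetical quarterly portfolio values (periods_per_year=4).
values = [100.0, 120.0, 90.0, 110.0, 130.0]

print(cagr(values, 4), max_drawdown(values),
      annualized_volatility(values, 4), sharpe_ratio(values, 0.0, 4))
```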

A Design and Implement of Efficient Agricultural Product Price Prediction Model

  • Im, Jung-Ju;Kim, Tae-Wan;Lim, Ji-Seoup;Kim, Jun-Ho;Yoo, Tae-Yong;Lee, Won Joo
    • 한국컴퓨터정보학회논문지 / Vol. 27, No. 5 / pp. 29-36 / 2022
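  • This paper proposes an effective agricultural product price prediction model based on a dataset provided by DACON. The models are XGBoost and CatBoost, both Gradient Boosting algorithms, which offer better average accuracy and execution time than the conventional Logistic Regression and Random Forest. Building on these advantages, we design machine learning models that predict prices 1, 2, and 4 weeks ahead from the previous prices of agricultural products. The XGBoost model uses the XGBoost Regressor library, a regression-style modeling approach, and reaches its best performance through hyperparameter tuning. The CatBoost model is implemented using CatBoost Regressor. The implemented models are validated using the API provided by DACON, and the performance of each model is evaluated. Because XGBoost applies its own overfitting regularization, it performs well despite the small dataset, but its time performance, such as training and prediction time, was found to be lower than that of LGBM.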

Decision based uncertainty model to predict rockburst in underground engineering structures using gradient boosting algorithms

  • Kidega, Richard;Ondiaka, Mary Nelima;Maina, Duncan;Jonah, Kiptanui Arap Too;Kamran, Muhammad
    • Geomechanics and Engineering / Vol. 30, No. 3 / pp. 259-272 / 2022
  • Rockburst is a dynamic, multivariate, and non-linear phenomenon that occurs in underground mining and civil engineering structures. Predicting rockburst is challenging since conventional models are not standardized. Hence, machine learning techniques would improve the prediction accuracies. This study describes decision based uncertainty models to predict rockburst in underground engineering structures using gradient boosting algorithms (GBM). The model input variables were uniaxial compressive strength (UCS), uniaxial tensile strength (UTS), maximum tangential stress (MTS), excavation depth (D), stress ratio (SR), and brittleness coefficient (BC). Several models were trained using different combinations of the input variables and a 3-fold cross-validation resampling procedure. The hyperparameters comprising learning rate, number of boosting iterations, tree depth, and number of minimum observations were tuned to attain the optimum models. The performance of the models was tested using classification accuracy, Cohen's kappa coefficient (k), sensitivity and specificity. The best-performing model showed a classification accuracy, k, sensitivity and specificity values of 98%, 93%, 1.00 and 0.957 respectively by optimizing model ROC metrics. The most and least influential input variables were MTS and BC, respectively. The partial dependence plots revealed the relationship between the changes in the input variables and model predictions. The findings reveal that GBM can be used to anticipate rockburst and guide decisions about support requirements before mining development.
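The classification metrics listed in this abstract (accuracy, Cohen's kappa, sensitivity, specificity) can be derived from a confusion matrix as in the sketch below. It assumes the binary case for simplicity, which may not match the class structure used in the paper, and the counts are hypothetical.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, Cohen's kappa, sensitivity, and specificity
    from binary confusion-matrix counts."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fn) * (tp + fp) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - expected) / (1 - expected)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    return {
        "accuracy": accuracy,
        "kappa": kappa,
        "sensitivity": sensitivity,
        "specificity": specificity,
    }


# Hypothetical counts, for illustration only.
metrics = binary_metrics(tp=40, fn=10, fp=5, tn=45)
print(metrics)
```

Kappa corrects accuracy for chance agreement, which is why it can be noticeably lower than accuracy on the same predictions.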