• 제목/요약/키워드: Gradient Boosting Algorithm

검색결과 73건 처리시간 0.032초

대청호 Chl-a 예측을 위한 random forest와 gradient boosting 알고리즘 적용 연구 (A study on applying random forest and gradient boosting algorithm for Chl-a prediction of Daecheong lake)

  • 이상민;김일규
    • 상하수도학회지
    • /
    • 제35권6호
    • /
    • pp.507-516
    • /
    • 2021
  • In this study, the machine learning which has been widely used in prediction algorithms recently was used. the research point was the CD(chudong) point which was a representative point of Daecheong Lake. Chlorophyll-a(Chl-a) concentration was used as a target variable for algae prediction. to predict the Chl-a concentration, a data set of water quality and quantity factors was consisted. we performed algorithms about random forest and gradient boosting with Python. to perform the algorithms, at first the correlation analysis between Chl-a and water quality and quantity data was studied. we extracted ten factors of high importance for water quality and quantity data. as a result of the algorithm performance index, the gradient boosting showed that RMSE was 2.72 mg/m3 and MSE was 7.40 mg/m3 and R2 was 0.66. as a result of the residual analysis, the analysis result of gradient boosting was excellent. as a result of the algorithm execution, the gradient boosting algorithm was excellent. the gradient boosting algorithm was also excellent with 2.44 mg/m3 of RMSE in the machine learning hyperparameter adjustment result.

A robust approach in prediction of RCFST columns using machine learning algorithm

  • Van-Thanh Pham;Seung-Eock Kim
    • Steel and Composite Structures
    • /
    • 제46권2호
    • /
    • pp.153-173
    • /
    • 2023
  • Rectangular concrete-filled steel tubular (RCFST) column, a type of concrete-filled steel tubular (CFST), is widely used in compression members of structures because of its advantages. This paper proposes a robust machine learning-based framework for predicting the ultimate compressive strength of RCFST columns under both concentric and eccentric loading. The gradient boosting neural network (GBNN), an efficient and up-to-date ML algorithm, is utilized for developing a predictive model in the proposed framework. A total of 890 experimental data of RCFST columns, which is categorized into two datasets of concentric and eccentric compression, is carefully collected to serve as training and testing purposes. The accuracy of the proposed model is demonstrated by comparing its performance with seven state-of-the-art machine learning methods including decision tree (DT), random forest (RF), support vector machines (SVM), deep learning (DL), adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), and categorical gradient boosting (CatBoost). Four available design codes, including the European (EC4), American concrete institute (ACI), American institute of steel construction (AISC), and Australian/New Zealand (AS/NZS) are refereed in another comparison. The results demonstrate that the proposed GBNN method is a robust and powerful approach to obtain the ultimate strength of RCFST columns.

GBGNN: Gradient Boosted Graph Neural Networks

  • Eunjo Jang;Ki Yong Lee
    • Journal of Information Processing Systems
    • /
    • 제20권4호
    • /
    • pp.501-513
    • /
    • 2024
  • In recent years, graph neural networks (GNNs) have been extensively used to analyze graph data across various domains because of their powerful capabilities in learning complex graph-structured data. However, recent research has focused on improving the performance of a single GNN with only two or three layers. This is because stacking layers deeply causes the over-smoothing problem of GNNs, which degrades the performance of GNNs significantly. On the other hand, ensemble methods combine individual weak models to obtain better generalization performance. Among them, gradient boosting is a powerful supervised learning algorithm that adds new weak models in the direction of reducing the errors of the previously created weak models. After repeating this process, gradient boosting combines the weak models to produce a strong model with better performance. Until now, most studies on GNNs have focused on improving the performance of a single GNN. In contrast, improving the performance of GNNs using multiple GNNs has not been studied much yet. In this paper, we propose gradient boosted graph neural networks (GBGNN) that combine multiple shallow GNNs with gradient boosting. We use shallow GNNs as weak models and create new weak models using the proposed gradient boosting-based loss function. Our empirical evaluations on three real-world datasets demonstrate that GBGNN performs much better than a single GNN. Specifically, in our experiments using graph convolutional network (GCN) and graph attention network (GAT) as weak models on the Cora dataset, GBGNN achieves performance improvements of 12.3%p and 6.1%p in node classification accuracy compared to a single GCN and a single GAT, respectively.

AN OPTIMAL BOOSTING ALGORITHM BASED ON NONLINEAR CONJUGATE GRADIENT METHOD

  • CHOI, JOOYEON;JEONG, BORA;PARK, YESOM;SEO, JIWON;MIN, CHOHONG
    • Journal of the Korean Society for Industrial and Applied Mathematics
    • /
    • 제22권1호
    • /
    • pp.1-13
    • /
    • 2018
  • Boosting, one of the most successful algorithms for supervised learning, searches the most accurate weighted sum of weak classifiers. The search corresponds to a convex programming with non-negativity and affine constraint. In this article, we propose a novel Conjugate Gradient algorithm with the Modified Polak-Ribiera-Polyak conjugate direction. The convergence of the algorithm is proved and we report its successful applications to boosting.

Using Machine Learning Technique for Analytical Customer Loyalty

  • Mohamed M. Abbassy
    • International Journal of Computer Science & Network Security
    • /
    • 제23권8호
    • /
    • pp.190-198
    • /
    • 2023
  • To enhance customer satisfaction for higher profits, an e-commerce sector can establish a continuous relationship and acquire new customers. Utilize machine-learning models to analyse their customer's behavioural evidence to produce their competitive advantage to the e-commerce platform by helping to improve overall satisfaction. These models will forecast customers who will churn and churn causes. Forecasts are used to build unique business strategies and services offers. This work is intended to develop a machine-learning model that can accurately forecast retainable customers of the entire e-commerce customer data. Developing predictive models classifying different imbalanced data effectively is a major challenge in collected data and machine learning algorithms. Build a machine learning model for solving class imbalance and forecast customers. The satisfaction accuracy is used for this research as evaluation metrics. This paper aims to enable to evaluate the use of different machine learning models utilized to forecast satisfaction. For this research paper are selected three analytical methods come from various classifications of learning. Classifier Selection, the efficiency of various classifiers like Random Forest, Logistic Regression, SVM, and Gradient Boosting Algorithm. Models have been used for a dataset of 8000 records of e-commerce websites and apps. Results indicate the best accuracy in determining satisfaction class with both gradient-boosting algorithm classifications. The results showed maximum accuracy compared to other algorithms, including Gradient Boosting Algorithm, Support Vector Machine Algorithm, Random Forest Algorithm, and logistic regression Algorithm. The best model developed for this paper to forecast satisfaction customers and accuracy achieve 88 %.

Investigating the performance of different decomposition methods in rainfall prediction from LightGBM algorithm

  • Narimani, Roya;Jun, Changhyun;Nezhad, Somayeh Moghimi;Parisouj, Peiman
    • 한국수자원학회:학술대회논문집
    • /
    • 한국수자원학회 2022년도 학술발표회
    • /
    • pp.150-150
    • /
    • 2022
  • This study investigates the roles of decomposition methods on high accuracy in daily rainfall prediction from light gradient boosting machine (LightGBM) algorithm. Here, empirical mode decomposition (EMD) and singular spectrum analysis (SSA) methods were considered to decompose and reconstruct input time series into trend terms, fluctuating terms, and noise components. The decomposed time series from EMD and SSA methods were used as input data for LightGBM algorithm in two hybrid models, including empirical mode-based light gradient boosting machine (EMDGBM) and singular spectrum analysis-based light gradient boosting machine (SSAGBM), respectively. A total of four parameters (i.e., temperature, humidity, wind speed, and rainfall) at a daily scale from 2003 to 2017 is used as input data for daily rainfall prediction. As results from statistical performance indicators, it indicates that the SSAGBM model shows a better performance than the EMDGBM model and the original LightGBM algorithm with no decomposition methods. It represents that the accuracy of LightGBM algorithm in rainfall prediction was improved with the SSA method when using multivariate dataset.

  • PDF

Dynamic Caching Routing Strategy for LEO Satellite Nodes Based on Gradient Boosting Regression Tree

  • Yang Yang;Shengbo Hu;Guiju Lu
    • Journal of Information Processing Systems
    • /
    • 제20권1호
    • /
    • pp.131-147
    • /
    • 2024
  • A routing strategy based on traffic prediction and dynamic cache allocation for satellite nodes is proposed to address the issues of high propagation delay and overall delay of inter-satellite and satellite-to-ground links in low Earth orbit (LEO) satellite systems. The spatial and temporal correlations of satellite network traffic were analyzed, and the relevant traffic through the target satellite was extracted as raw input for traffic prediction. An improved gradient boosting regression tree algorithm was used for traffic prediction. Based on the traffic prediction results, a dynamic cache allocation routing strategy is proposed. The satellite nodes periodically monitor the traffic load on inter-satellite links (ISLs) and dynamically allocate cache resources for each ISL with neighboring nodes. Simulation results demonstrate that the proposed routing strategy effectively reduces packet loss rate and average end-to-end delay and improves the distribution of services across the entire network.

머신러닝 기법을 활용한 낙동강 중류 지역의 Chl-a 예측 알고리즘 비교 연구(수질인자 및 수량 중심으로) (Comparison of machine learning algorithms for Chl-a prediction in the middle of Nakdong River (focusing on water quality and quantity factors))

  • 이상민;박경덕;김일규
    • 상하수도학회지
    • /
    • 제34권4호
    • /
    • pp.277-288
    • /
    • 2020
  • In this study, we performed algorithms to predict algae of Chlorophyll-a (Chl-a). Water quality and quantity data of the middle Nakdong River area were used. At first, the correlation analysis between Chl-a and water quality and quantity data was studied. We extracted ten factors of high importance for water quality and quantity data about the two weirs. Algorithms predicted how ten factors affected Chl-a occurrence. We performed algorithms about decision tree, random forest, elastic net, gradient boosting with Python. The root mean square error (RMSE) value was used to evaluate excellent algorithms. The gradient boosting showed 10.55 of RMSE value for the Gangjeonggoryeong (GG) site and 11.43 of RMSE value for the Dalsung (DS) site. The gradient boosting algorithm showed excellent results for GG and DS sites. Prediction value for the four algorithms was also evaluated through the Receiver operating characteristic (ROC) curve and Area under curve (AUC). As a result of the evaluation, the AUC value was 0.877 at GG site and the AUC value was 0.951 at DS site. So the algorithm's ability to interpret seemed to be excellent.

Gradient Boosting 기법을 활용한 다크넷 트래픽 탐지 및 분류 (Darknet Traffic Detection and Classification Using Gradient Boosting Techniques)

  • 김지혜;이수진
    • 정보보호학회논문지
    • /
    • 제32권2호
    • /
    • pp.371-379
    • /
    • 2022
  • 다크넷(Darknet)은 익명성과 보안을 바탕으로 하고 있어 각종 범죄 및 불법 활동에 지속적으로 악용되고 있으며, 이러한 오·남용을 막기 위해 다크넷 트래픽을 정확하게 탐지하고 분류하는 연구는 매우 중요하다. 본 논문에서는 그레디언트 부스팅 기법을 활용한 다크넷 트래픽 탐지 및 분류 기법을 제안하였다. CIC-Darknet2020 데이터셋에 XGBoost와 LightGBM 알고리즘을 적용한 결과, 99.99%의 탐지율과 99% 이상의 분류 성능을 나타내어 기존 연구에 비해 3% 이상 높은 탐지 성능과 13% 이상의 높은 분류 성능을 달성할 수 있었다. 특히, LightGBM 알고리즘의 경우, XGBoost보다 약 1.6배의 학습 시간과 10배의 하이퍼 파라미터 튜닝 실행시간을 단축하여 월등히 우수한 성능으로 다크넷 트래픽 탐지 및 분류를 수행하였다.

Machine learning-based prediction of wind forces on CAARC standard tall buildings

  • Yi Li;Jie-Ting Yin;Fu-Bin Chen;Qiu-Sheng Li
    • Wind and Structures
    • /
    • 제36권6호
    • /
    • pp.355-366
    • /
    • 2023
  • Although machine learning (ML) techniques have been widely used in various fields of engineering practice, their applications in the field of wind engineering are still at the initial stage. In order to evaluate the feasibility of machine learning algorithms for prediction of wind loads on high-rise buildings, this study took the exposure category type, wind direction and the height of local wind force as the input features and adopted four different machine learning algorithms including k-nearest neighbor (KNN), support vector machine (SVM), gradient boosting regression tree (GBRT) and extreme gradient (XG) boosting to predict wind force coefficients of CAARC standard tall building model. All the hyper-parameters of four ML algorithms are optimized by tree-structured Parzen estimator (TPE). The result shows that mean drag force coefficients and RMS lift force coefficients can be well predicted by the GBRT algorithm model while the RMS drag force coefficients can be forecasted preferably by the XG boosting algorithm model. The proposed machine learning based algorithms for wind loads prediction can be an alternative of traditional wind tunnel tests and computational fluid dynamic simulations.