• Title/Summary/Keyword: Gradient Boosting Algorithm

A study on applying random forest and gradient boosting algorithm for Chl-a prediction of Daecheong lake (대청호 Chl-a 예측을 위한 random forest와 gradient boosting 알고리즘 적용 연구)

  • Lee, Sang-Min;Kim, Il-Kyu
    • Journal of Korean Society of Water and Wastewater / v.35 no.6 / pp.507-516 / 2021
  • This study applied machine learning, which has recently been widely used for prediction. The study site was the Chudong (CD) point, a representative point of Daecheong Lake. Chlorophyll-a (Chl-a) concentration was used as the target variable for algae prediction. To predict the Chl-a concentration, a data set of water quality and quantity factors was compiled. Random forest and gradient boosting algorithms were run in Python. First, the correlation between Chl-a and the water quality and quantity data was analyzed, and the ten most important water quality and quantity factors were extracted. On the performance indices, gradient boosting showed an RMSE of 2.72 mg/m3, an MSE of 7.40 (mg/m3)^2, and an R2 of 0.66. Gradient boosting also gave excellent results in the residual analysis and in the overall algorithm comparison. After hyperparameter tuning, gradient boosting again performed well, with an RMSE of 2.44 mg/m3.
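
The error metrics quoted in this abstract are related: RMSE is the square root of MSE, which is why an RMSE of 2.72 corresponds to an MSE of about 7.40. The following is a minimal Python sketch of the three metrics. The function names and the sample values are illustrative assumptions and are not taken from the paper.

```python
import math

def mse(y_true, y_pred):
    """Mean squared error: average of the squared residuals."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: square root of MSE, in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical Chl-a values in mg/m3, for illustration only.
observed = [3.0, 5.0, 8.0, 12.0]
predicted = [4.0, 5.0, 7.0, 10.0]
print(mse(observed, predicted), rmse(observed, predicted), r2(observed, predicted))
```

Because MSE is an average of squared residuals, its unit is the square of the target's unit, while RMSE shares the target's unit.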

A robust approach in prediction of RCFST columns using machine learning algorithm

  • Van-Thanh Pham;Seung-Eock Kim
    • Steel and Composite Structures / v.46 no.2 / pp.153-173 / 2023
  • The rectangular concrete-filled steel tubular (RCFST) column, a type of concrete-filled steel tubular (CFST) column, is widely used for compression members of structures because of its advantages. This paper proposes a robust machine learning-based framework for predicting the ultimate compressive strength of RCFST columns under both concentric and eccentric loading. The gradient boosting neural network (GBNN), an efficient and up-to-date ML algorithm, is used to develop the predictive model in the proposed framework. A total of 890 experimental data points on RCFST columns, categorized into two datasets for concentric and eccentric compression, were carefully collected for training and testing. The accuracy of the proposed model is demonstrated by comparing its performance with seven state-of-the-art machine learning methods: decision tree (DT), random forest (RF), support vector machines (SVM), deep learning (DL), adaptive boosting (AdaBoost), extreme gradient boosting (XGBoost), and categorical gradient boosting (CatBoost). Four available design codes, namely the European (EC4), American Concrete Institute (ACI), American Institute of Steel Construction (AISC), and Australian/New Zealand (AS/NZS) codes, are used in a further comparison. The results demonstrate that the proposed GBNN method is a robust and powerful approach for obtaining the ultimate strength of RCFST columns.

GBGNN: Gradient Boosted Graph Neural Networks

  • Eunjo Jang;Ki Yong Lee
    • Journal of Information Processing Systems / v.20 no.4 / pp.501-513 / 2024
  • In recent years, graph neural networks (GNNs) have been extensively used to analyze graph data across various domains because of their powerful capabilities in learning complex graph-structured data. However, recent research has focused on improving the performance of a single GNN with only two or three layers. This is because stacking layers deeply causes the over-smoothing problem of GNNs, which degrades the performance of GNNs significantly. On the other hand, ensemble methods combine individual weak models to obtain better generalization performance. Among them, gradient boosting is a powerful supervised learning algorithm that adds new weak models in the direction of reducing the errors of the previously created weak models. After repeating this process, gradient boosting combines the weak models to produce a strong model with better performance. Until now, most studies on GNNs have focused on improving the performance of a single GNN. In contrast, improving the performance of GNNs using multiple GNNs has not been studied much yet. In this paper, we propose gradient boosted graph neural networks (GBGNN) that combine multiple shallow GNNs with gradient boosting. We use shallow GNNs as weak models and create new weak models using the proposed gradient boosting-based loss function. Our empirical evaluations on three real-world datasets demonstrate that GBGNN performs much better than a single GNN. Specifically, in our experiments using graph convolutional network (GCN) and graph attention network (GAT) as weak models on the Cora dataset, GBGNN achieves performance improvements of 12.3%p and 6.1%p in node classification accuracy compared to a single GCN and a single GAT, respectively.
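
The boosting loop described in this abstract (add a new weak model that reduces the errors of the models built so far, then combine them) can be sketched in a few lines of Python. This is plain gradient boosting for squared-error regression on a single feature, with one-split regression stumps standing in for the weak models. It is not the GBGNN loss function or the authors' code, and the names `fit_stump` and `gradient_boost` are hypothetical.

```python
def fit_stump(x, residuals):
    """Fit a one-split regression stump (a weak model) to the residuals."""
    mean_all = sum(residuals) / len(residuals)
    best = (float("inf"), None, mean_all, mean_all)
    values = sorted(set(x))
    for lo, hi in zip(values, values[1:]):
        threshold = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    if threshold is None:  # all x values equal: fall back to a constant
        return lambda xi: mean_all
    return lambda xi: left_mean if xi <= threshold else right_mean

def gradient_boost(x, y, n_rounds=50, learning_rate=0.1):
    """Return a predictor built by boosting stumps on squared-error loss."""
    base = sum(y) / len(y)  # initial constant model
    stumps = []
    preds = [base] * len(y)
    for _ in range(n_rounds):
        # For squared error, the negative gradient is the residual y - prediction.
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        preds = [pi + learning_rate * stump(xi) for xi, pi in zip(x, preds)]

    def predict(xi):
        return base + learning_rate * sum(s(xi) for s in stumps)
    return predict

# Toy data (an assumption for illustration): y = x squared.
x = list(range(10))
y = [v * v for v in x]
model = gradient_boost(x, y, n_rounds=100)
print([round(model(v), 1) for v in x])
```

The learning rate shrinks each weak model's contribution, so many small corrections accumulate into the final strong model.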

AN OPTIMAL BOOSTING ALGORITHM BASED ON NONLINEAR CONJUGATE GRADIENT METHOD

  • CHOI, JOOYEON;JEONG, BORA;PARK, YESOM;SEO, JIWON;MIN, CHOHONG
    • Journal of the Korean Society for Industrial and Applied Mathematics / v.22 no.1 / pp.1-13 / 2018
  • Boosting, one of the most successful algorithms for supervised learning, searches for the most accurate weighted sum of weak classifiers. The search corresponds to a convex program with non-negativity and affine constraints. In this article, we propose a novel conjugate gradient algorithm with the modified Polak-Ribière-Polyak conjugate direction. The convergence of the algorithm is proved, and we report its successful application to boosting.

Using Machine Learning Technique for Analytical Customer Loyalty

  • Mohamed M. Abbassy
    • International Journal of Computer Science & Network Security / v.23 no.8 / pp.190-198 / 2023
  • To increase customer satisfaction and profits, an e-commerce business can build lasting relationships with existing customers and acquire new ones. Machine-learning models can analyse customers' behavioural data and give the e-commerce platform a competitive advantage by helping to improve overall satisfaction. These models forecast which customers will churn and the causes of churn, and the forecasts are used to build tailored business strategies and service offers. This work aims to develop a machine-learning model that can accurately forecast retainable customers from the entire e-commerce customer data. Building predictive models that classify imbalanced data effectively is a major challenge, so the model is built to handle class imbalance while forecasting customers. Satisfaction accuracy is used as the evaluation metric. The paper evaluates different machine learning models for forecasting satisfaction. Three analytical methods from different categories of learning were selected, and the efficiency of classifiers such as Random Forest, Logistic Regression, SVM, and the Gradient Boosting Algorithm was compared. The models were applied to a dataset of 8000 records from e-commerce websites and apps. The results indicate that the gradient boosting algorithm gave the best accuracy in determining the satisfaction class, compared with the Support Vector Machine, Random Forest, and logistic regression algorithms. The best model developed in this paper for forecasting customer satisfaction achieves an accuracy of 88%.

Investigating the performance of different decomposition methods in rainfall prediction from LightGBM algorithm

  • Narimani, Roya;Jun, Changhyun;Nezhad, Somayeh Moghimi;Parisouj, Peiman
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.150-150 / 2022
  • This study investigates how decomposition methods affect the accuracy of daily rainfall prediction with the light gradient boosting machine (LightGBM) algorithm. Empirical mode decomposition (EMD) and singular spectrum analysis (SSA) were used to decompose and reconstruct the input time series into trend terms, fluctuating terms, and noise components. The time series decomposed by EMD and SSA were used as input data for the LightGBM algorithm in two hybrid models: the empirical mode-based light gradient boosting machine (EMDGBM) and the singular spectrum analysis-based light gradient boosting machine (SSAGBM), respectively. Four daily-scale parameters (temperature, humidity, wind speed, and rainfall) from 2003 to 2017 were used as input data for daily rainfall prediction. The statistical performance indicators show that the SSAGBM model performs better than both the EMDGBM model and the original LightGBM algorithm without decomposition. This indicates that the SSA method improved the accuracy of the LightGBM algorithm in rainfall prediction when a multivariate dataset was used.
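
The decomposition step described in this abstract can be illustrated with a basic singular spectrum analysis in NumPy. The sketch below splits a series into additive components that could then be fed to a regressor as input features. It is a textbook form of SSA under assumed settings (the `ssa_decompose` name, the window length of 12, and the synthetic series are all hypothetical), not the authors' implementation, and the LightGBM step is omitted.

```python
import numpy as np

def ssa_decompose(series, window):
    """Split a 1-D series into additive components via basic SSA."""
    series = np.asarray(series, dtype=float)
    n = series.size
    k = n - window + 1
    # Trajectory (Hankel) matrix: column j holds series[j : j + window].
    traj = np.column_stack([series[j:j + window] for j in range(k)])
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    components = []
    for i in range(s.size):
        elem = s[i] * np.outer(u[:, i], vt[i])  # rank-one piece of traj
        # Diagonal averaging: average each anti-diagonal to get back a series.
        flipped = elem[::-1]
        comp = [flipped.diagonal(d).mean() for d in range(-window + 1, k)]
        components.append(comp)
    return np.array(components)

# Synthetic series (an assumption): linear trend plus a 12-step cycle.
t = np.arange(60)
series = 0.05 * t + np.sin(2 * np.pi * t / 12)
components = ssa_decompose(series, window=12)
print(components.shape)
```

The components sum back to the original series, so grouping the leading ones as trend and fluctuation terms and the rest as noise is a matter of choosing which rows to add together.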

Dynamic Caching Routing Strategy for LEO Satellite Nodes Based on Gradient Boosting Regression Tree

  • Yang Yang;Shengbo Hu;Guiju Lu
    • Journal of Information Processing Systems / v.20 no.1 / pp.131-147 / 2024
  • A routing strategy based on traffic prediction and dynamic cache allocation for satellite nodes is proposed to address the issues of high propagation delay and overall delay of inter-satellite and satellite-to-ground links in low Earth orbit (LEO) satellite systems. The spatial and temporal correlations of satellite network traffic were analyzed, and the relevant traffic through the target satellite was extracted as raw input for traffic prediction. An improved gradient boosting regression tree algorithm was used for traffic prediction. Based on the traffic prediction results, a dynamic cache allocation routing strategy is proposed. The satellite nodes periodically monitor the traffic load on inter-satellite links (ISLs) and dynamically allocate cache resources for each ISL with neighboring nodes. Simulation results demonstrate that the proposed routing strategy effectively reduces packet loss rate and average end-to-end delay and improves the distribution of services across the entire network.

Comparison of machine learning algorithms for Chl-a prediction in the middle of Nakdong River (focusing on water quality and quantity factors) (머신러닝 기법을 활용한 낙동강 중류 지역의 Chl-a 예측 알고리즘 비교 연구(수질인자 및 수량 중심으로))

  • Lee, Sang-Min;Park, Kyeong-Deok;Kim, Il-Kyu
    • Journal of Korean Society of Water and Wastewater / v.34 no.4 / pp.277-288 / 2020
  • In this study, we applied algorithms to predict algae in terms of chlorophyll-a (Chl-a). Water quality and quantity data from the middle Nakdong River area were used. First, the correlation between Chl-a and the water quality and quantity data was analyzed. We extracted the ten most important water quality and quantity factors for the two weirs, and the algorithms predicted how these ten factors affected Chl-a occurrence. Decision tree, random forest, elastic net, and gradient boosting algorithms were run in Python. The root mean square error (RMSE) was used to evaluate the algorithms. Gradient boosting showed an RMSE of 10.55 at the Gangjeonggoryeong (GG) site and 11.43 at the Dalsung (DS) site, an excellent result for both sites. The predictions of the four algorithms were also evaluated using the receiver operating characteristic (ROC) curve and the area under the curve (AUC). The AUC was 0.877 at the GG site and 0.951 at the DS site, so the algorithm's interpretive ability appeared to be excellent.
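
The AUC values quoted in this abstract can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The Python sketch below computes AUC directly from that definition. The abstract does not say how the Chl-a predictions were turned into binary classes, so the labels and scores here are purely hypothetical, as is the function name `auc`.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    # A positive ranked above a negative counts 1, a tie counts 0.5.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels (1 = event) and model scores, for illustration only.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, scores))  # prints 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation of the two classes.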

Darknet Traffic Detection and Classification Using Gradient Boosting Techniques (Gradient Boosting 기법을 활용한 다크넷 트래픽 탐지 및 분류)

  • Kim, Jihye;Lee, Soo Jin
    • Journal of the Korea Institute of Information Security & Cryptology / v.32 no.2 / pp.371-379 / 2022
  • The darknet is built on anonymity and security, which leads it to be continuously abused for various crimes and illegal activities. It is therefore very important to detect and classify darknet traffic to prevent misuse and abuse of the darknet. This work proposes a novel approach that uses gradient boosting techniques for darknet traffic detection and classification. The XGBoost and LightGBM algorithms achieve a detection accuracy of 99.99% and a classification accuracy of over 99%, which is more than 3% higher in detection accuracy and over 13% higher in classification accuracy than previous research. In particular, the LightGBM algorithm outperforms XGBoost in detecting and classifying darknet traffic, reducing the learning time by a factor of about 1.6 and the hyperparameter tuning time by a factor of more than 10.

Machine learning-based prediction of wind forces on CAARC standard tall buildings

  • Yi Li;Jie-Ting Yin;Fu-Bin Chen;Qiu-Sheng Li
    • Wind and Structures / v.36 no.6 / pp.355-366 / 2023
  • Although machine learning (ML) techniques have been widely used in various fields of engineering practice, their application in wind engineering is still at an early stage. To evaluate the feasibility of machine learning algorithms for predicting wind loads on high-rise buildings, this study took the exposure category type, the wind direction, and the height of the local wind force as input features and adopted four machine learning algorithms, namely k-nearest neighbor (KNN), support vector machine (SVM), gradient boosting regression tree (GBRT), and extreme gradient (XG) boosting, to predict the wind force coefficients of the CAARC standard tall building model. All hyper-parameters of the four ML algorithms were optimized with the tree-structured Parzen estimator (TPE). The results show that the mean drag force coefficients and the RMS lift force coefficients are well predicted by the GBRT model, while the RMS drag force coefficients are better predicted by the XG boosting model. The proposed machine learning-based algorithms for wind load prediction can be an alternative to traditional wind tunnel tests and computational fluid dynamics simulations.