• Title/Summary/Keyword: gradient boosting algorithm

Vehicle Detection Scheme Based on a Boosting Classifier with Histogram of Oriented Gradient (HOG) Features and Image Segmentation (HOG 특징 및 영상분할을 이용한 부스팅분류 기반 자동차 검출 기법)

  • Choi, Mi-Soon;Lee, Jeong-Hwan;Roh, Tae-Moon;Shim, Jae-Chang
    • Journal of KIISE: Computing Practices and Letters / v.16 no.10 / pp.955-961 / 2010
  • In this paper, we describe a vehicle detection method based on a boosting classifier that uses Histogram of Oriented Gradient (HOG) features and image segmentation. An input image is segmented by means of a split-and-merge algorithm. Then, the two largest segmented regions are removed in order to reduce the search region and speed up processing. The HOG features are then calculated for each pixel in the search region. To detect the vehicle region, we used the AdaBoost (adaptive boosting) method, which is well known for classifying samples into two classes. To evaluate the performance of the proposed method, 537 training images were used to train the classifier, followed by 500 non-training images to measure the recognition rate. In these experiments, detection was correct for 98.34% of the 500 non-training images. In conclusion, the proposed method can be used for detecting the location of a vehicle in an intelligent vehicle control system.
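The two-class AdaBoost step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the HOG features and image segmentation are omitted, the inputs are hypothetical 1-D values, and the weak learners are simple threshold stumps chosen for illustration.

```python
import math

def stump(x, threshold, polarity):
    """Weak learner: returns `polarity` if x > threshold, otherwise -polarity."""
    return polarity if x > threshold else -polarity

def train_adaboost(xs, ys, rounds=3):
    """Discrete AdaBoost on 1-D inputs with threshold stumps; labels are +1/-1."""
    n = len(xs)
    weights = [1.0 / n] * n
    sorted_x = sorted(set(xs))
    # Candidate thresholds: one below all points, plus midpoints between neighbours.
    thresholds = [sorted_x[0] - 1.0] + [(a + b) / 2 for a, b in zip(sorted_x, sorted_x[1:])]
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        best = None
        for t in thresholds:
            for polarity in (1, -1):
                # Weighted error of this stump under the current sample weights.
                err = sum(w for x, y, w in zip(xs, ys, weights) if stump(x, t, polarity) != y)
                if best is None or err < best[0]:
                    best = (err, t, polarity)
        err, t, polarity = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid division by zero and log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, t, polarity))
        # Increase the weight of misclassified samples, decrease the rest, then normalise.
        weights = [w * math.exp(-alpha * y * stump(x, t, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * stump(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data that no single stump can separate (positive, negative, positive).
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
model = train_adaboost(xs, ys, rounds=3)
print([predict(model, x) for x in xs])
```

The toy labels are deliberately not separable by one threshold, so the example shows how reweighting lets later stumps concentrate on the samples that earlier ones misclassified.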

Machine Learning-Based Atmospheric Correction Based on Radiative Transfer Modeling Using Sentinel-2 MSI Data and Its Validation Focusing on Forest (농림위성을 위한 기계학습을 활용한 복사전달모델기반 대기보정 모사 알고리즘 개발 및 검증: 식생 지역을 위주로)

  • Yoojin Kang;Yejin Kim;Jungho Im;Joongbin Lim
    • Korean Journal of Remote Sensing / v.39 no.5_3 / pp.891-907 / 2023
  • Compact Advanced Satellite 500-4 (CAS500-4) is scheduled to be launched to collect high spatial resolution data focusing on vegetation applications. To achieve this goal, accurate surface reflectance retrieval through atmospheric correction is crucial. Therefore, a machine learning-based atmospheric correction algorithm was developed to simulate atmospheric correction from a radiative transfer model using Sentinel-2 data, which have spectral characteristics similar to those of CAS500-4. The algorithm was then evaluated mainly for forest areas. Utilizing the atmospheric correction parameters extracted from Sentinel-2 and GEOKOMPSAT-2A (GK-2A), the atmospheric correction algorithm was developed based on Random Forest and Light Gradient Boosting Machine (LGBM). Of the two machine learning techniques, LGBM performed better when considering both accuracy and efficiency. Except for one station, the results had a correlation coefficient of more than 0.91 and reflected the temporal variations of the Normalized Difference Vegetation Index (i.e., vegetation phenology) well. GK-2A provides Aerosol Optical Depth (AOD) and water vapor, which are essential parameters for atmospheric correction, but additional processing will be required in the future to mitigate the problem caused by their many missing values. This study provided the basis for the atmospheric correction of CAS500-4 by developing a machine learning-based atmospheric correction simulation algorithm.

Estimating pile setup parameter using XGBoost-based optimized models

  • Xigang Du;Ximeng Ma;Chenxi Dong;Mehrdad Sattari Nikkhoo
    • Geomechanics and Engineering / v.36 no.3 / pp.259-276 / 2024
  • The undrained shear strength is widely acknowledged as a fundamental mechanical property of soil and is considered a critical engineering parameter. In recent years, researchers have employed various methodologies to evaluate the shear strength of soil under undrained conditions. These methods encompass both numerical analyses and empirical techniques, such as the cone penetration test (CPT), to gain insights into the properties and behavior of soil. However, several of these methods rely on correlation assumptions, which can lead to inconsistent accuracy and precision. This study developed methods based on extreme gradient boosting (XGB) to predict the pile set-up component "A" from two distinct data sets. The first data set includes average modified cone point bearing capacity (qt), average wall friction (fs), and effective vertical stress (σvo), while the second data set comprises plasticity index (PI), soil undrained shear cohesion (Su), and the over consolidation ratio (OCR). To optimize the internal hyperparameters of the XGBoost model, four optimization algorithms were employed: Particle Swarm Optimization (PSO), Social Spider Optimization (SSO), Arithmetic Optimization Algorithm (AOA), and Sine Cosine Optimization Algorithm (SCOA). The results from the first data set indicate that the XGBoost model optimized using the Arithmetic Optimization Algorithm (XGB-AOA) achieved the highest accuracy, with R² values of 0.9962 for the training part and 0.9807 for the testing part. The performance of the developed models was further evaluated using the RMSE, MAE, and VAF indices. The XGB-AOA model outperformed the other models, with RMSE, MAE, and VAF values of 0.0078, 0.0015, and 99.6189 for the training part and 0.0141, 0.0112, and 98.0394 for the testing part, respectively. These findings suggest that XGB-AOA is the most accurate model for predicting the pile set-up component.
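The evaluation indices named in the abstract above (R², RMSE, MAE, and VAF) can be computed as in the sketch below. The abstract does not define them, so the formulas are assumptions based on their common definitions; in particular VAF is taken as (1 - var(y - ŷ) / var(y)) × 100, and the sample values are hypothetical.

```python
import math
from statistics import pvariance

def rmse(y_true, y_pred):
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def vaf(y_true, y_pred):
    """Variance accounted for, in percent (assumed definition, see lead-in)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return (1 - pvariance(residuals) / pvariance(y_true)) * 100

# Hypothetical measurements and predictions with a constant offset of +1.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [2.0, 3.0, 4.0, 5.0]
print(rmse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred), vaf(y_true, y_pred))
```

Under this definition VAF is insensitive to a constant bias (the offset example still scores 100), which is one reason it is usually reported together with RMSE and MAE rather than on its own.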

Development of The Irregular Radial Pulse Detection Algorithm Based on Statistical Learning Model (통계적 학습 모형에 기반한 불규칙 맥파 검출 알고리즘 개발)

  • Bae, Jang-Han;Jang, Jun-Su;Ku, Boncho
    • Journal of Biomedical Engineering Research / v.41 no.5 / pp.185-194 / 2020
  • Arrhythmia is usually diagnosed from the electrocardiogram (ECG) signal; however, ECG is difficult to measure and requires expert help to analyze. The radial pulse, on the other hand, can be measured easily in daily life, and could be a suitable bio-signal for the recent contactless ("untact") paradigm as well as an extensible signal for pulse-pattern-based diagnosis in Korean medicine. In this study, we developed an irregular radial pulse detection algorithm based on a learning model and considered its applicability to arrhythmia screening. A total of 1432 pulse waves, including irregular pulse data, were used in the experiment. Three data sets were prepared with minimal preprocessing to avoid heuristic feature extraction. As classification algorithms, elastic net logistic regression, random forest, and extreme gradient boosting were applied to each data set, and the irregular pulse detection performance was estimated using the area under the receiver operating characteristic curve based on 10-fold cross-validation. The extreme gradient boosting method outperformed the others, with a classification accuracy reaching 99.7%. The results confirmed that the proposed algorithm could be used for arrhythmia screening. To build a fusion technology integrating Western and Korean medicine, future research will need to address arrhythmia subtype classification from the perspective of Korean medicine.
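The 10-fold cross-validation mentioned in the abstract above can be sketched as an index split. The abstract does not say whether the folds were shuffled or stratified, so this sketch assumes plain contiguous folds; the function name is hypothetical, and only the sample count (1432) comes from the abstract.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for k contiguous folds.

    The first n_samples % k folds get one extra sample so that every
    sample appears in exactly one test fold.
    """
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

# 1432 pulse waves split into 10 folds: a model would be trained on `train`
# and scored (for example with AUC) on `test`, then the scores averaged.
for fold, (train, test) in enumerate(k_fold_indices(1432, 10)):
    print(fold, len(train), len(test))
```

In practice the sample order would usually be shuffled (and often stratified by class) before splitting, since irregular pulses are likely to be a minority class.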

Ensemble deep learning-based models to predict the resilient modulus of modified base materials subjected to wet-dry cycles

  • Mahzad Esmaeili-Falak;Reza Sarkhani Benemaran
    • Geomechanics and Engineering / v.32 no.6 / pp.583-600 / 2023
  • The resilient modulus (MR) of pavement materials plays a significant role in mechanistic-empirical pavement design. Determining MR requires experimental tests that cost time and money and need special equipment. The present paper suggests a novel hybridized extreme gradient boosting (XGB) structure for forecasting the MR of modified base materials subjected to wet-dry cycles. The models were created by various combinations of input variables called deep learning. The input variables are the number of W-D cycles (WDC), the ratio of free lime to SAF (CSAFR), the ratio of maximum dry density to the optimum moisture content (DMR), confining pressure (σ3), and deviatoric stress (σd). Two XGB structures were produced for the estimation, in which the determinative variables were optimized by particle swarm optimization (PSO) and the black widow optimization algorithm (BWOA). According to the results and the Taylor diagram, model M1, which combines WDC, CSAFR, DMR, σ3, and σd, is the most suitable model, with R2 and RMSE values of BWOA-XGB for model M1 equal to 0.9991 and 55.19 MPa, respectively. Notably, the lowest RMSE reported in the literature was 116.94 MPa, while this study achieved a much lower RMSE of 55.198 MPa with the BWOA-XGB model. Finally, the results indicate the BWO algorithm's capability in determining the optimal values of the XGB determinative parameters in the MR prediction procedure.

An advanced machine learning technique to predict compressive strength of green concrete incorporating waste foundry sand

  • Danial Jahed Armaghani;Haleh Rasekh;Panagiotis G. Asteris
    • Computers and Concrete / v.33 no.1 / pp.77-90 / 2024
  • Waste foundry sand (WFS) is a waste product that causes environmental hazards. WFS can be used as a partial replacement of cement or fine aggregates in concrete. A database comprising 234 compressive strength tests of concrete fabricated with WFS is used. To construct the machine learning-based prediction models, the water-to-cement ratio, WFS replacement percentage, WFS-to-cement content ratio, and fineness modulus of WFS were considered as the model's inputs, and the compressive strength of concrete was set as the model's output. A base extreme gradient boosting (XGBoost) model together with two hybrid XGBoost models combined with the tunicate swarm algorithm (TSA) and the salp swarm algorithm (SSA) were applied. The role of TSA and SSA is to identify the optimum values of the XGBoost hyperparameters to obtain higher performance. The results of these hybrid techniques were compared with the results of the base XGBoost model in order to investigate and justify the implementation of optimisation algorithms. The results showed that the hybrid XGBoost models are faster and more accurate than the base XGBoost technique. The XGBoost-SSA model shows superior performance compared to previously published works in the literature, offering a reduced system error rate. Although the WFS-to-cement ratio is significant, the WFS replacement percentage has a smaller influence on the compressive strength of concrete. To improve the compressive strength of concrete fabricated with WFS, the simultaneous consideration of the water-to-cement ratio and fineness modulus of WFS is recommended.

A Comparative Study of Phishing Websites Classification Based on Classifier Ensemble

  • Tama, Bayu Adhi;Rhee, Kyung-Hyune
    • Journal of Korea Multimedia Society / v.21 no.5 / pp.617-625 / 2018
  • Phishing websites have become a crucial concern in cyber security applications. Phishing is performed by fraudulently deceiving users with the aim of obtaining their sensitive information, such as bank account information, credit card details, usernames, and passwords. The threat has led to huge losses for online retailers, e-business platforms, and financial institutions, to name but a few. One way to build an anti-phishing detection mechanism is to construct a classification algorithm based on machine learning techniques. The objective of this paper is to compare different classifier ensemble approaches, i.e. random forest, rotation forest, gradient boosted machine, and extreme gradient boosting, against single classifiers, i.e. decision tree, classification and regression tree, and credal decision tree, in the case of website phishing. Area under the ROC curve (AUC) is employed as the performance metric, whilst statistical tests are used as a baseline indicator of significance among classifiers. The paper contributes to the existing literature by providing a benchmark of classifier ensembles for web phishing detection.
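The AUC metric used in the abstract above can be illustrated with a minimal sketch. It uses the pairwise (rank-based) interpretation of AUC: the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. The labels and scores are hypothetical, and this is not the evaluation code of the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 class labels (1 = positive, e.g. phishing).
    scores: classifier scores, higher meaning "more likely positive".
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical scores for four websites (two legitimate, two phishing).
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
print(roc_auc(labels, scores))  # 0.75: three of the four pairs are ranked correctly
```

The pairwise form is quadratic in the number of samples, so it is suited to illustration; library implementations typically sort the scores instead.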

A Comparative Study of Phishing Websites Classification Based on Classifier Ensembles

  • Tama, Bayu Adhi;Rhee, Kyung-Hyune
    • Journal of Multimedia Information System / v.5 no.2 / pp.99-104 / 2018
  • Phishing websites have become a crucial concern in cyber security applications. Phishing is performed by fraudulently deceiving users with the aim of obtaining their sensitive information, such as bank account information, credit card details, usernames, and passwords. The threat has led to huge losses for online retailers, e-business platforms, and financial institutions, to name but a few. One way to build an anti-phishing detection mechanism is to construct a classification algorithm based on machine learning techniques. The objective of this paper is to compare different classifier ensemble approaches, i.e. random forest, rotation forest, gradient boosted machine, and extreme gradient boosting, against single classifiers, i.e. decision tree, classification and regression tree, and credal decision tree, in the case of website phishing. Area under the ROC curve (AUC) is employed as the performance metric, whilst statistical tests are used as a baseline indicator of significance among classifiers. The paper contributes to the existing literature by providing a benchmark of classifier ensembles for web phishing detection.

Bi-LSTM model with time distribution for bandwidth prediction in mobile networks

  • Hyeonji Lee;Yoohwa Kang;Minju Gwak;Donghyeok An
    • ETRI Journal / v.46 no.2 / pp.205-217 / 2024
  • We propose a bandwidth prediction approach based on deep learning. The approach is intended to accurately predict the bandwidth of various types of mobile networks. We first use a machine learning technique, namely, the gradient boosting algorithm, to recognize the connected mobile network. Second, we apply a handover detection algorithm based on network recognition to account for vertical handover that causes the bandwidth variance. Third, as the communication performance offered by 3G, 4G, and 5G networks varies, we suggest a bidirectional long short-term memory model with time distribution for bandwidth prediction per network. To increase the prediction accuracy, pretraining and fine-tuning are applied for each type of network. We use a dataset collected at University College Cork for network recognition, handover detection, and bandwidth prediction. The performance evaluation indicates that the handover detection algorithm achieves 88.5% accuracy, and the bandwidth prediction model achieves a high accuracy, with a root-mean-square error of only 2.12%.

Prediction of Soil Moisture with Open Source Weather Data and Machine Learning Algorithms (공공 기상데이터와 기계학습 모델을 이용한 토양수분 예측)

  • Jang, Young-bin;Jang, Ik-hoon;Choe, Young-chan
    • Korean Journal of Agricultural and Forest Meteorology / v.22 no.1 / pp.1-12 / 2020
  • As one of the essential resources in the agricultural process, soil moisture has been carefully managed by predicting future changes and deficits. In recent years, statistical and machine learning based approaches to predicting soil moisture have been preferred in academia for their generalizability and ease of use in the field. However, little is known about whether machine learning based soil moisture prediction is applicable in South Korea. This paper therefore examines 1) whether publicly available weather data generated in South Korea has sufficient quality to predict soil moisture, 2) which machine learning algorithm performs best in South Korea, and 3) whether a single machine learning model can be applied generally across regions. We used various machine learning methods, namely Support Vector Machines (SVM), Random Forest (RF), Extremely Randomized Trees (ET), Gradient Boosting Machines (GBM), and Deep Feedforward Network (DFN), to predict future soil moisture in the Andong, Boseong, Cheolwon, and Suncheon regions with open source weather data. The GBM model showed the lowest prediction error in every data set we used (R squared: 0.96, RMSE: 1.8). Furthermore, GBM showed the lowest variance of prediction error between regions, which indicates that it has the highest generalizability.
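The gradient boosting idea behind the GBM model in the abstract above can be sketched as follows. This is a toy illustration rather than the model used in the paper: it assumes squared-error loss, a single hypothetical input feature, and depth-1 regression trees (stumps), with each round fitting a stump to the current residuals.

```python
def fit_stump(xs, residuals):
    """Find the split threshold and the two leaf means that minimise squared error."""
    best = None
    sorted_x = sorted(set(xs))
    for a, b in zip(sorted_x, sorted_x[1:]):
        t = (a + b) / 2
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, t, left_mean, right_mean)
    return best[1:]

def fit_gbm(xs, ys, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared loss: start from the mean, then add shrunken stumps."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        # For squared loss, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, preds)]
        t, left_mean, right_mean = fit_stump(xs, residuals)
        stumps.append((t, left_mean, right_mean))
        preds = [p + learning_rate * (left_mean if x <= t else right_mean)
                 for x, p in zip(xs, preds)]
    return base, learning_rate, stumps

def predict_gbm(model, x):
    """Base prediction plus the shrunken contribution of every stump."""
    base, learning_rate, stumps = model
    return base + learning_rate * sum(lm if x <= t else rm for t, lm, rm in stumps)

def mse(model, xs, ys):
    return sum((y - predict_gbm(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)

# Hypothetical data: a nonlinear response to a single feature.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [x * x for x in xs]
model = fit_gbm(xs, ys)
print(mse(model, xs, ys))
```

The learning rate shrinks each stump's contribution, so more rounds are needed but each step is less likely to overfit; the values used here are illustrative defaults, not tuned settings.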