• Title/Abstract/Keywords: Gradient boosting (XGBoost)

56 search results (processing time: 0.03 s)

A Design and Implement of Efficient Agricultural Product Price Prediction Model

  • Im, Jung-Ju;Kim, Tae-Wan;Lim, Ji-Seoup;Kim, Jun-Ho;Yoo, Tae-Yong;Lee, Won Joo
    • 한국컴퓨터정보학회논문지 / Vol. 27, No. 5 / pp. 29-36 / 2022
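  • In this paper, we propose an effective agricultural product price prediction model based on the dataset provided by DACON. The models are XGBoost and CatBoost, both algorithms of the Gradient Boosting family, which outperform the conventional Logistic Regression and Random Forest in average accuracy and execution time. Building on these advantages, we design machine learning models that predict prices 1, 2, and 4 weeks ahead from the previous prices of agricultural products. The XGBoost model uses the XGBoost Regressor library, a regression-style model, and reaches its best performance through hyperparameter tuning. The CatBoost model is implemented with CatBoost Regressor. The implemented models are validated with the API provided by DACON, and a performance evaluation is carried out for each model. Because XGBoost applies its own overfitting regularization, it achieves excellent performance despite the small dataset, but in terms of time performance, such as training time and prediction time, it was found to perform worse than LGBM.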
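
The forecasting setup in the abstract above (previous prices in, prices 1, 2, and 4 weeks ahead out) can be framed as a supervised learning problem by building lagged samples. The following is a minimal sketch under stated assumptions: the function name `make_lagged_samples`, the three-week lag window, and the price values are all hypothetical and do not come from the paper or the DACON dataset.

```python
from typing import List, Tuple


def make_lagged_samples(
    prices: List[float], n_lags: int, horizon: int
) -> Tuple[List[List[float]], List[float]]:
    """Pair the last `n_lags` prices with the price `horizon` weeks later."""
    features, targets = [], []
    # t is the index of the most recent observed week in each window.
    for t in range(n_lags - 1, len(prices) - horizon):
        features.append(prices[t - n_lags + 1 : t + 1])
        targets.append(prices[t + horizon])
    return features, targets


# Made-up weekly prices, for illustration only.
weekly_prices = [100.0, 102.0, 101.0, 105.0, 107.0, 110.0, 108.0, 111.0]

# One (features, targets) set per forecast horizon: 1, 2, and 4 weeks ahead.
datasets = {h: make_lagged_samples(weekly_prices, n_lags=3, horizon=h) for h in (1, 2, 4)}
```

Each horizon gets its own sample set, so a separate regressor (for example an XGBoost or CatBoost regressor) could be fitted per horizon; whether the paper does it this way is not stated in the abstract.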

Darknet Traffic Detection and Classification Using Gradient Boosting Techniques

  • 김지혜;이수진
    • 정보보호학회논문지 / Vol. 32, No. 2 / pp. 371-379 / 2022
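  • Because the darknet is built on anonymity and security, it is continually abused for various crimes and illegal activities, and research on accurately detecting and classifying darknet traffic is very important for preventing such misuse and abuse. In this paper, we propose a darknet traffic detection and classification technique based on gradient boosting. Applying the XGBoost and LightGBM algorithms to the CIC-Darknet2020 dataset yielded a detection rate of 99.99% and classification performance above 99%, which is more than 3% higher detection performance and more than 13% higher classification performance than previous studies. In particular, the LightGBM algorithm reduced training time by a factor of about 1.6 and hyperparameter tuning time by a factor of 10 compared with XGBoost, performing darknet traffic detection and classification with markedly better performance.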

Indoor positioning system using Xgboosting

  • 황치곤;윤창표;김대진
    • 한국정보통신학회: Conference Proceedings / 한국정보통신학회 2021 Fall Conference / pp. 492-494 / 2021
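  • In machine learning, decision trees are used as a classification technique. However, decision trees suffer from degraded performance caused by overfitting. This problem can be addressed with techniques such as Bagging, which generates multiple bootstrap samples and trains a model on each, and Boosting, which models sampled data and adjusts weights to reduce overfitting. More recently, the Xgboost technique has also emerged. In this paper, we collect Wi-Fi signal data for indoor positioning, apply it to existing methods and to Xgboost, and evaluate the resulting performance.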
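
The boosting idea mentioned in the abstract above can be illustrated with a small sketch. This is plain gradient boosting for squared error using one-split regression stumps, where each round fits a stump to the current residuals. It is not XGBoost itself (which adds regularization and second-order information) and not the authors' method; the function names, the learning rate, and the toy data are all assumptions made for illustration.

```python
from typing import List, Tuple

Stump = Tuple[float, float, float]  # (threshold, value if x <= threshold, value otherwise)
Model = Tuple[float, float, List[Stump]]  # (base prediction, learning rate, stumps)


def fit_stump(x: List[float], r: List[float]) -> Stump:
    """Return the single-split stump with the lowest squared error on targets r."""
    mean_r = sum(r) / len(r)
    best_sse, best = float("inf"), (x[0], mean_r, mean_r)  # fallback: constant stump
    for thr in sorted(set(x))[:-1]:  # the largest value would leave the right side empty
        left = [ri for xi, ri in zip(x, r) if xi <= thr]
        right = [ri for xi, ri in zip(x, r) if xi > thr]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = sum((ri - lv) ** 2 for ri in left) + sum((ri - rv) ** 2 for ri in right)
        if sse < best_sse:
            best_sse, best = sse, (thr, lv, rv)
    return best


def predict_stump(stump: Stump, xi: float) -> float:
    thr, lv, rv = stump
    return lv if xi <= thr else rv


def fit_boosting(x: List[float], y: List[float], n_rounds: int = 50, lr: float = 0.3) -> Model:
    """Start from the mean, then repeatedly fit a stump to the residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps: List[Stump] = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [pi + lr * predict_stump(stump, xi) for pi, xi in zip(pred, x)]
    return base, lr, stumps


def predict(model: Model, xi: float) -> float:
    base, lr, stumps = model
    return base + lr * sum(predict_stump(s, xi) for s in stumps)


def mse(y: List[float], p: List[float]) -> float:
    return sum((a - b) ** 2 for a, b in zip(y, p)) / len(y)


# Toy one-dimensional data with a step between x = 4 and x = 5 (made up).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]

model = fit_boosting(x, y)
mse_base = mse(y, [sum(y) / len(y)] * len(y))        # error of predicting the mean
mse_boost = mse(y, [predict(model, xi) for xi in x])  # training error after boosting
```

The shrinkage factor `lr` makes each stump correct only part of the remaining error, which is one common way boosting methods trade training speed for resistance to overfitting.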

ConvXGB: A new deep learning model for classification problems based on CNN and XGBoost

  • Thongsuwan, Setthanun;Jaiyen, Saichon;Padcharoen, Anantachai;Agarwal, Praveen
    • Nuclear Engineering and Technology / Vol. 53, No. 2 / pp. 522-531 / 2021
  • We describe a new deep learning model - Convolutional eXtreme Gradient Boosting (ConvXGB) for classification problems based on convolutional neural nets and Chen et al.'s XGBoost. In addition to image data, ConvXGB supports general classification problems through a data preprocessing module. ConvXGB consists of several stacked convolutional layers that learn the features of the input automatically, followed by XGBoost in the last layer for predicting the class labels. The ConvXGB model is simplified by reducing the number of parameters under appropriate conditions, since it is not necessary to re-adjust the weight values in a back propagation cycle. Experiments on several data sets from UCL Repository, including images and general data sets, showed that our model handled the classification problems, for all the tested data sets, slightly better than CNN and XGBoost alone and was sometimes significantly better.

A LightGBM and XGBoost Learning Method for Postoperative Critical Illness Key Indicators Analysis

  • Lei Han;Yiziting Zhu;Yuwen Chen;Guoqiong Huang;Bin Yi
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 17, No. 8 / pp. 2016-2029 / 2023
  • Accurate prediction of critical illness is significant for ensuring the lives and health of patients. The selection of indicators affects the real-time capability and accuracy of the prediction for critical illness. However, the diversity and complexity of these indicators make it difficult to find potential connections between them and critical illnesses. For the first time, this study proposes an indicator analysis model to extract key indicators from the preoperative and intraoperative clinical indicators and laboratory results of critical illnesses. In this study, preoperative and intraoperative data of heart failure and respiratory failure are used to verify the model. The proposed model processes the data and extracts key indicators in four parts. To test the effectiveness of the proposed model, the key indicators are used to predict the two critical illnesses. The classifiers used in the prediction are light gradient boosting machine (LightGBM) and eXtreme Gradient Boosting (XGBoost). The predictive performance using key indicators is better than that using all indicators. In the prediction of heart failure, LightGBM and XGBoost have sensitivities of 0.889 and 0.892, and specificities of 0.939 and 0.937, respectively. For respiratory failure, LightGBM and XGBoost have sensitivities of 0.709 and 0.689, and specificities of 0.936 and 0.940, respectively. The proposed model can effectively analyze the correlation between indicators and postoperative critical illness. The analytical results make it possible to find the key indicators for postoperative critical illnesses. This model can help doctors extract key indicators in time and improve the reliability and efficiency of prediction.
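
The sensitivity and specificity figures reported above follow the standard definitions: sensitivity is TP / (TP + FN) and specificity is TN / (TN + FP). A minimal sketch of the computation is given below; the function name and the label vectors are hypothetical and unrelated to the study's data.

```python
from typing import List, Tuple


def sensitivity_specificity(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Compute sensitivity and specificity for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Made-up labels: 4 positive cases and 6 negative cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]

sensitivity, specificity = sensitivity_specificity(y_true, y_pred)
```

In this toy case one positive is missed and one negative is flagged, so sensitivity is 3/4 and specificity is 5/6.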

Mean fragmentation size prediction in an open-pit mine using machine learning techniques and the Kuz-Ram model

  • Seung-Joong Lee;Sung-Oong Choi
    • Geomechanics and Engineering / Vol. 34, No. 5 / pp. 547-559 / 2023
  • We evaluated the applicability of machine learning techniques and the Kuz-Ram model for predicting the mean fragmentation size in open-pit mines. The characteristics of the in-situ rock considered here were uniaxial compressive strength, tensile strength, rock factor, and mean in-situ block size. Seventy field datasets that included these characteristics were collected to predict the mean fragmentation size. Deep neural network, support vector machine, and extreme gradient boosting (XGBoost) models were trained using the data. The performance was evaluated using the root mean squared error (RMSE) and the coefficient of determination (r²). The XGBoost model had the smallest RMSE and the highest r² value compared with the other models. Additionally, when analyzing the error rate between the measured and predicted values, XGBoost had the lowest error rate. When the Kuz-Ram model was applied, low accuracy was observed owing to the differences in the characteristics of data used for model development. Consequently, the proposed XGBoost model predicted the mean fragmentation size more accurately than other models. If its performance is improved by securing sufficient data in the future, it will be useful for improving the blasting efficiency at the target site.
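
The two evaluation metrics named in the abstract above, RMSE and the coefficient of determination, can be computed as in the following sketch. The formulas are the standard ones; the function names and the measured/predicted values are invented for illustration and are not the study's data.

```python
import math
from typing import List


def rmse(y_true: List[float], y_pred: List[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true: List[float], y_pred: List[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up measured and predicted values.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

rmse_value = rmse(measured, predicted)
r2_value = r_squared(measured, predicted)
```

A lower RMSE and a coefficient of determination closer to 1 both indicate a closer fit, which is the sense in which the abstract ranks the models.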

Optimal Sensor Location in Water Distribution Network using XGBoost Model

  • 장혜운;정동휘
    • 한국수자원학회: Conference Proceedings / 한국수자원학회 2023 Conference / p. 217 / 2023
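  • A water distribution network aims to reliably supply high-quality water to users, and pressure is used as one of the indicators for evaluating this. With the recent expansion of smart sensor installations, real-time data-driven analysis using machine learning techniques has become active. Deciding sensor locations, that is, where to collect data, is therefore important. This study proposes a methodology for optimizing sensor locations in a large-scale water distribution network using the eXtreme Gradient Boosting (XGBoost) model. XGBoost is an ensemble model that uses multiple decision trees and employs boosting, which improves performance by assigning weights according to errors. Its advantages are that it supports distributed and parallel processing and therefore uses memory resources efficiently, that it trains quickly, and that it handles missing values within the model. To determine the independent variables for the model, critical nodes that represent the network are selected by considering the variability of the pressure data and the mean pressure values. An XGBoost model that predicts the pressure at the critical nodes is built, and the optimal sensor locations are selected by considering the model's performance and its feature importance values. Based on this methodology, the results are analyzed while varying the network layout (for example, looped and branched) and the number of nodes, in order to identify trends that depend on network characteristics. The XGBoost model built in this study minimizes additional preprocessing and can be applied easily to large networks, so it is expected that, through various combinations of input and output data, it can be used for performance optimization in water distribution networks beyond sensor placement.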

A Study On User Skin Color-Based Foundation Color Recommendation Method Using Deep Learning

  • 정민욱;김현지;곽채원;오유수
    • 한국멀티미디어학회논문지 / Vol. 25, No. 9 / pp. 1367-1374 / 2022
  • In this paper, we propose an automatic cosmetic foundation recommendation system that suggests a suitable foundation product based on the user's skin color. The proposed system receives and preprocesses user images and detects skin color with OpenCV and machine learning algorithms. The system then compares the performance of training models using XGBoost, Gradient Boost, Random Forest, and Adaptive Boost (AdaBoost), based on 550 datasets collected as essential bestsellers in the United States. Based on the comparison results, this paper implements a recommendation system using the highest performing machine learning model. The experiment shows that our system can effectively recommend a foundation suited to the user's skin color, with an accuracy of 98%. Furthermore, our system can reduce the number of trials needed to select a foundation that matches the user's skin color, which also saves time.

An advanced machine learning technique to predict compressive strength of green concrete incorporating waste foundry sand

  • Danial Jahed Armaghani;Haleh Rasekh;Panagiotis G. Asteris
    • Computers and Concrete / Vol. 33, No. 1 / pp. 77-90 / 2024
  • Waste foundry sand (WFS) is a waste product that causes environmental hazards. WFS can be used as a partial replacement of cement or fine aggregates in concrete. A database comprising 234 compressive strength tests of concrete fabricated with WFS is used. To construct the machine learning-based prediction models, the water-to-cement ratio, WFS replacement percentage, WFS-to-cement content ratio, and fineness modulus of WFS were considered as the model's inputs, and the compressive strength of concrete was set as the model's output. A base extreme gradient boosting (XGBoost) model together with two hybrid XGBoost models mixed with the tunicate swarm algorithm (TSA) and the salp swarm algorithm (SSA) were applied. The role of TSA and SSA is to identify the optimum values of the XGBoost hyperparameters to obtain higher performance. The results of these hybrid techniques were compared with the results of the base XGBoost model in order to investigate and justify the implementation of optimisation algorithms. The results showed that the hybrid XGBoost models are faster and more accurate compared to the base XGBoost technique. The XGBoost-SSA model shows superior performance compared to previously published works in the literature, offering a reduced system error rate. Although the WFS-to-cement ratio is significant, the WFS replacement percentage has a smaller influence on the compressive strength of concrete. To improve the compressive strength of concrete fabricated with WFS, the simultaneous consideration of the water-to-cement ratio and fineness modulus of WFS is recommended.

Estimating pile setup parameter using XGBoost-based optimized models

  • Xigang Du;Ximeng Ma;Chenxi Dong;Mehrdad Sattari Nikkhoo
    • Geomechanics and Engineering / Vol. 36, No. 3 / pp. 259-276 / 2024
  • The undrained shear strength is widely acknowledged as a fundamental mechanical property of soil and is considered a critical engineering parameter. In recent years, researchers have employed various methodologies to evaluate the shear strength of soil under undrained conditions. These methods encompass both numerical analyses and empirical techniques, such as the cone penetration test (CPT), to gain insights into the properties and behavior of soil. However, several of these methods rely on correlation assumptions, which can lead to inconsistent accuracy and precision. The study involved the development of innovative methods using extreme gradient boosting (XGB) to predict the pile set-up component "A" based on two distinct data sets. The first data set includes average modified cone point bearing capacity (qt), average wall friction (fs), and effective vertical stress (σvo), while the second data set comprises plasticity index (PI), soil undrained shear cohesion (Su), and the over consolidation ratio (OCR). These data sets were utilized to develop XGBoost-based methods for predicting the pile set-up component "A". To optimize the internal hyperparameters of the XGBoost model, four optimization algorithms were employed: Particle Swarm Optimization (PSO), Social Spider Optimization (SSO), Arithmetic Optimization Algorithm (AOA), and Sine Cosine Optimization Algorithm (SCOA). The results from the first data set indicate that the XGBoost model optimized using the Arithmetic Optimization Algorithm (XGB - AOA) achieved the highest accuracy, with R² values of 0.9962 for the training part and 0.9807 for the testing part. The performance of the developed models was further evaluated using the RMSE, MAE, and VAF indices. The results revealed that the XGBoost model optimized using XGBoost - AOA outperformed other models in terms of accuracy, with RMSE, MAE, and VAF values of 0.0078, 0.0015, and 99.6189 for the training part and 0.0141, 0.0112, and 98.0394 for the testing part, respectively. These findings suggest that XGBoost - AOA is the most accurate model for predicting the pile set-up component.
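
For readers unfamiliar with the MAE and VAF indices used above, the sketch below shows one way to compute them. It assumes the commonly used definition VAF = (1 - Var(y - ŷ) / Var(y)) × 100; the abstract does not state the exact formula the authors used, and the function names and sample values are hypothetical.

```python
from statistics import pvariance
from typing import List


def mae(y_true: List[float], y_pred: List[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def vaf(y_true: List[float], y_pred: List[float]) -> float:
    """Variance accounted for, in percent (assumed definition, see lead-in)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return (1.0 - pvariance(residuals) / pvariance(y_true)) * 100.0


# Made-up observed and predicted values.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.0, 2.0, 3.0, 4.0, 6.0]

mae_value = mae(observed, predicted)
vaf_value = vaf(observed, predicted)
```

Under this definition a VAF of 100 means the residuals have no variance at all, so values such as the 99.6 and 98.0 reported above indicate predictions that track the observations very closely.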