• Title/Summary/Keyword: LightGBM model

Search Result 67

Prediction Model of CNC Processing Defects Using Machine Learning (머신러닝을 이용한 CNC 가공 불량 발생 예측 모델)

  • Han, Yong Hee
    • Journal of the Korea Convergence Society / v.13 no.2 / pp.249-255 / 2022
  • This study proposes an analysis framework for real-time prediction of CNC processing defects using machine learning models, which have recently attracted attention as defect prediction methods, and applies it to CNC machines. The analysis shows that the XGBoost, CatBoost, and LightGBM models tie for the best accuracy, precision, recall, F1 score, and AUC, and that of these the LightGBM model has the shortest execution time. This short run time has practical advantages such as lower system deployment costs, a lower probability of CNC machine damage thanks to rapid defect prediction, and higher overall CNC machine utilization, confirming that LightGBM is the most effective machine learning model for CNC machines with only basic sensors installed. In addition, classification performance was maximized by an ensemble of LightGBM, ExtraTrees, k-Nearest Neighbors, and logistic regression models when there are no restrictions on execution time or computing power.

Investigating the performance of different decomposition methods in rainfall prediction from LightGBM algorithm

  • Narimani, Roya;Jun, Changhyun;Nezhad, Somayeh Moghimi;Parisouj, Peiman
    • Proceedings of the Korea Water Resources Association Conference / 2022.05a / pp.150-150 / 2022
  • This study investigates how decomposition methods affect the accuracy of daily rainfall prediction with the light gradient boosting machine (LightGBM) algorithm. Empirical mode decomposition (EMD) and singular spectrum analysis (SSA) were used to decompose and reconstruct the input time series into trend terms, fluctuating terms, and noise components. The time series decomposed by EMD and SSA were used as input data for LightGBM in two hybrid models: the empirical mode-based light gradient boosting machine (EMDGBM) and the singular spectrum analysis-based light gradient boosting machine (SSAGBM), respectively. Four daily-scale parameters (temperature, humidity, wind speed, and rainfall) from 2003 to 2017 were used as input data for daily rainfall prediction. Statistical performance indicators show that the SSAGBM model outperforms both the EMDGBM model and the original LightGBM algorithm without decomposition. This indicates that the SSA method improved the accuracy of LightGBM in rainfall prediction when a multivariate dataset was used.

Predicting of the Severity of Car Traffic Accidents on a Highway Using Light Gradient Boosting Model (LightGBM 알고리즘을 활용한 고속도로 교통사고심각도 예측모델 구축)

  • Lee, Hyun-Mi;Jeon, Gyo-Seok;Jang, Jeong-Ah
    • The Journal of the Korea institute of electronic communication sciences / v.15 no.6 / pp.1123-1130 / 2020
  • This study aims to classify the severity of car crashes using five classification models. The dataset contains 21,013 vehicle crashes recorded between 2015 and 2017, obtained from Korea Expressway Corporation, and LightGBM (Light Gradient Boosting Model) achieved the highest accuracy. In the LightGBM model, the number of involved vehicles, type of accident, incident location, incident lane type, and types of vehicles involved in accidents were identified as priority factors. Based on the results of this model, a management strategy for responding to highway traffic accidents should be established through consistent prediction of the accident severity level. This study demonstrates the applicability of machine learning models to predicting the severity of highway traffic accidents and suggests that various machine learning techniques based on big data can be used in the future.

A LightGBM and XGBoost Learning Method for Postoperative Critical Illness Key Indicators Analysis

  • Lei Han;Yiziting Zhu;Yuwen Chen;Guoqiong Huang;Bin Yi
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.8 / pp.2016-2029 / 2023
  • Accurate prediction of critical illness is significant for ensuring the lives and health of patients. The selection of indicators affects the real-time capability and accuracy of the prediction for critical illness. However, the diversity and complexity of these indicators make it difficult to find potential connections between them and critical illnesses. For the first time, this study proposes an indicator analysis model to extract key indicators from the preoperative and intraoperative clinical indicators and laboratory results of critical illnesses. Preoperative and intraoperative data on heart failure and respiratory failure are used to verify the model. The proposed model processes the data and extracts key indicators in four parts. To test the effectiveness of the proposed model, the key indicators are used to predict the two critical illnesses. The classifiers used in the prediction are light gradient boosting machine (LightGBM) and eXtreme Gradient Boosting (XGBoost). Predictive performance using the key indicators is better than that using all indicators. In the prediction of heart failure, LightGBM and XGBoost have sensitivities of 0.889 and 0.892, and specificities of 0.939 and 0.937, respectively. For respiratory failure, LightGBM and XGBoost have sensitivities of 0.709 and 0.689, and specificities of 0.936 and 0.940, respectively. The proposed model can effectively analyze the correlation between indicators and postoperative critical illness, and its analytical results make it possible to find the key indicators for postoperative critical illnesses. The model can help doctors extract key indicators in time and improve the reliability and efficiency of prediction.
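
For readers unfamiliar with the two metrics reported above, the following is a minimal sketch of how sensitivity and specificity are computed from binary labels. It is not the authors' code; the function name and the toy labels are hypothetical, and 1 is assumed to mark the positive (critical illness) class.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = positive.

    Hypothetical helper for illustration; not taken from the paper.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative label")
    sensitivity = tp / (tp + fn)  # share of true positives that were detected
    specificity = tn / (tn + fp)  # share of true negatives that were rejected
    return sensitivity, specificity


# Toy labels (made up): 4 positive cases and 4 negative cases.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

A high specificity paired with a lower sensitivity, as in the respiratory failure results, means the classifier rarely raises false alarms but misses a larger share of true cases.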

Attack Detection and Classification Method Using PCA and LightGBM in MQTT-based IoT Environment (MQTT 기반 IoT 환경에서의 PCA와 LightGBM을 이용한 공격 탐지 및 분류 방안)

  • Lee Ji Gu;Lee Soo Jin;Kim Young Won
    • Convergence Security Journal / v.22 no.4 / pp.17-24 / 2022
  • Recently, machine learning-based cyber attack detection and classification research has been actively conducted, achieving a high level of detection accuracy. However, low-spec IoT devices and large-scale network traffic make it difficult to apply machine learning-based detection models in IoT environments. Therefore, in this paper, we propose an efficient IoT attack detection and classification method using PCA (Principal Component Analysis) and LightGBM (Light Gradient Boosting Model) on datasets collected in an MQTT (Message Queuing Telemetry Transport) IoT protocol environment, which is also used in the defense field. In the experiment, even though the original dataset was reduced to about 15%, performance remained almost the same as with the original. The method also showed the best performance in a comparative evaluation against the four dimensionality reduction techniques selected in this paper.

Method of Analyzing Important Variables using Machine Learning-based Golf Putting Direction Prediction Model (머신러닝 기반 골프 퍼팅 방향 예측 모델을 활용한 중요 변수 분석 방법론)

  • Kim, Yeon Ho;Cho, Seung Hyun;Jung, Hae Ryun;Lee, Ki Kwang
    • Korean Journal of Applied Biomechanics / v.32 no.1 / pp.1-8 / 2022
  • Objective: This study proposes a methodology to analyze important variables that have a significant impact on the putting direction prediction using a machine learning-based putting direction prediction model trained with IMU sensor data. Method: Putting data were collected using an IMU sensor measuring 12 variables from 6 adult males in their 20s at K University who had no golf experience. The data was preprocessed so that it could be applied to machine learning, and a model was built using five machine learning algorithms. Finally, by comparing the performance of the built models, the model with the highest performance was selected as the proposed model, and then 12 variables of the IMU sensor were applied one by one to analyze important variables affecting the learning performance. Results: As a result of comparing the performance of five machine learning algorithms (K-NN, Naive Bayes, Decision Tree, Random Forest, and Light GBM), the prediction accuracy of the Light GBM-based prediction model was higher than that of other algorithms. Using the Light GBM algorithm, which had excellent performance, an experiment was performed to rank the importance of variables that affect the direction prediction of the model. Conclusion: Among the five machine learning algorithms, the algorithm that best predicts the putting direction was the Light GBM algorithm. When the model predicted the putting direction, the variable that had the greatest influence was the left-right inclination (Roll).

Prediction of Stock Returns from News Article's Recommended Stocks Using XGBoost and LightGBM Models

  • Yoo-jin Hwang;Seung-yeon Son;Zoon-ky Lee
    • Journal of the Korea Society of Computer and Information / v.29 no.2 / pp.51-59 / 2024
  • This study examines the relationship between the release of news and individual stock returns. Investors use a variety of information sources to maximize stock returns when establishing investment strategies. News companies publish articles based on analysts' stock recommendation reports, which enhances the reliability of the information. Defining the release of a stock-recommendation news article as an event, we examine its economic impact and propose a binary classification model that predicts the stock return 10 days after the event. The XGBoost and LightGBM models applied in the study achieve accuracies of 75% and 71%, respectively. In addition, after categorizing the recommended stocks by listed market (KOSPI/KOSDAQ) and market capitalization (Big/Small), this study examines differences in model accuracy across the four sub-datasets. Finally, by conducting SHAP (Shapley Additive exPlanations) analysis, we identify the key variables in each model, reinforcing the interpretability of the models.

LightGBM Based Prediction of East Sea Vertical Temperature Profile Using XBT Data (XBT 데이터를 이용한 LightGBM 기반 동해 수직 수온분포 예측)

  • Kim, Young-Joo;Lee, Soo-Jin
    • Proceedings of the Korean Society of Computer Information Conference / 2022.07a / pp.27-28 / 2022
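  • Research on water temperature prediction using artificial intelligence models has recently become active in Korea as well, but studies of the seas around the Korean Peninsula have focused mainly on predicting sea surface temperature only. In this paper, we use XBT (eXpendable Bathy-Thermograph) data and LightGBM (Light Gradient Boosting Model) to predict the vertical temperature profile of the East Sea, which is militarily important for submarine operations and anti-submarine warfare. The model was trained on XBT data measured from the sea surface down to a depth of 200 m in a specific area of the East Sea, and prediction accuracy was evaluated using performance metrics (MAE, MSE, RMSE) and vertical temperature profile graphs.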
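
The three evaluation metrics named in this abstract (MAE, MSE, RMSE) can be sketched in a few lines. This is an illustrative example only, not the authors' evaluation code; the function name and the temperature values are made up.

```python
import math


def regression_errors(observed, predicted):
    """Return (MAE, MSE, RMSE) for two equal-length sequences of numbers.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / len(errors)  # mean absolute error
    mse = sum(e * e for e in errors) / len(errors)   # mean squared error
    rmse = math.sqrt(mse)                            # root mean squared error
    return mae, mse, rmse


# Made-up water temperatures (deg C) at three depths.
observed = [10.0, 8.0, 6.0]
predicted = [9.0, 8.0, 8.0]
mae, mse, rmse = regression_errors(observed, predicted)
```

RMSE is in the same unit as the temperature itself, which makes it easier to read against a profile graph than MSE.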

A Comparative Analysis of Ensemble Learning-Based Classification Models for Explainable Term Deposit Subscription Forecasting (설명 가능한 정기예금 가입 여부 예측을 위한 앙상블 학습 기반 분류 모델들의 비교 분석)

  • Shin, Zian;Moon, Jihoon;Rho, Seungmin
    • The Journal of Society for e-Business Studies / v.26 no.3 / pp.97-117 / 2021
  • Predicting term deposit subscriptions is a representative financial marketing task in banks, and banks can build a prediction model using various customer information. Many studies have applied machine learning techniques to improve the classification accuracy for term deposit subscriptions. However, even when these models achieve satisfactory performance, they are not easy to use in industry if their decision-making process is not adequately explained. To address this issue, this paper proposes an explainable scheme for term deposit subscription forecasting. We first construct several classification models using decision tree-based ensemble learning methods that perform well on tabular data: random forest, gradient boosting machine (GBM), extreme gradient boosting (XGB), and light gradient boosting machine (LightGBM). We then analyze their classification performance in depth through 10-fold cross-validation. After that, we provide the rationale for interpreting the influence of customer information and the decision-making process by applying Shapley additive explanation (SHAP), an explainable artificial intelligence technique, to the best classification model. To verify the practicality and validity of our scheme, experiments were conducted with the bank marketing dataset provided by Kaggle; we applied SHAP to the GBM and LightGBM models, respectively, according to different dataset configurations and then analyzed and visualized the results for explainable term deposit subscriptions.
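
As a rough illustration of the 10-fold cross-validation mentioned above, the sketch below splits sample indices into k contiguous folds, each used once as the test set. It is an assumption-laden simplification: the function name is hypothetical, no shuffling or stratification is applied, and the abstract does not say how the authors built their folds.

```python
def k_fold_indices(n_samples, k=10):
    """Return a list of (train_indices, test_indices) pairs, one per fold.

    Hypothetical helper: contiguous, unshuffled folds whose sizes differ by at most one.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for i in range(k):
        size = base + (1 if i < extra else 0)  # the first `extra` folds get one more sample
        test_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train_idx, test_idx))
        start += size
    return folds


# Each sample lands in exactly one test fold; a model would be trained and scored once per fold.
folds = k_fold_indices(25, k=10)
```

In practice, the score reported for a model is usually the average of the per-fold scores.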

A Study on the Development of Traffic Volume Estimation Model Based on Mobile Communication Data Using Machine Learning (머신러닝을 이용한 이동통신 데이터 기반 교통량 추정 모형 개발)

  • Dong-seob Oh;So-sig Yoon;Choul-ki Lee;Yong-Sung CHO
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.22 no.4 / pp.1-13 / 2023
  • This study develops an optimal mobile-communication-based National Highway traffic volume estimation model using an ensemble-based machine learning algorithm. Based on information such as mobile communication data and VDS data, the LightGBM model was selected as the optimal model for estimating traffic volume. As a result of evaluating traffic volume estimation performance from 96 points where VDS was installed, MAPE was 8.49 (accuracy 91.51%). On the roads where VDS was not installed, traffic estimation accuracy was 92.6%.
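
The accuracy quoted next to the MAPE in this abstract is consistent with accuracy = 100 - MAPE (100 - 8.49 = 91.51), although the abstract does not state the formula. The sketch below computes MAPE in percent and that derived accuracy; the function names and traffic counts are hypothetical.

```python
def mape(actual, estimated):
    """Mean absolute percentage error, in percent.

    Hypothetical helper; assumes no actual value is zero.
    """
    if len(actual) != len(estimated) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(a - e) / abs(a) for a, e in zip(actual, estimated))
    return 100.0 * total / len(actual)


def accuracy_from_mape(mape_percent):
    """Accuracy as 100 - MAPE; an assumed convention that matches the quoted figures."""
    return 100.0 - mape_percent


# Made-up traffic volumes (vehicles per hour) at two measurement points.
observed_volume = [100.0, 200.0]
estimated_volume = [90.0, 220.0]
error_percent = mape(observed_volume, estimated_volume)
accuracy_percent = accuracy_from_mape(error_percent)
```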