• Title/Summary/Keyword: Light GBM

Prediction of Good Seller in Overseas sales of Domestic Books Using Big Data (빅데이터를 활용한 국내 도서의 해외 판매시 굿셀러 예측)

  • Kim, Nayeon;Kim, Doyoung;Kim, Miryeo;Jung, Jiyeong;Kim, Hyon Hee
    • Proceedings of the Korea Information Processing Society Conference / 2022.05a / pp.401-404 / 2022
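  • As Korean literature expands globally, establishing a foothold in overseas markets has become important. This study aimed to predict whether books published overseas during the five years from 2016 to 2020 would reach cumulative sales of 5,000 copies or more, the threshold for classification as a good seller. Because good sellers account for only a small share of all translated books, the data were imbalanced; this study addressed the imbalance by applying the SMOTE technique and ensemble algorithms. As a result, performance improved as the class ratio approached 1:1, and the LightGBM model achieved an AUC of 99.83%, the best predictive performance among the ensemble algorithms compared. In addition, the author was the most important variable in predicting whether a book would sell 5,000 or more cumulative copies, and the country of publication, together with online factors such as the average rating and the number of raters, were also found to be significant variables for sales prediction.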
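The abstract above does not give implementation details, so the following is only a minimal sketch of the general SMOTE idea it names: synthetic minority rows are interpolated between an existing minority row and one of its nearest minority neighbours until the class ratio reaches 1:1. The function name `smote`, the toy feature values, and the choice of `k=3` are assumptions for illustration, not the authors' setup.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a minority
    sample and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of `base` among the other minority samples
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy data (hypothetical): 4 minority rows against 10 majority rows.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
majority_count = 10
new_rows = smote(minority, n_new=majority_count - len(minority))
balanced_minority = minority + new_rows
print(len(balanced_minority), majority_count)  # both classes now have 10 rows
```

In practice the oversampling would be applied to the training split only, so that synthetic rows never leak into the evaluation data.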

A Model Stacking Algorithm for Indoor Positioning System using WiFi Fingerprinting

  • JinQuan Wang;YiJun Wang;GuangWen Liu;GuiFen Chen
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.4 / pp.1200-1215 / 2023
  • With the development of IoT and artificial intelligence, location-based services are attracting increasing attention. To address the large positioning error and poor generalization of current indoor positioning methods, this paper proposes a Model Stacking Algorithm for Indoor Positioning System using WiFi fingerprinting. First, we adopt a model stacking method based on Bayesian optimization to predict the location of indoor targets, improving indoor localization accuracy and model generalization. Second, taking the position predicted by model stacking as the observation value of a particle filter, we realize collaborative particle filter localization based on the model stacking algorithm. The experimental results show that the algorithm keeps the position error within 2 m, outperforming KNN, GBDT, XGBoost, LightGBM, and RF. The location accuracy of the fusion particle filter algorithm is improved by 31%, and the predicted trajectory is close to the real trajectory. The algorithm can also adapt to application scenarios with fewer wireless access points.

A Study on Total Production Time Prediction Using Machine Learning Techniques (머신러닝 기법을 이용한 총생산시간 예측 연구)

  • Eun-Jae Nam;Kwang-Soo Kim
    • Journal of the Korea Safety Management & Science / v.25 no.2 / pp.159-165 / 2023
  • Across industry, the use of big data analysis based on artificial intelligence is increasing as a result of the Fourth Industrial Revolution. The value of big data is rising, and the same is true in production technology. However, small and medium-sized manufacturers find it difficult to put data to work because they lack data management capability, and they find it difficult to adopt smart factories. Therefore, to help small and medium-sized manufacturing companies use big data, we predict total production time through machine learning. In previous studies, machine learning was performed with time and quantity factors for production, and the ExtraTree algorithm was confirmed to perform best in predicting total production time. In this study, worker proficiency factors were added to the time and quantity factors necessary for production, and the LightGBM algorithm achieved the highest prediction rate. The results will help enhance companies' competitiveness by identifying the potential for using MES data and by supporting systematic production schedule management.

Prediction of Larix kaempferi Stand Growth in Gangwon, Korea, Using Machine Learning Algorithms

  • Hyo-Bin Ji;Jin-Woo Park;Jung-Kee Choi
    • Journal of Forest and Environmental Science / v.39 no.4 / pp.195-202 / 2023
  • In this study, we sought to compare and evaluate the accuracy and predictive performance of machine learning algorithms for estimating the growth of individual Larix kaempferi trees in Gangwon Province, Korea. We employed linear regression, random forest, XGBoost, and LightGBM algorithms to predict tree growth using monitoring data organized based on different thinning intensities. Furthermore, we compared and evaluated the goodness-of-fit of these models using metrics such as the coefficient of determination (R²), mean absolute error (MAE), and root mean square error (RMSE). The results revealed that XGBoost provided the highest goodness-of-fit, with an R² value of 0.62 across all thinning intensities, while also yielding the lowest values for MAE and RMSE, thereby indicating the best model fit. When predicting the growth volume of individual trees after 3 years using the XGBoost model, the agreement was exceptionally high, reaching approximately 97% for all stand sites in accordance with the different thinning intensities. Notably, in non-thinned plots, the predicted volumes were approximately 2.1 m³ lower than the actual volumes; however, the agreement remained highly accurate at approximately 99.5%. These findings will contribute to the development of growth prediction models for individual trees using machine learning algorithms.
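For reference, the three goodness-of-fit metrics named in this abstract can be computed as in the sketch below. It uses the standard textbook definitions; the helper name `regression_metrics` and the toy numbers are illustrative assumptions and are unrelated to the study's data.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R^2, MAE and RMSE for paired observed and predicted values."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    # Residual and total sums of squares
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    mae = sum(abs(t - p) for t, p in zip(y_true, y_pred)) / n
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    return {"r2": r2, "mae": mae, "rmse": rmse}


# Hypothetical observed and predicted growth values
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

A higher R² together with lower MAE and RMSE indicates a better fit, which is the comparison the abstract reports across the four algorithms.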

A Study on Predicting Lung Cancer Using RNA-Sequencing Data with Ensemble Learning (앙상블 기법을 활용한 RNA-Sequencing 데이터의 폐암 예측 연구)

  • Geon AN;JooYong PARK
    • Journal of Korea Artificial Intelligence Association / v.2 no.1 / pp.7-14 / 2024
  • In this paper, we explore the application of RNA-sequencing data and ensemble machine learning to predict lung cancer and treatment strategies for lung cancer, a leading cause of cancer mortality worldwide. The research utilizes Random Forest, XGBoost, and LightGBM models to analyze gene expression profiles from extensive datasets, aiming to enhance predictive accuracy for lung cancer prognosis. The methodology focuses on preprocessing RNA-seq data to standardize expression levels across samples and applying ensemble algorithms to maximize prediction stability and reduce model overfitting. Key findings indicate that ensemble models, especially XGBoost, substantially outperform traditional predictive models. Significant genetic markers such as ADGRF5 are identified as crucial for predicting lung cancer outcomes. In conclusion, ensemble learning using RNA-seq data proves highly effective in predicting lung cancer, suggesting a potential shift towards more precise and personalized treatment approaches. The results advocate for further integration of molecular and clinical data to refine diagnostic models and improve clinical outcomes, underscoring the critical role of advanced molecular diagnostics in enhancing patient survival rates and quality of life. This study lays the groundwork for future research in the application of RNA-sequencing data and ensemble machine learning techniques in clinical settings.

Improved prediction of soil liquefaction susceptibility using ensemble learning algorithms

  • Satyam Tiwari;Sarat K. Das;Madhumita Mohanty;Prakhar
    • Geomechanics and Engineering / v.37 no.5 / pp.475-498 / 2024
  • Predicting the susceptibility of soil to liquefaction from a limited set of parameters is a challenging problem, particularly when dealing with highly unbalanced databases. The current study focuses on different ensemble learning classification algorithms using highly unbalanced databases of results from in-situ tests: the standard penetration test (SPT), the shear wave velocity (Vs) test, and the cone penetration test (CPT). The input parameters for these datasets consist of earthquake intensity parameters, strong ground motion parameters, and in-situ soil testing parameters, with the liquefaction index serving as the binary output parameter. After a rigorous comparison with existing literature, extreme gradient boosting (XGBoost), bagging, and random forest (RF) emerge as the most efficient models for liquefaction instance classification across different datasets. Notably, for SPT and Vs-based models, XGBoost exhibits superior performance, followed by light gradient boosting machine (LightGBM) and bagging, while for CPT-based models, bagging ranks highest, followed by gradient boosting and random forest. CPT-based models demonstrate lower Gmean(error), rendering them preferable for soil liquefaction susceptibility prediction. Key parameters influencing model performance include the internal friction angle of soil (ϕ) and the percentage of fines less than 75 µ (F75) for SPT and Vs data, and the normalized average cone tip resistance (qc) and peak horizontal ground acceleration (amax) for CPT data. It was also observed that adding Vs measurements to SPT data increased the efficiency of the prediction compared with SPT data alone. Furthermore, to enhance usability, a graphical user interface (GUI) for seamless classification operations based on provided input parameters was proposed.
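The abstract does not define its "Gmean(error)" measure, so the sketch below only illustrates the commonly used geometric mean (G-mean) of sensitivity and specificity, a score often preferred to accuracy on unbalanced binary data. The function names and the toy labels are assumptions; the paper's exact formulation may differ.

```python
import math


def g_mean(y_true, y_pred):
    """Geometric mean of sensitivity and specificity for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    sensitivity = tp / (tp + fn)  # recall on the positive class
    specificity = tn / (tn + fp)  # recall on the negative class
    return math.sqrt(sensitivity * specificity)


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical unbalanced labels: 2 positive cases, 8 negative cases
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10
mixed = [1, 0, 0, 0, 0, 0, 0, 0, 1, 0]

print(accuracy(labels, always_negative), g_mean(labels, always_negative))
print(accuracy(labels, mixed), g_mean(labels, mixed))
```

Both toy classifiers reach the same accuracy, yet the one that never predicts the minority class has a G-mean of zero, which is why a G-mean-based score is informative on unbalanced databases.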

Effects of the Acasia Catechu Extract on the Membranous Nephropathy Induced by Cationic Bovine Serum Albumin in Mice (아차(兒茶)가 Cationic Bovine Serum Albumin 투여로 유발된 Membranous Nephropathy Mouse Model에 미치는 영향)

  • Jeong, Gi-Hun;Cho, Chung-Sik;Kim, Cheol-Jung
    • The Journal of Internal Korean Medicine / v.30 no.3 / pp.495-509 / 2009
  • Objective: Membranous nephropathy (MN) is an organ-specific autoimmune disease and a relatively common cause of nephrotic syndrome in adults worldwide. However, no treatment for MN has been established. This study aimed to evaluate the effects of Acasia Catechu extract (ACE) on MN induced by cBSA in mice. Methods: Mice were divided into 4 groups. The normal group was injected with a saline solution. The control group was treated with cBSA (10 mg/kg, i.p.) only. The third group was treated with cBSA (10 mg/kg, i.p.) and ACE (250 mg/kg, p.o.). The fourth group was treated with cBSA (10 mg/kg, i.p.) and ACE (500 mg/kg, p.o.). After cBSA and ACE treatment for 6 weeks, we measured changes in body weight, 24-hour proteinuria, serum albumin, total cholesterol, triglyceride, BUN, creatinine, TNF-$\alpha$, IL-6, IL-1$\beta$, IFN-$\gamma$, IgA, IgM, and IgG levels. The morphologic changes of renal glomeruli were also observed with a light microscope. Results: The levels of 24-hour proteinuria, total cholesterol, triglyceride, IgG, IgM, IgA, TNF-$\alpha$, IL-6, IL-1$\beta$, and IFN-$\gamma$ decreased significantly in both ACE groups. The level of albumin increased significantly in both ACE groups. The mRNA expression of IL-1$\beta$ in splenocytes decreased considerably in the ACE-500 group. In histological findings of kidney tissue, thickening of the glomerular basement membrane (GBM) decreased in both ACE groups. Conclusions: This study shows that ACE might be effective for the treatment of MN. More clinical data and studies are needed for efficient application.

Effects of the Lonicerae Flos Extract on the Membranous Nephropathy Induced by Cationic Bovine Serum Albumin in Mice (금은화(金銀花)가 Cationic Bovine Serum Albumin 투여로 유발된 Membranous Nephropathy Mouse Model에 미치는 영향)

  • Lee, Ju-Ho;Cho, Chung-Sik;Kim, Chul-Jung
    • Journal of Physiology & Pathology in Korean Medicine / v.23 no.5 / pp.1063-1072 / 2009
  • Membranous nephropathy (MN) is the most common cause of adult nephrotic syndrome worldwide. However, no treatment for MN has been established. This study aimed to evaluate the effects of Lonicerae Flos Extract (LFE) on MN induced by cBSA in mice. Mice were divided into 4 groups. The first group, 'Normal', was injected with a saline solution. The second group, 'Control', was treated with cBSA (10 mg/kg, i.p.) only. The third group, 'LFE-250', was treated with cBSA (10 mg/kg, i.p.) and LFE (250 mg/kg, p.o.). The fourth group, 'LFE-500', was treated with cBSA (10 mg/kg, i.p.) and LFE (500 mg/kg, p.o.). After cBSA and LFE treatment for 4 weeks, we measured changes in body weight, 24-hour proteinuria, serum albumin, total cholesterol, triglyceride, BUN, creatinine, TNF-$\alpha$, IL-6, IL-1$\beta$, IL-10, IFN-$\gamma$, IgA, IgM, and IgG levels. The morphologic changes of renal glomeruli were also observed with a light microscope. The levels of 24-hour proteinuria, total cholesterol, IgG, IgM, IgA, and IL-6 were significantly decreased in both LFE groups. The levels of triglyceride and IL-1$\beta$ were significantly decreased in the LFE-500 group. The level of albumin was significantly increased in the LFE-250 group. The levels of TNF-$\alpha$ and IFN-$\gamma$ were significantly decreased in the LFE-250 group. The mRNA expression of IL-1$\beta$ in splenocytes was considerably decreased in the LFE-500 group. In histological findings of kidney tissue, thickening of the glomerular basement membrane (GBM) decreased in both LFE groups. This study shows that LFE might be effective for the treatment of MN. More clinical data and studies are needed for efficient application.

Effects of the Houttuyniae Herba Extract on the Membranous Nephropathy induced by Cationic Bovine Serum Albumin in Mice (어성초(魚腥草)가 Cationic Bovine Serum Albumin 투여로 유발된 Membranous Nephropathy Mouse Model에 미치는 영향)

  • Jung, Dae-Ho;Cho, Chung-Sik;Kim, Cheol-Jung
    • The Journal of Korean Medicine / v.30 no.4 / pp.93-107 / 2009
  • Objective: Membranous nephropathy (MN) is one of the most common causes of nephrotic syndrome in adults. However, there is no satisfactory treatment for MN. This study aimed to evaluate the effect of Houttuyniae Herba Extract (HHE) on MN induced by cationic bovine serum albumin (cBSA). Methods: Mice were divided into 4 groups. The first group, Normal, was injected with saline. The second group, Control, was treated with cBSA (10 mg/kg, i.p.) only. The third group, HHE-250, was treated with cBSA (10 mg/kg, i.p.) and HHE (250 mg/kg, p.o.). The fourth group, HHE-500, was treated with cBSA (10 mg/kg, i.p.) and HHE (500 mg/kg, p.o.). After treatment for 4 weeks, we measured changes in body weight, 24-hour proteinuria, serum albumin, total cholesterol, triglyceride, BUN, creatinine, IgA, IgM, IgG, TNF-$\alpha$, and IL-1$\beta$ levels and the mRNA expression of IFN-$\gamma$, IL-6, and IL-10. The morphologic changes of renal glomeruli were also observed with a light microscope and an electron microscope. Results: The levels of 24-hour proteinuria and serum triglyceride, BUN, IgG, TNF-$\alpha$, and IL-1$\beta$ decreased significantly in both HHE groups, while the level of serum albumin increased significantly in both HHE groups. The mRNA expression of IFN-$\gamma$ and IL-6 in splenocytes increased considerably in both HHE groups. The mRNA expression of IL-10 in splenocytes decreased considerably in both HHE groups. In histological findings of kidney tissue, thickening of the glomerular basement membrane (GBM) decreased in both HHE groups. Conclusions: This study shows that HHE might be effective for the treatment of acute-stage MN. More clinical data and studies are needed for efficient application.

Analysis of the Impact of Satellite Remote Sensing Information on the Prediction Performance of Ungauged Basin Stream Flow Using Data-driven Models (인공위성 원격 탐사 정보가 자료 기반 모형의 미계측 유역 하천유출 예측성능에 미치는 영향 분석)

  • Seo, Jiyu;Jung, Haeun;Won, Jeongeun;Choi, Sijung;Kim, Sangdan
    • Journal of Wetlands Research / v.26 no.2 / pp.147-159 / 2024
  • A lack of streamflow observations makes model calibration difficult and limits improvements in model performance. Satellite-based remote sensing products offer a new alternative because they can be actively used to obtain hydrological data. Recently, several studies have shown that artificial intelligence-based solutions are more appropriate than traditional conceptual and physical models. In this study, a data-driven approach combining various recurrent neural networks and decision tree-based algorithms is proposed, and the use of satellite remote sensing information for AI training is investigated. The satellite imagery used in this study is from MODIS and SMAP. The proposed approach is validated using publicly available data from 25 watersheds. Inspired by the traditional regionalization approach, a strategy is adopted to train one data-driven model by integrating data from all basins, and the potential of the proposed approach is evaluated by using a leave-one-out cross-validation regionalization setting to predict streamflow from different basins with one model. The GRU + Light GBM model was found to be a suitable model combination for target basins and showed good streamflow prediction performance in ungauged basins (the average model efficiency coefficient for predicting daily streamflow in 25 ungauged basins is 0.7187), except for periods when streamflow is very small. The influence of satellite remote sensing information was found to be up to 10%, with the additional application of satellite information having a greater impact on streamflow prediction during low or dry seasons than during wet or normal seasons.
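The leave-one-out regionalization setting described in this abstract can be pictured as the splitting scheme sketched below: each basin in turn is treated as ungauged and held out, while one model is trained on the data from all remaining basins. This is only a sketch of the split logic; the basin labels and the function name `leave_one_basin_out` are hypothetical, and the model training step is omitted.

```python
def leave_one_basin_out(basin_ids):
    """Yield (train_basins, test_basin) pairs, holding out one basin at a time."""
    for held_out in basin_ids:
        train = [b for b in basin_ids if b != held_out]
        yield train, held_out


# 25 hypothetical basin labels, matching the number of watersheds in the abstract
basins = [f"basin_{i:02d}" for i in range(1, 26)]
splits = list(leave_one_basin_out(basins))

for train, test in splits[:2]:
    print(test, "held out; trained on", len(train), "basins")
```

Averaging a skill score over the 25 held-out basins then gives a single figure for ungauged-basin performance, in the spirit of the average efficiency coefficient the abstract reports.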