• Title/Abstract/Keywords: gradient boosting

Search Result 23, Processing Time 0.026 seconds

The study of foreign exchange trading revenue model using decision tree and gradient boosting (외환거래에서 의사결정나무와 그래디언트 부스팅을 이용한 수익 모형 연구)

  • Jung, Ji Hyeon;Min, Dae Kee
    • Journal of the Korean Data and Information Science Society / v.24 no.1 / pp.161-170 / 2013
  • Foreign exchange (FX) is the global, decentralized trading of international currencies. In its simplest sense, forex is the simultaneous purchase and sale of currencies, that is, the exchange of one country's currency for another's. We can find consistent trading rules by comparing the gradient boosting method and the decision tree method. Methods such as time series analysis, which are used to predict financial markets, have the advantage of providing long-term forecasting models, but they have difficulty reflecting rapid price fluctuations in the short term. Therefore, in this study we apply the gradient boosting method and the decision tree method to short-term data to derive rules for the revenue structure of the FX market, and we evaluate the stability and predictive performance of the resulting models.

A Gradient Boosting Method for Graph Neural Networks (그래프 신경망에 대한 그래디언트 부스팅 기법)

  • Jang, Eunjo;Lee, Ki Yong
    • Proceedings of the Korea Information Processing Society Conference / 2022.11a / pp.574-576 / 2022
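  • Graph neural networks (GNNs) have recently been studied actively in many fields. However, most GNN research to date has focused on improving the performance of a single GNN model. In this paper, we propose a method for building an ensemble of GNNs using gradient boosting, a representative ensemble technique. The proposed method generates each new GNN in the direction that reduces the error of the previously built GNNs, using gradient descent. Repeating this process yields the final GNN ensemble model. In experiments, we applied the proposed method to the graph convolutional network (GCN), a representative GNN model, and found that the resulting ensemble improved node classification accuracy by up to 11.3 percentage points over a single GCN model.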
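
The boosting loop described in this abstract can be sketched in a few lines. The sketch below is not the authors' method: it assumes squared-error loss and swaps in one-split regression stumps for the GNN base learners, so that it runs with the Python standard library alone. The names fit_stump, boost, and mse, and the toy data, are hypothetical.

```python
from statistics import mean


def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals by least squares.

    Assumes xs contains at least two distinct values.
    """
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((r - left_mean) ** 2 for r in left) + sum(
            (r - right_mean) ** 2 for r in right
        )
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def boost(xs, ys, n_rounds=50, learning_rate=0.1):
    """Gradient boosting for squared loss: each new learner fits the residuals."""
    base = mean(ys)
    learners = []

    def predict(x):
        return base + learning_rate * sum(h(x) for h in learners)

    for _ in range(n_rounds):
        # For squared loss, the negative gradient is the residual y - F(x).
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        learners.append(fit_stump(xs, residuals))
    return predict


def mse(model, xs, ys):
    """Mean squared error of a model on the given points."""
    return mean((y - model(x)) ** 2 for x, y in zip(xs, ys))


# Hypothetical toy data with three rough plateaus.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [1.0, 1.2, 0.9, 3.1, 3.0, 2.9, 5.2, 5.0]


def baseline(x):
    """Single constant model: always predicts the mean target."""
    return mean(ys)


ensemble = boost(xs, ys)
print(mse(baseline, xs, ys), mse(ensemble, xs, ys))
```

Each round adds a learner trained on what the current ensemble still gets wrong, so the training error of the ensemble falls below that of the constant baseline.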

A Study on Smoker Prediction Using Machine Learning Algorithm (기계학습 알고리즘을 이용한 흡연자 예측 연구)

  • Jongwoo Baek;Joonil Bang;Joowon Lee;Hwajong Kim
    • Proceedings of the Korean Society of Computer Information Conference / 2023.07a / pp.537-538 / 2023
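  • In this paper, we used two machine learning algorithms, random forest and gradient boosting trees, to analyze the correlation between human biometric characteristics and smoking status. The data used in the study are health screening records provided by the National Health Insurance Service and compiled on Kaggle. We expected serum information to show a strong relationship when training the classification models, but the results showed that sex had the greatest influence.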

Who Gets Government SME R&D Subsidy? Application of Gradient Boosting Model (Gradient Boosting 모형을 이용한 중소기업 R&D 지원금 결정요인 분석)

  • Kang, Sung Won;Kang, HeeChan
    • The Journal of Society for e-Business Studies / v.25 no.4 / pp.77-109 / 2020
  • In this paper, we build a gradient boosting model to predict government SME R&D subsidies, select features of high importance, and measure the impact of each feature on the predicted subsidy using PDP and SHAP values. Unlike previous empirical research, we focus on the effect of the R&D subsidy distribution pattern on the incentives of firms participating in the subsidy competition. We used firm data constructed by KISTEP, which links government R&D subsidy records with financial statements provided by NICE, and applied a gradient boosting model to predict R&D subsidies. We found that firms with higher R&D performance and larger R&D investment tend to receive higher R&D subsidies, but firms with higher operating profit or total asset turnover tend to receive lower R&D subsidies. Our results suggest that the current government R&D subsidy distribution pattern provides an incentive to improve R&D project performance, but not business performance.

Performance Analysis of Trading Strategy using Gradient Boosting Machine Learning and Genetic Algorithm

  • Jang, Phil-Sik
    • Journal of the Korea Society of Computer and Information / v.27 no.11 / pp.147-155 / 2022
  • In this study, we developed a system to dynamically balance a daily stock portfolio and performed trading simulations using gradient boosting and genetic algorithms. We collected various stock market data from stocks listed on the KOSPI and KOSDAQ markets, including investor-specific transaction data. Subsequently, we indexed the data as a preprocessing step, and used feature engineering to modify and generate variables for training. First, we experimentally compared the performance of three popular gradient boosting algorithms (XGBoost, LightGBM, and CatBoost) in terms of accuracy, precision, recall, and F1-score. Based on the results, in a second experiment, we used a LightGBM model trained on the collected data along with genetic algorithms to predict and select stocks with a high daily probability of profit. We also conducted simulations of trading during the period of the testing data to analyze the performance of the proposed approach compared with the KOSPI and KOSDAQ indices in terms of the CAGR (Compound Annual Growth Rate), MDD (Maximum Drawdown), Sharpe ratio, and volatility. The results showed that the proposed strategies outperformed the Korean stock market indices in terms of all performance metrics. Moreover, our proposed LightGBM model with a genetic algorithm exhibited competitive performance in predicting stock price movements.
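
The four evaluation metrics named in this abstract have standard textbook definitions, sketched below. The abstract does not state the paper's conventions, so the 252 trading days per year, the zero risk-free rate, the use of simple period returns, and the sample value series are all assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def cagr(values, periods_per_year=252):
    """Compound annual growth rate of a portfolio value series."""
    years = (len(values) - 1) / periods_per_year
    return (values[-1] / values[0]) ** (1 / years) - 1


def max_drawdown(values):
    """Largest peak-to-trough decline, as a fraction of the running peak."""
    peak, worst = values[0], 0.0
    for v in values:
        peak = max(peak, v)
        worst = max(worst, (peak - v) / peak)
    return worst


def period_returns(values):
    """Simple returns between consecutive observations."""
    return [b / a - 1 for a, b in zip(values, values[1:])]


def volatility(values, periods_per_year=252):
    """Annualized standard deviation of period returns."""
    return stdev(period_returns(values)) * sqrt(periods_per_year)


def sharpe_ratio(values, risk_free_rate=0.0, periods_per_year=252):
    """Annualized mean excess return divided by its standard deviation."""
    returns = period_returns(values)
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    return mean(excess) / stdev(excess) * sqrt(periods_per_year)


# Hypothetical daily portfolio values.
values = [100.0, 110.0, 99.0, 120.0, 90.0, 108.0]
print(max_drawdown(values))  # 0.25: the fall from 120 to 90
print(cagr(values), volatility(values), sharpe_ratio(values))
```

A lower MDD and volatility and a higher CAGR and Sharpe ratio are the favorable directions when a strategy is compared against an index over the same period.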

Malware classification using statistical techniques (통계적 기법을 이용한 악성 소프트웨어 분류)

  • Won, Sungmin;Kim, Hyunjoo;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.30 no.6 / pp.851-865 / 2017
  • Ransomware such as WannaCry is a global issue, and methods to defend against malware attacks are important. We have to be able to classify malware types efficiently in order to minimize the damage from malware. This study builds models to classify malware properly using various statistical techniques. Several classification techniques such as logistic regression, random forest, gradient boosting, and support vector machine are used to construct the models. This study also helps us understand the key variables for classifying the type of malicious software.

Prediction of Cryptocurrency Price Trend Using Gradient Boosting (그래디언트 부스팅을 활용한 암호화폐 가격동향 예측)

  • Heo, Joo-Seong;Kwon, Do-Hyung;Kim, Ju-Bong;Han, Youn-Hee;An, Chae-Hun
    • KIPS Transactions on Software and Data Engineering / v.7 no.10 / pp.387-396 / 2018
  • Stock price prediction has long been a difficult problem. There have been many studies that attempt to predict stock prices scientifically, but it is still impossible to predict the exact price. Recently, many types of cryptocurrency have been developed, beginning with Bitcoin, which is technically implemented on the concept of a distributed ledger. Various approaches have been attempted to predict cryptocurrency prices, ranging from the prediction techniques used in the traditional stock market to deep learning and reinforcement learning. Since the cryptocurrency market has many new features that are not present in the traditional stock market, there is a growing demand for new analytical techniques suited to it. In this study, we first collect and process price data for seven cryptocurrencies through Bithumb's API. Then, we use the gradient boosting model, a data-driven machine learning model, and let the model learn the changes in cryptocurrency price data. We also find the optimal model parameters in the verification step, and finally evaluate the prediction performance for cryptocurrency price trends.

A Study on Domestic Drama Rating Prediction (국내 드라마 시청률 예측 및 영향요인 분석)

  • Kang, Suyeon;Jeon, Heejeong;Kim, Jihye;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.28 no.5 / pp.933-949 / 2015
  • Audience rating competition in the domestic drama market has increased recently due to the introduction of commercial broadcasting and the diversification of channels. There is now a need for thorough study and analysis of audience ratings. In particular, a drama's rating is an important measure for producers and advertisers when estimating advertising costs. In this paper, we study drama rating prediction models using various data mining techniques such as linear regression, LASSO regression, random forest, and gradient boosting. The analysis results show that initial drama ratings are affected by structural elements such as broadcasting station and broadcasting time. Average drama ratings are also influenced by earlier public opinion, such as the number of internet searches about the drama.

Store Sales Prediction Using Gradient Boosting Model (그래디언트 부스팅 모델을 활용한 상점 매출 예측)

  • Choi, Jaeyoung;Yang, Heeyoon;Oh, Hayoung
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.2 / pp.171-177 / 2021
  • With the rapid development of machine learning, it has found diverse uses not only in industry but also in daily life. Applications of machine learning to financial data have also attracted interest. Herein, we apply machine learning algorithms to store sales data and present future applications for fintech enterprises. We use several methods for handling missing data and apply the gradient boosting algorithms XGBoost, LightGBM, and CatBoost to predict the future revenue of individual stores. We found that median imputation of missing data combined with the XGBoost algorithm gives the best accuracy. Both fintech enterprises and their customers can benefit from the proposed method: stores can receive financial assistance from fintech companies in advance, while these companies can offer financial support to the stores at low risk.
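
Median imputation, which this abstract reports as the best-performing treatment of missing data, can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: missing values are assumed to be encoded as None, the two columns and their values are hypothetical, and the gradient boosting model that would be trained on the filled data is not shown.

```python
from statistics import median


def impute_median(rows):
    """Replace None entries in each column with that column's median.

    Assumes every column has at least one observed (non-None) value.
    """
    columns = list(zip(*rows))
    medians = [median(v for v in col if v is not None) for col in columns]
    return [
        [m if v is None else v for v, m in zip(row, medians)]
        for row in rows
    ]


# Hypothetical store records with two numeric features per row.
rows = [
    [120.0, 3.0],
    [None, 5.0],
    [80.0, None],
    [100.0, 4.0],
]
filled = impute_median(rows)
print(filled)
```

The median is less sensitive than the mean to a few unusually large values, which may be one reason it suits skewed data such as sales figures.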

Exploration of Factors on Pre-service Science Teachers' Major Satisfaction and Academic Satisfaction Using Machine Learning and Explainable AI SHAP (머신러닝과 설명가능한 인공지능 SHAP을 활용한 사범대 과학교육 전공생의 전공만족도 및 학업만족도 영향요인 탐색)

  • Jibeom Seo;Nam-Hwa Kang
    • Journal of Science Education / v.47 no.1 / pp.37-51 / 2023
  • This study explored the factors influencing the major satisfaction and academic satisfaction of science education majors at the College of Education using two machine learning models, random forest and gradient boosting, together with SHAP. The analysis showed that the gradient boosting model performed better than the random forest, but the difference was not large. Factors influencing major satisfaction include 'satisfaction with science teachers in high school corresponding to the subject of one's major', 'motivation for teaching job', and 'age'. SHAP values were used to identify the influence of each variable, and results were derived both for the group as a whole and for individuals. The group-level and individual-level results can complement each other. Based on the research results, we propose implications for ways to support pre-service science teachers' major and academic satisfaction.