• Title/Summary/Keyword: GBM model (GBM 모형)

The Effects of the Previous Corporate Internal Reservation on the Current Dividend Rate - Using LEV as a moderating variable & Verification through DRF & GBM model (법인의 전기 사내유보가 당기 배당률에 미치는 영향 부채비율의 조절변수 효과 및 DRF & GBM 모델을 통한 검증)

  • Yoo, Joon-Soo;Jeong, Jae-Yeon
    • Journal of the Korea Convergence Society / v.8 no.10 / pp.215-223 / 2017
  • This article empirically analyzes the effect of the corporate earning return tax by examining the impact of the previous period's internal reservation on the current year's dividend rate. In addition, it tests the effectiveness of government policy using the leverage ratio as a moderating variable. DRF and GBM models were then used to verify the effect again. In the empirical analysis, OCF, ROE, and FOR are significant at the 99% level in models 1, 2, and 3, whereas ADV and MSE are not significant in any model. In the DRF and GBM results, DRF was higher than GBM in depth and leaves; in explanatory power, however, GBM was higher than DRF. Further study is required to examine the effect of government policy through time-series analysis over the period in which the reflux tax was enforced, from 2015 to 2017.

Predicting of the Severity of Car Traffic Accidents on a Highway Using Light Gradient Boosting Model (LightGBM 알고리즘을 활용한 고속도로 교통사고심각도 예측모델 구축)

  • Lee, Hyun-Mi;Jeon, Gyo-Seok;Jang, Jeong-Ah
    • The Journal of the Korea institute of electronic communication sciences / v.15 no.6 / pp.1123-1130 / 2020
  • This study aims to classify the severity of car crashes using five classification learning models. The dataset contains 21,013 vehicle crashes from 2015 to 2017, obtained from Korea Expressway Corporation, and LightGBM (Light Gradient Boosting Model) achieved the highest accuracy. In the LightGBM model, the number of involved vehicles, type of accident, incident location, incident lane type, types of accidents, and types of vehicles involved in accidents were identified as priority factors. Based on these results, a management strategy for responding to highway traffic accidents should be established through a consistent process for predicting accident severity. This study demonstrates the applicability of machine learning models for predicting the severity of car traffic accidents on highways and suggests that various machine learning techniques based on big data can be used in the future.

The Effects of the Previous Corporation Internal Reservation on the Current R&D Investment -Using EDU as a moderating variable & Verification through GBM model (법인의 전기 사내유보가 당기 연구개발 투자에 미치는 영향 - 교육훈련비의 조절변수 효과 및 GBM 모델을 통한 검증)

  • Yoo, Joon-Soo;Jeong, Jae-Yeon
    • Journal of the Korea Convergence Society / v.9 no.1 / pp.9-20 / 2018
  • The purpose of this paper is to analyze the effect of corporation internal reservation on R&D investment and, through empirical analysis, to determine how much effect the reflux tax has achieved. In addition, education and training expense was used as a moderating variable to assess the effectiveness of government policy. The study then re-examined the effect using a GBM model. According to the regression analysis, both the moderation and intervention effects were significant, and the interest cost and welfare & benefit cost variables had a meaningful impact at the 99% level in models 1, 2, and 3. On the other hand, the previous corporate internal reservation failed to show any significant result in any of the models. Similar results were obtained from the convergence-level GBM model applied as an additional analysis.

A Study on the Development of Traffic Volume Estimation Model Based on Mobile Communication Data Using Machine Learning (머신러닝을 이용한 이동통신 데이터 기반 교통량 추정 모형 개발)

  • Dong-seob Oh;So-sig Yoon;Choul-ki Lee;Yong-Sung CHO
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.22 no.4 / pp.1-13 / 2023
  • This study develops an optimal mobile-communication-based National Highway traffic volume estimation model using an ensemble-based machine learning algorithm. Based on information such as mobile communication data and VDS data, the LightGBM model was selected as the optimal model for estimating traffic volume. Evaluating traffic volume estimation performance at 96 points where VDS was installed gave a MAPE of 8.49% (accuracy 91.51%). On roads where VDS was not installed, traffic estimation accuracy was 92.6%.
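The two figures in the abstract above are consistent with accuracy being reported as 100 minus MAPE (100 - 8.49 = 91.51). The following is a minimal sketch of that relationship, assuming the standard definition of MAPE; the function name and the sample traffic counts are hypothetical and are not taken from the paper.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent.

    Assumes every actual value is non-zero (true for observed traffic counts).
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical VDS counts (observed) and model estimates.
observed = [1000.0, 2000.0, 4000.0]
estimated = [900.0, 2100.0, 4000.0]

error = mape(observed, estimated)   # per-point errors: 10%, 5%, 0%
accuracy = 100.0 - error            # "accuracy" as 100 - MAPE (assumed convention)
```

With these illustrative numbers the per-point errors average to 5%, so the corresponding accuracy is 95%.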

Prediction Model of CNC Processing Defects Using Machine Learning (머신러닝을 이용한 CNC 가공 불량 발생 예측 모델)

  • Han, Yong Hee
    • Journal of the Korea Convergence Society / v.13 no.2 / pp.249-255 / 2022
  • This study proposed an analysis framework for real-time prediction of CNC processing defects using machine learning-based models, which have recently attracted attention as processing defect prediction methods, and applied it to CNC machines. The analysis shows that the XGBoost, CatBoost, and LightGBM models share the best accuracy, precision, recall, F1 score, and AUC, and that among them the LightGBM model has the shortest execution time. This short run time has practical advantages such as reducing actual system deployment costs, reducing the probability of CNC machine damage through rapid prediction of defects, and increasing overall CNC machine utilization, confirming that the LightGBM model is the most effective machine learning model for CNC machines with only basic sensors installed. In addition, classification performance was maximized by an ensemble model consisting of LightGBM, ExtraTrees, k-Nearest Neighbors, and logistic regression models in situations with no restrictions on execution time and computing power.

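A Study on Predicting the Designation of Administrative Issues in the KOSDAQ Market Based on Machine Learning (머신러닝 기반 KOSDAQ 시장의 관리종목 지정 예측 연구)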

  • Yun, Yang-Hyeon;Kim, Tae-Gyeong;Kim, Su-Yeong;Park, Yong-Gyun
    • Korean Society of Business Venturing: Conference Proceedings (한국벤처창업학회:학술대회논문집) / 2021.11a / pp.185-187 / 2021
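  • The administrative issue designation system is a market regulation that warns of financial deterioration among listed companies, giving companies an opportunity to recover and alerting investors to investment risk. This study investigates the prediction of administrative issue designation using a sample of financial data from administrative-issue and non-administrative-issue companies. The methods used were logistic regression, decision tree, support vector machine, soft voting, random forest, and LightGBM. LightGBM, with a classification accuracy of 82.73%, was the best prediction model, and the decision tree, at 71.94%, had the lowest classification accuracy. In general, learning models using ensembles had higher predictive performance than single learning models.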

Random effect models for simple diffusions (단순 확산과정들에 대한 확률효과 모형)

  • Lee, Eun-Kyung;Lee, In Suk;Lee, Yoon Dong
    • The Korean Journal of Applied Statistics / v.31 no.6 / pp.801-810 / 2018
  • A diffusion is a random process used to model financial and physical phenomena. When constructing statistical models for repeatedly observed diffusion processes, random effects need to be considered. In this research, we introduce random parameters for an Ornstein-Uhlenbeck diffusion model and a geometric Brownian motion diffusion model. To apply maximum likelihood estimation, we build closed-form likelihoods by assuming appropriate distributions for the random effects. We applied the random effect models to data consisting of Dow Jones Industrial Average indices recorded daily over 27 years, from 1991 to 2017.
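As background for the geometric Brownian motion model named in the abstract above, the sketch below simulates a GBM path and recovers its drift and volatility by maximum likelihood, which is available in closed form because log returns are independent and normally distributed. It assumes fixed (non-random) parameters, so it is not the authors' random effect model; the function names, parameter values, and the 252-trading-day year are all illustrative assumptions.

```python
import math
import random


def simulate_gbm(s0, mu, sigma, dt, n_steps, rng):
    """Simulate a GBM path using the exact log-normal transition."""
    path = [s0]
    for _ in range(n_steps):
        z = rng.gauss(0.0, 1.0)
        step = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        path.append(path[-1] * math.exp(step))
    return path


def fit_gbm(path, dt):
    """Closed-form MLE of (mu, sigma) from equally spaced observations.

    Log returns are i.i.d. N((mu - sigma^2 / 2) * dt, sigma^2 * dt).
    """
    returns = [math.log(b / a) for a, b in zip(path, path[1:])]
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / n  # MLE divides by n
    sigma2 = var / dt
    mu = mean / dt + 0.5 * sigma2
    return mu, math.sqrt(sigma2)


# Hypothetical setup: 27 years of daily data, 252 trading days per year.
rng = random.Random(0)
dt = 1.0 / 252
path = simulate_gbm(100.0, 0.05, 0.2, dt, 252 * 27, rng)
mu_hat, sigma_hat = fit_gbm(path, dt)
```

Even with 27 years of daily data, the volatility estimate is far more precise than the drift estimate, since the precision of the drift depends on the total time span rather than on the number of observations.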

Analysis of Important Indicators of TCB Using GBM (일반화가속모형을 이용한 기술신용평가 주요 지표 분석)

  • Jeon, Woo-Jeong(Michael);Seo, Young-Wook
    • The Journal of Society for e-Business Studies / v.22 no.4 / pp.159-173 / 2017
  • To provide technology financing support to technology-based small and medium-sized venture companies, the government implemented the TCB evaluation, a kind of technology rating evaluation, through Kibo and qualified private TCBs. In this paper, we briefly review the current state of TCB evaluation and the available technology evaluation indicators accumulated in the Korea Credit Information Services (TDB), and then use multiple regression techniques to explore the indicators that have a significant effect on the technology rating score. The relative importance and classification accuracy of the indicators were then calculated by applying the key indicators as independent features to the generalized boosting model, a representative machine learning classifier, and assessing the class influence and fit of each model. The analysis found that the relative importance of the indicators did not differ significantly between the two models. However, the GBM model placed more weight than the regression model on the InnoBiz certification, R&D department, patent registration, and venture confirmation indicators.

Study on Predicting the Designation of Administrative Issue in the KOSDAQ Market Based on Machine Learning Based on Financial Data (머신러닝 기반 KOSDAQ 시장의 관리종목 지정 예측 연구: 재무적 데이터를 중심으로)

  • Yoon, Yanghyun;Kim, Taekyung;Kim, Suyeong
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.17 no.1 / pp.229-249 / 2022
  • This paper investigates machine learning models for predicting the designation of administrative issues in the KOSDAQ market. When a company in the Korean stock market is designated as an administrative issue, the market treats the event itself as negative information, causing losses to the company and its investors. The purpose of this study is to evaluate alternative methods for developing an artificial intelligence service that detects the possibility of such a designation early from companies' financial ratios and helps investors manage portfolio risk. The independent variables are 21 financial ratios representing profitability, stability, activity, and growth. Financial data of administrative-issue and non-administrative-issue companies are sampled from 2011 to 2020, the period in which K-IFRS was applied. Logistic regression, decision tree, support vector machine, random forest, and LightGBM are used to predict the designation of administrative issues. According to the results, LightGBM, with 82.73% classification accuracy, is the best prediction model, and the model with the lowest classification accuracy is the decision tree, at 71.94%. Among the top three variables by importance in the decision-tree-based learning models, the financial variables common to each model are ROE (net profit) and capital stock turnover ratio, which are relatively important variables in designating administrative issues. In general, learning models using ensembles had higher predictive performance than single learning models.

The Prediction of Export Credit Guarantee Accident using Machine Learning (기계학습을 이용한 수출신용보증 사고예측)

  • Cho, Jaeyoung;Joo, Jihwan;Han, Ingoo
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.83-102 / 2021
  • The government recently announced various policies for developing the big-data and artificial intelligence fields, giving the public a great opportunity through the disclosure of high-quality data held by public institutions. KSURE (Korea Trade Insurance Corporation) is a major public institution for financial policy in Korea and is strongly committed to backing export companies through various systems. Nevertheless, there are still few cases of realized business models based on big-data analyses. In this situation, this paper aims to develop a new business model that can be applied to ex-ante prediction of the likelihood of a credit guarantee insurance accident. We utilize internal data from KSURE, which supports export companies in Korea, and apply machine learning models. We then compare the performance of predictive models including Logistic Regression, Random Forest, XGBoost, LightGBM, and DNN (Deep Neural Network). For decades, many researchers have tried to find better models for predicting bankruptcy, since ex-ante prediction is crucial for corporate managers, investors, creditors, and other stakeholders. The prediction of financial distress or bankruptcy originated with Smith (1930), Fitzpatrick (1932), and Merwin (1942). One of the most famous models is Altman's Z-score model (Altman, 1968), which is based on multiple discriminant analysis and is still widely used in both research and practice. Altman proposes a score model that uses five key financial ratios to predict the probability of bankruptcy in the next two years. Ohlson (1980) introduces the logit model to address some limitations of previous models. Furthermore, Elmer and Borowski (1988) develop and examine a rule-based, automated system that conducts the financial analysis of savings and loans.
Since the 1980s, researchers in Korea have studied the prediction of financial distress and bankruptcy. Kim (1987) analyzes financial ratios and develops a prediction model. Han et al. (1995, 1996, 1997, 2003, 2005, 2006) construct prediction models using various techniques, including artificial neural networks. Yang (1996) introduces multiple discriminant analysis and the logit model, and Kim and Kim (2001) use artificial neural network techniques for ex-ante prediction of insolvent enterprises. Since then, many scholars have tried to predict financial distress or bankruptcy more precisely with diverse models such as Random Forest and SVM. One major distinction between our research and previous research is that we examine the predicted probability of default for each sample case, rather than only the classification accuracy of each model over the entire sample. Most predictive models in this paper achieve a classification accuracy of about 70% on the entire sample. Specifically, the LightGBM model shows the highest accuracy, 71.1%, and the Logit model the lowest, 69%. However, we confirm that these results are open to multiple interpretations. In this business context, more emphasis has to be placed on minimizing type 2 error, which causes more harmful operating losses for the guaranty company. Thus, we also compare classification accuracy after splitting the predicted probability of default into ten equal intervals. Examined by interval, the Logit model has the highest accuracy, 100%, for the 0~10% interval of the predicted probability of default, but a relatively low accuracy of 61.5% for the 90~100% interval.
On the other hand, Random Forest, XGBoost, LightGBM, and DNN give more desirable results: they show higher accuracy for both the 0~10% and 90~100% intervals of the predicted probability of default, and lower accuracy around a predicted probability of 50%. As for the distribution of samples across predicted probabilities, both LightGBM and XGBoost place a relatively large number of samples in the 0~10% and 90~100% intervals. Although Random Forest has an advantage in classification accuracy on a small number of cases, LightGBM or XGBoost could be the more desirable model because they classify a large number of cases into the two extreme intervals of the predicted probability of default, even allowing for their relatively low classification accuracy. Considering the importance of type 2 error and total prediction accuracy, XGBoost and DNN show superior performance, followed by Random Forest and LightGBM, while logistic regression performs worst. However, each predictive model has a comparative advantage under different evaluation standards. For instance, Random Forest shows almost 100% accuracy for samples expected to have a high probability of default. Collectively, we can construct more comprehensive ensemble models that combine multiple machine learning classifiers and use majority voting to maximize overall performance.
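The two evaluation ideas in the abstract above, classification accuracy within ten equal intervals of the predicted probability of default and majority voting across several classifiers, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the 0.5 decision threshold, the label coding (1 = default), and the sample data are all hypothetical.

```python
def accuracy_by_interval(probs, labels, n_bins=10, threshold=0.5):
    """Classification accuracy within equal-width predicted-probability bins.

    Returns {bin_index: (n_cases, accuracy)} for non-empty bins only.
    Bin 0 covers [0.0, 0.1), ..., and bin 9 covers [0.9, 1.0].
    """
    counts = [0] * n_bins
    hits = [0] * n_bins
    for p, y in zip(probs, labels):
        b = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        counts[b] += 1
        predicted = 1 if p >= threshold else 0
        hits[b] += int(predicted == y)
    return {b: (counts[b], hits[b] / counts[b])
            for b in range(n_bins) if counts[b]}


def majority_vote(model_predictions):
    """Combine 0/1 predictions from several models, one list per model.

    A case is labeled 1 (default) when more than half of the models say 1.
    Use an odd number of models to avoid ties.
    """
    n_models = len(model_predictions)
    return [int(2 * sum(votes) > n_models)
            for votes in zip(*model_predictions)]


# Hypothetical predicted default probabilities and true outcomes (1 = default).
probs = [0.03, 0.07, 0.48, 0.52, 0.95, 0.97]
labels = [0, 0, 1, 1, 1, 0]
per_bin = accuracy_by_interval(probs, labels)

# Hypothetical 0/1 predictions from three models for three cases.
ensemble = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
```

Reporting the case count alongside the accuracy for each interval matters for the comparison the abstract draws: a model can be very accurate in the extreme intervals while placing only a few cases there.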