
The Prediction of Export Credit Guarantee Accident using Machine Learning (기계학습을 이용한 수출신용보증 사고예측)

  • Cho, Jaeyoung;Joo, Jihwan;Han, Ingoo
    • Journal of Intelligence and Information Systems / v.27 no.1 / pp.83-102 / 2021
  • The government recently announced policies to develop the big-data and artificial intelligence fields, giving the public greater access to high-quality data held by public institutions. KSURE (Korea Trade Insurance Corporation) is a major public institution for financial policy in Korea and is strongly committed to backing export companies through various systems. Nevertheless, few business models based on big-data analysis have been realized so far. In this situation, this paper aims to develop a new business model that can be applied to ex-ante prediction of the likelihood of a credit guarantee insurance accident. We use internal data from KSURE, which supports export companies in Korea, and apply machine learning models. We then compare the performance of predictive models including Logistic Regression, Random Forest, XGBoost, LightGBM, and DNN (Deep Neural Network). For decades, many researchers have sought better models for predicting bankruptcy, since ex-ante prediction is crucial for corporate managers, investors, creditors, and other stakeholders. The prediction of financial distress or bankruptcy originated with Smith (1930), Fitzpatrick (1932), and Merwin (1942). One of the most famous models is Altman's Z-score model (Altman, 1968), which is based on multiple discriminant analysis and is still widely used in both research and practice. Altman proposes a score model that uses five key financial ratios to predict the probability of bankruptcy within the next two years. Ohlson (1980) introduces the logit model to address some limitations of previous models. Furthermore, Elmer and Borowski (1988) develop and examine a rule-based, automated system that conducts the financial analysis of savings and loans.
Since the 1980s, researchers in Korea have also studied the prediction of financial distress or bankruptcy. Kim (1987) analyzes financial ratios and develops a prediction model. Han et al. (1995, 1996, 1997, 2003, 2005, 2006) construct prediction models using various techniques, including artificial neural networks. Yang (1996) introduces multiple discriminant analysis and the logit model. Kim and Kim (2001) use artificial neural network techniques for ex-ante prediction of insolvent enterprises. Since then, many scholars have tried to predict financial distress or bankruptcy more precisely with diverse models such as Random Forest or SVM. One major distinction of our research from previous work is that we examine the predicted probability of default for each sample case, rather than only the classification accuracy of each model over the entire sample. Most predictive models in this paper show a classification accuracy of about 70% on the entire sample. Specifically, the LightGBM model shows the highest accuracy, 71.1%, and the Logit model the lowest, 69%. However, we find that these results are open to multiple interpretations. In the business context, more emphasis must be placed on minimizing type 2 error, which causes more harmful operating losses for the guaranty company. Thus, we also compare classification accuracy after splitting the predicted probability of default into ten equal intervals. Examined by interval, the Logit model has the highest accuracy, 100%, for a predicted probability of default of 0~10%, but a relatively low accuracy of 61.5% for a predicted probability of default of 90~100%.
On the other hand, Random Forest, XGBoost, LightGBM, and DNN give more desirable results: they show higher accuracy for both 0~10% and 90~100% of the predicted probability of default, and lower accuracy only where the predicted probability of default is around 50%. As for the distribution of samples across predicted probabilities of default, both LightGBM and XGBoost place a relatively large number of samples in the 0~10% and 90~100% intervals. Although the Random Forest model has an advantage in classification accuracy on a small number of cases, LightGBM or XGBoost could be the more desirable model because they classify a large number of cases into the two extreme intervals of the predicted probability of default, even allowing for their relatively low classification accuracy. Considering the importance of type 2 error and total prediction accuracy, XGBoost and DNN show superior performance. Random Forest and LightGBM follow with good results, and logistic regression shows the worst performance. However, each predictive model has a comparative advantage under different evaluation standards. For instance, the Random Forest model shows almost 100% accuracy for samples expected to have a high probability of default. Collectively, more comprehensive ensemble models can be constructed that combine multiple machine learning classifiers and apply majority voting to maximize overall performance.
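
The per-interval comparison and the majority-voting idea described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the 0.5 decision threshold, and the rule that ties go to "no default" are all assumptions introduced here.

```python
def accuracy_by_probability_bin(y_true, p_default, n_bins=10, threshold=0.5):
    """Group samples into equal-width bins of predicted default probability.

    Returns {bin_index: (n_samples, accuracy)} for non-empty bins only.
    Bin 0 covers 0~10%, bin 9 covers 90~100% when n_bins is 10.
    The threshold of 0.5 is an assumption, not taken from the paper.
    """
    counts = [0] * n_bins
    hits = [0] * n_bins
    for label, p in zip(y_true, p_default):
        # A probability of exactly 1.0 is placed in the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        predicted = 1 if p >= threshold else 0
        counts[idx] += 1
        hits[idx] += int(predicted == label)
    return {i: (counts[i], hits[i] / counts[i])
            for i in range(n_bins) if counts[i]}


def majority_vote(predictions_per_model):
    """Combine 0/1 predictions from several models, sample by sample.

    A sample is labeled 1 (default) only if more than half of the models
    say so; ties fall to 0, which is an assumption of this sketch.
    """
    return [1 if 2 * sum(votes) > len(votes) else 0
            for votes in zip(*predictions_per_model)]


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    labels = [0, 0, 1, 1, 1, 0]
    probs = [0.05, 0.08, 0.95, 0.97, 0.45, 0.55]
    for bin_index, (n, acc) in sorted(
            accuracy_by_probability_bin(labels, probs).items()):
        print(f"bin {bin_index}: n={n}, accuracy={acc:.2f}")
    print(majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]]))
```

Reporting the sample count next to each accuracy matters here, because the abstract weighs accuracy in the extreme intervals against how many cases each model actually places there.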

Studies on the Kiln Drying Characteristics of Several Commercial Woods of Korea (국산 유용 수종재의 인공건조 특성에 관한 연구)

  • Chung, Byung-Jae
    • Journal of the Korean Wood Science and Technology / v.2 no.2 / pp.8-12 / 1974
  • 1. If the internal stress in prongs whose ends touch each other is taken as unity, the internal stress developed in open prongs can be expressed as a ratio to that unity. On this basis, the following equation was derived. To apply it, the prongs should be cut as shown in Fig. 1, and A and B' measured as indicated in Fig. 1. The value becomes more precise as the angle $\theta$ becomes smaller. $CH=\frac{(A-B')(4W+A)(4W-A)}{2A[2W+(A-B')][2W-(A-B')]}\times 100\%$ where A is the thickness of the prong, B' is the distance between the two prongs shown in Fig. 1, and CH is the internal stress expressed as a percentage. If precision is not required, the equation can be simplified as follows. $CH=\frac{A-B'}{A}\times 200\%$ 2. Under scheduled drying conditions in the kiln, when the weight of a sample board is constant, the moisture content of the shell of a sample board with normal casehardening is lower than the equilibrium moisture content given by the Forest Products Laboratory, U. S. Department of Agriculture. This is usually true, especially for a thin sample board. A thick unseasoned or reverse casehardened sample does not follow this rule. 3. The comparison of drying rates for the five kinds of wood in Table 1 shows that these drying rates, i.e., the quantity of water evaporated per hour from a surface area of 1 square centimeter, rank in descending order of magnitude as follows. (1) Ginkgo biloba Linne (2) Diospyros kaki Thunberg (3) Pinus densiflora Sieb. et Zucc. (4) Larix kaempferi Sargent (5) Castanea crenata Sieb. et Zucc. For example, at a moisture content of 20 percent, the highest value, shown by Ginkgo biloba, is about 3.8 times that of Castanea crenata Sieb. et Zucc., which has the lowest value.
In particular, below a moisture content of 26 percent, the drying rate as a function of moisture content in percent is represented by a linear equation. All of these linear equations are highly significant when the coefficient of X, i.e., moisture content in percent, is tested. In Table 2, the symbols are as follows: Y is the quantity of water evaporated per hour from a surface area of 1 square centimeter, and X is the moisture content in percent. The drying rate is plotted against the moisture content in percent in Fig. 2. 4. Table 3 gives one hundred times the ratio (P%) of the number of samples occurring in the CH 4 class (from 76 to 100% of CH ratio) to the total number of samples tested that fall under the given SR ratio. (The P% indicated above is taken as the danger probability in percent.) The conclusion summarizing the above results is in Table 4. NOTE: In Table 4, columns 1, 2, and 3 have the following meanings, respectively. 1) Column 1 gives the minimum SR ratio at which the CH 4 class does not appear. 2) Column 2 gives the range of SR ratio that stays within the safety allowance of 30 percent. 3) Column 3 gives the lowest SR ratio that yields the maximum danger probability of 100 percent. From these results, it is clear that chestnut and larch form internal stress easily in comparison with persimmon and pine. However, considering that reverse casehardening occurred in fir and ginkgo under the same drying conditions as the others, it is deduced that fir and ginkgo form normal casehardening with difficulty in comparison with the other species tested. 5. All kinds of drying defects except casehardening develop when the internal stresses exceed the ultimate strength of the material under long-time loading.
Under drying conditions of $170^{\circ}F$ and lower humidity, the drying defects are not so severe. However, at $200^{\circ}F$ with lower humidity and without end coating, all sample boards develop severe drying defects. The chestnut in particular was very prone to drying defects such as casehardening and splitting.
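
The two casehardening equations in item 1 can be evaluated as in the sketch below. It is an illustration under stated assumptions rather than the author's procedure: the abstract does not define W, so it is treated here as a length taken from the prong geometry of Fig. 1 in the same unit as A and B', and the function names and sample measurements are hypothetical.

```python
def casehardening_percent(a, b_prime, w):
    """Full equation: CH in percent from prong measurements.

    a       -- thickness of the prong (A in the abstract)
    b_prime -- distance between the two prongs (B' in the abstract)
    w       -- the quantity W of the equation; its definition is not given
               in the abstract, so its meaning here is an assumption
    """
    diff = a - b_prime
    numerator = diff * (4 * w + a) * (4 * w - a)
    denominator = 2 * a * (2 * w + diff) * (2 * w - diff)
    return numerator / denominator * 100


def casehardening_percent_simplified(a, b_prime):
    """Simplified equation: CH = (A - B') / A * 200 percent."""
    return (a - b_prime) / a * 200


if __name__ == "__main__":
    # Hypothetical measurements, all in the same length unit.
    a, b_prime, w = 10.0, 8.0, 100.0
    print(casehardening_percent(a, b_prime, w))
    print(casehardening_percent_simplified(a, b_prime))
```

When W is large relative to A, the factor (4W+A)(4W-A) approaches 16W² and the bracketed terms in the denominator approach 4W², so the full expression reduces to the simplified one. That is why the two functions agree closely for the sample values above.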
