• Title/Summary/Keyword: DEFAULT PREDICTION


Influence of Housing Market Changes on Construction Company Insolvency (주택시장 변화가 규모별 건설업체 부실화에 미치는 영향 분석)

  • Jang, Ho-Myun
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.15 no.5
    • /
    • pp.3260-3269
    • /
    • 2014
  • The construction industry has strong ties with other industries, so construction company insolvency also strongly affects them. Models for predicting the insolvency of construction companies have been well studied, but although an understanding of the factors contributing to insolvency must precede its prediction, studies on those contributing factors remain limited. The purpose of this study is to analyze the influence of changes in the housing market on construction company insolvency using a Vector Error Correction Model (VECM). Construction companies were divided into two groups by size, and the expected default frequency (EDF), which indicates each company's insolvency, was measured through the KMV model. The results verified that the 10 largest construction companies were in better financial condition than relatively smaller companies. Impulse response analysis found the EDF of large companies to be more sensitive to housing market changes than that of small and medium-sized construction companies.
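
A minimal sketch of the KMV step mentioned above, assuming the Merton framing in which the EDF is the probability that asset value falls below the default point; the asset values, drift, and volatility figures are hypothetical, not taken from the paper:

```python
# Merton/KMV distance to default and EDF; all inputs are illustrative.
from math import log, sqrt
from scipy.stats import norm

def expected_default_frequency(V, D, mu, sigma, T=1.0):
    """EDF = probability that assets fall below the default point D within T years."""
    dd = (log(V / D) + (mu - 0.5 * sigma**2) * T) / (sigma * sqrt(T))
    return norm.cdf(-dd)

# Hypothetical large builder vs. a smaller, more leveraged one.
print(expected_default_frequency(V=1000, D=600, mu=0.05, sigma=0.25))
print(expected_default_frequency(V=100, D=80, mu=0.05, sigma=0.35))
```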

Assessment of Changed Input Modules with SMOKE Model (SMOKE 모델의 입력 모듈 변경에 따른 영향 분석)

  • Kim, Ji-Young;Kim, Jeong-Soo;Hong, Ji-Hyung;Jung, Dong-Il;Ban, Soo-Jin;Lee, Yong-Mi
    • Journal of Korean Society for Atmospheric Environment
    • /
    • v.24 no.3
    • /
    • pp.284-299
    • /
    • 2008
  • Emission input modules were developed to produce emission input data and modify selected profiles for the Sparse Matrix Operator Kernel Emissions (SMOKE) model, using activity data from the Clean Air Policy Support System (CAPSS) and previous studies. In particular, this study focused on improving SMOKE's chemical speciation and temporal allocation profiles. First, SCC code mapping was performed: 579 CAPSS SCC codes were matched with EPA codes. Temporal allocation profiles were then revised using CAPSS monthly activity data, and chemical speciation profiles were replaced using the studies of Kang et al. (2000), Lee et al. (2005), and Kim et al. (2005). To analyze the effect of the changed SMOKE input modules, a simulation of the Seoul Metropolitan Area (Seoul, Incheon, Gyeonggi) was conducted with the MM5, SMOKE, and CMAQ modeling system. Emission model results with the new input modules changed slightly compared with EPA's default modules: aldehyde emissions decreased by 4.78% after changing the chemical profiles and increased by 0.85% after implementing the new temporal profiles, while toluene emissions decreased by 18.56% with the new chemical speciation profiles and increased by 0.67% with the new temporal profiles. Simulated air quality results were also slightly elevated with the new input modules. Continued accumulation of domestic data and further studies on input systems for air quality modeling should yield better air quality predictions.
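
A hedged sketch of the temporal allocation idea behind the revised profiles: an annual emission total is split into months by a normalized monthly weight vector. The weights below are made up, standing in for CAPSS monthly activity data:

```python
import numpy as np

# Hypothetical monthly activity weights (e.g., heating-driven seasonality).
annual_nox_tons = 1200.0
monthly_weights = np.array([9, 8, 8, 8, 7, 7, 7, 7, 8, 9, 10, 11], dtype=float)
profile = monthly_weights / monthly_weights.sum()  # temporal allocation profile

monthly_emissions = annual_nox_tons * profile      # tons per month
assert abs(monthly_emissions.sum() - annual_nox_tons) < 1e-9
print(monthly_emissions.round(2))
```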

Comparisons of the corporate credit rating model power under various conditions (기준값 변화에 따른 기업신용평가모형 성능 비교)

  • Ha, Jeongcheol;Kim, Soojin
    • Journal of the Korean Data and Information Science Society
    • /
    • v.26 no.6
    • /
    • pp.1207-1216
    • /
    • 2015
  • This study compares model power in developing corporate credit rating models and suggests how best to build such models based on the characteristics of the data. Among many measures, AR (accuracy ratio) is used to gauge model power under various conditions, and SAS/MACRO is used to automate the repeated model builds across combinations of conditions. A corporate credit rating model is composed of two sub-models, a credit scoring model and a default prediction model, and we verify that the latter performs better than the former under various conditions. Size comparisons show that models for large corporations are more powerful and more meaningful from a financial viewpoint than those for small corporations; as corporate size decreases, the gap between the sub-models widens and the effect of outliers becomes serious.
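
Since the abstract measures model power with AR, a small sketch may help: for a binary default flag, AR can be computed as 2·AUC − 1. The scores and labels below are illustrative only:

```python
# AR (accuracy ratio) = 2*AUC - 1 for a binary default flag; toy data.
from sklearn.metrics import roc_auc_score

y_default = [0, 0, 1, 0, 1, 0, 0, 1, 0, 1]                      # 1 = defaulted
model_score = [0.1, 0.2, 0.8, 0.3, 0.7, 0.2, 0.4, 0.9, 0.1, 0.6]

ar = 2 * roc_auc_score(y_default, model_score) - 1
print(f"accuracy ratio (AR): {ar:.3f}")
```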

A Study on the Optimal Loan Limit Management Using the Newsvendor Model (뉴스벤더 모델을 이용한 최적 대출금 한도 관리에 관한 연구)

  • Sin, Jeong-Hun;Hwang, Seung-June
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.38 no.3
    • /
    • pp.39-48
    • /
    • 2015
  • In this study, the traditional newsvendor model was applied to set optimal loan limits on the SME (small and medium enterprise) loans of financial institutions. This is the first domestic case study to apply the newsvendor model, normally used to calculate optimal order quantities under uncertain demand, to the calculation of an institution's loan limit (debt ceiling). The method makes it possible to calculate the loan limit that maximizes a financial institution's revenue using probability functions, treating the order volume of merchandise in the newsvendor model as the order volume of the institution's loan products. Through the analysis of empirical data, it proposes criteria for extending additional loans to a borrower, reducing the debt ceiling, and managing recovery from borrowers who cannot generate profit. The profit-based loan management model also contributed, to some extent, to predicting the bankruptcy of borrowing SMEs as well as to calculating profit-based loan limits: during validation on empirical data, borrowers for whom the model had generated a loan recovery signal did in fact go bankrupt later. Accordingly, the method suggests a way to generate loan recovery signals that reduce losses from bankruptcy.
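
As a sketch of the newsvendor logic the study adapts, the classic critical-fractile rule sets the quantity at the demand quantile c_u / (c_u + c_o). The cost figures and the normal demand assumption below are illustrative, not the paper's calibration:

```python
# Critical-fractile rule: Q* = F^-1(c_u / (c_u + c_o)); illustrative costs.
from scipy.stats import norm

c_under = 0.06   # profit forgone per unit of loan demand left unmet (underage)
c_over = 0.02    # expected loss per unit of exposure beyond repayable demand (overage)
critical_fractile = c_under / (c_under + c_over)       # here 0.75

# Borrower funding demand assumed Normal(mean=100, sd=20), in KRW 100M units.
optimal_limit = norm.ppf(critical_fractile, loc=100, scale=20)
print(f"fractile={critical_fractile:.2f}, loan limit ~ {optimal_limit:.1f}")
```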

Dynamic forecasts of bankruptcy with Recurrent Neural Network model (RNN(Recurrent Neural Network)을 이용한 기업부도예측모형에서 회계정보의 동적 변화 연구)

  • Kwon, Hyukkun;Lee, Dongkyu;Shin, Minsoo
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.3
    • /
    • pp.139-153
    • /
    • 2017
  • Corporate bankruptcy can cause great losses not only to stakeholders but also to many related sectors of society. Through successive economic crises, bankruptcies have increased and bankruptcy prediction models have become more and more important, so corporate bankruptcy has been regarded as one of the major research topics in business management, with much related work also under way in industry. Previous studies attempted various methodologies to improve bankruptcy prediction accuracy and to resolve the overfitting problem, beginning with statistical methods such as Multivariate Discriminant Analysis (MDA) and the Generalized Linear Model (GLM). More recently, researchers have used machine learning methodologies such as the Support Vector Machine (SVM) and Artificial Neural Network (ANN), along with fuzzy theory and genetic algorithms; this shift produced many bankruptcy models and improved performance. In general, a company's financial and accounting information changes over time, as does the market situation, so predicting bankruptcy from information at a single point in time is difficult. Even though such static models ignore the time effect and therefore yield biased results, dynamic models have not been studied much; a dynamic model may therefore improve bankruptcy prediction. In this paper, we propose the RNN (Recurrent Neural Network), a deep learning methodology that learns time series data and is known to perform well. For estimating the bankruptcy prediction model and comparing forecasting performance, we selected non-financial firms listed on the KOSPI, KOSDAQ, and KONEX markets from 2010 to 2016. To avoid predicting bankruptcy from financial information that already reflects a deteriorated financial condition, the financial information was collected with a lag of two years, and the default period was defined as January to December of the year. We defined bankruptcy as delisting due to sluggish earnings, confirmed at KIND, the corporate stock information website. Variables were selected from previous papers: the first set consists of the traditional Z-score variables, and the second is a dynamic variable set. We selected 240 normal companies and 226 bankrupt companies for the first variable set, and 229 normal and 226 bankrupt companies for the second. We created a model that reflects dynamic changes in time-series financial data, and by comparing it with existing bankruptcy prediction models we found that it can help improve the accuracy of bankruptcy predictions. We used financial data from KIS Value (a financial database) and selected MDA, logistic regression (GLM), SVM, and ANN models as benchmarks. The experiment showed that the RNN outperformed the comparative models: its accuracy was high for both variable sets, its Area Under the Curve (AUC) was also high, and in the hit-ratio table the RNN's rate of correctly flagging a poor company as bankrupt was higher than that of the other models. A limitation of this paper is that an overfitting problem occurs during RNN training, but we expect to solve it by selecting more training data and appropriate variables. From these results, we expect this research to contribute to the development of bankruptcy prediction by proposing a new dynamic model.
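
A minimal sketch of the kind of RNN the paper proposes, reading a firm's multi-year ratio sequence and emitting a bankruptcy probability; the layer sizes, sequence length, and random placeholder data are assumptions, since the abstract does not specify the architecture:

```python
# Toy RNN over 5 years x 8 financial ratios per firm; data is random noise.
import numpy as np
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Dense, SimpleRNN

T, F = 5, 8                        # years observed, ratios per year (assumed)
X = np.random.rand(466, T, F)      # 466 firms ~ the paper's first sample size
y = np.random.randint(0, 2, 466)   # 1 = delisted due to sluggish earnings

model = Sequential([
    Input(shape=(T, F)),
    SimpleRNN(16),                  # reads the ratio sequence in time order
    Dense(1, activation="sigmoid"), # bankruptcy probability
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["AUC"])
model.fit(X, y, epochs=5, batch_size=32, verbose=0)
```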

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.29-45
    • /
    • 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining investors' returns, so predicting companies' credit ratings with statistical and machine learning techniques has been a popular research topic. Statistical techniques traditionally used in bond rating include multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis. One major drawback, however, is that they rest on strict assumptions: linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables to the predictor variables. These strict assumptions have limited their application to the real world. Machine learning techniques used in bond rating prediction include decision trees (DT), neural networks (NN), and the Support Vector Machine (SVM). SVM in particular is recognized as a new and promising classification and regression method. SVM learns a separating hyperplane that maximizes the margin between two categories; it is simple enough to be analyzed mathematically and achieves high performance in practical applications. SVM implements the structural risk minimization principle, seeking to minimize an upper bound on the generalization error; its solution may be a global optimum, so overfitting is unlikely to occur. In addition, SVM does not require many training samples, since it builds prediction models using only representative samples near the boundaries, called support vectors. Numerous experimental studies have shown that SVM applies successfully in a variety of pattern recognition fields. However, three major drawbacks can degrade SVM's performance. First, SVM was originally proposed for binary classification; methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not perform as well in multi-class problems as SVM does in binary classification. Second, approximation algorithms (e.g., decomposition methods, the sequential minimal optimization algorithm) can reduce the computation time of multi-class problems but may deteriorate classification performance. Third, multi-class prediction suffers from the data imbalance problem, which arises when the instances of one class greatly outnumber those of another; such data sets often skew the boundary toward a default classifier and thus reduce classification accuracy. SVM ensemble learning is one machine learning approach to these drawbacks. Ensemble learning improves the performance of classification and prediction algorithms. AdaBoost, one of the most widely used ensemble techniques, constructs a composite classifier by sequentially training classifiers while increasing the weight on misclassified observations through iterations, so that observations incorrectly predicted by previous classifiers are chosen more often than correctly predicted ones. Boosting thus produces new classifiers that better predict examples on which the current ensemble performs poorly, reinforcing the training of misclassified minority-class observations. This paper proposes multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem. Since MGM-Boost introduces the notion of the geometric mean into AdaBoost, its learning process considers geometric mean-based accuracy and errors across classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. Ten-fold cross validation was performed three times with different random seeds to ensure that the comparison among the three classifiers did not happen by chance: for each ten-fold cross validation, the entire data set is partitioned into ten equal-sized sets, and each set is in turn used as the test set while the classifier trains on the other nine, so the cross-validated folds are tested independently for each algorithm. Through these steps, we obtained results for the classifiers on each of the 30 experiments. In arithmetic mean-based prediction accuracy, MGM-Boost (52.95%) is higher than both AdaBoost (51.69%) and SVM (49.47%); in geometric mean-based prediction accuracy, MGM-Boost (28.12%) is also higher than AdaBoost (24.65%) and SVM (15.42%). A t-test was used to examine whether the performance of each classifier over the 30 folds differs significantly; the results indicate that MGM-Boost's performance differs significantly from the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
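
As a rough stand-in for MGM-Boost, the sketch below runs ordinary AdaBoost on an imbalanced multiclass problem and then evaluates it with the geometric mean of per-class recalls, the quantity MGM-Boost is designed to favor; MGM-Boost itself modifies the boosting weights, which this sketch does not:

```python
# AdaBoost on an imbalanced 3-class problem, scored by the geometric mean
# of per-class recalls (MGM-Boost's target quantity); data is synthetic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.metrics import recall_score

X, y = make_classification(n_samples=600, n_classes=3, n_informative=6,
                           weights=[0.7, 0.2, 0.1], random_state=0)
clf = AdaBoostClassifier(n_estimators=100, random_state=0).fit(X, y)

per_class_recall = recall_score(y, clf.predict(X), average=None)
g_mean = np.prod(per_class_recall) ** (1 / len(per_class_recall))
print(f"geometric mean of class recalls: {g_mean:.3f}")
```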

Corporate Bond Rating Using Various Multiclass Support Vector Machines (다양한 다분류 SVM을 적용한 기업채권평가)

  • Ahn, Hyun-Chul;Kim, Kyoung-Jae
    • Asia pacific journal of information systems
    • /
    • v.19 no.2
    • /
    • pp.157-178
    • /
    • 2009
  • Corporate credit ratings are a very important factor in the market for corporate debt. Information about corporate operations is often disseminated to market participants through the rating changes published by professional agencies such as Standard and Poor's (S&P) and Moody's Investor Service. Since these agencies generally charge large fees, and the periodically published ratings sometimes do not reflect a company's default risk at the time, it can be advantageous for bond-market participants to classify credit ratings before the agencies actually publish them. It is therefore very important for companies (especially financial companies) to develop a proper credit rating model. From a technical perspective, credit rating is a typical multiclass classification problem, because rating agencies generally use ten or more rating categories; S&P's ratings, for example, range from AAA for the highest-quality bonds to D for the lowest. The professional rating agencies emphasize analysts' subjective judgments in determining credit ratings, but in practice a mathematical model based on companies' financial variables plays an important role, since it is convenient to apply and cost efficient. These financial variables include ratios representing a company's leverage, liquidity, and profitability. Several statistical and artificial intelligence (AI) techniques have been applied to predicting credit ratings. Among them, artificial neural networks are the most prevalent in finance because of their broad applicability to many business problems and their preeminent ability to adapt, but they also have defects, including the difficulty of determining control parameter values and the number of processing elements per layer, as well as the risk of over-fitting. More recently, because of their robustness and high accuracy, support vector machines (SVMs) have become a popular solution for generating accurate predictions. An SVM's solution may be globally optimal because SVMs minimize structural risk, whereas artificial neural network models tend to find locally optimal solutions because they minimize empirical risk; moreover, no parameters need to be tuned in SVMs, barring the upper bound for non-separable cases in linear SVMs. Since SVMs were originally devised for binary classification, however, they are not intrinsically geared toward multiclass classifications such as credit ratings, and researchers have tried to extend the original SVM accordingly. A variety of techniques for extending standard SVMs to multiclass SVMs (MSVMs) has been proposed in the literature, but only a few types of MSVM have been tested in prior studies applying MSVMs to credit rating. In this study, we examined six MSVM techniques: (1) One-Against-One, (2) One-Against-All, (3) DAGSVM, (4) ECOC, (5) the method of Weston and Watkins, and (6) the method of Crammer and Singer. We also examined the prediction accuracy of modified versions of the conventional MSVM techniques. To find the most appropriate MSVM technique for corporate bond rating, we applied all of them to a real-world case of credit rating in Korea, specifically corporate bond rating, the most frequently studied area of credit rating for specific debt issues and other financial obligations. The research data were collected from National Information and Credit Evaluation, Inc., a major bond-rating company in Korea; the data set comprises the 2002 bond ratings and various financial variables for 1,295 manufacturing companies in Korea. We compared the results of these techniques with one another and with traditional credit rating methods such as multiple discriminant analysis (MDA), multinomial logistic regression (MLOGIT), and artificial neural networks (ANNs). We found that DAGSVM with an ordered list was the best approach for predicting bond ratings, and that the modified version of the ECOC approach can yield higher prediction accuracy for cases showing clear patterns.
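
Two of the six MSVM schemes compared above, One-Against-One and One-Against-All, can be sketched with scikit-learn wrappers around a binary SVC; the data here is synthetic, standing in for the 1,295 firms' financial ratios:

```python
# One-Against-One and One-Against-All multiclass SVMs via scikit-learn.
from sklearn.datasets import make_classification
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_classes=5, n_informative=8,
                           random_state=0)                # 5 rating grades
ovo = OneVsOneClassifier(SVC(kernel="rbf")).fit(X, y)     # k(k-1)/2 binary SVMs
ovr = OneVsRestClassifier(SVC(kernel="rbf")).fit(X, y)    # k binary SVMs
print(ovo.score(X, y), ovr.score(X, y))
```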

Statistical Analysis of Extreme Values of Financial Ratios (재무비율의 극단치에 대한 통계적 분석)

  • Joo, Jihwan
    • Knowledge Management Research
    • /
    • v.22 no.2
    • /
    • pp.247-268
    • /
    • 2021
  • Investors mainly use PER and PBR among financial ratios for valuation and investment decision-making; this paper analyzes these two basic ratios from a statistical perspective. Financial ratios contain key accounting numbers that reflect firm fundamentals and are useful for valuation and for risk analysis such as enterprise credit evaluation and default prediction. The distribution of financial data tends to be extremely heavy-tailed: PER and PBR show exceedingly high kurtosis, and their extreme cases often carry significant information on financial risk. In this respect, Extreme Value Theory is required to fit the right tail more precisely. I consider not only the GPD, the conventionally preferred model in Extreme Value Theory, but also the exGPD, the log-transformed distribution of the GPD recently proposed as an alternative (Lee and Kim, 2019). First, I conduct a simulation comparing the performance of the two distributions using goodness-of-fit measures and the estimation of the 90-99% percentiles; I then conduct an empirical analysis of Information Technology firms in Korea. The exGPD shows better performance, especially for PBR, suggesting that it could be an alternative to the GPD for the analysis of financial ratios.
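
A sketch of the peaks-over-threshold GPD fit underlying the paper's EVT setup, with simulated heavy-tailed data standing in for PER/PBR observations; the threshold choice and sample are assumptions:

```python
# Peaks-over-threshold GPD fit to simulated heavy-tailed "PER" data.
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(0)
per = rng.pareto(a=2.0, size=5000) * 10     # heavy-tailed stand-in for PER
threshold = np.quantile(per, 0.90)          # 90th-percentile threshold
excesses = per[per > threshold] - threshold

shape, loc, scale = genpareto.fit(excesses, floc=0)
# 99th percentile overall = threshold + 90% quantile of the excess distribution,
# since only 10% of observations exceed the threshold.
p99 = threshold + genpareto.ppf(0.9, shape, loc=0, scale=scale)
print(f"xi={shape:.3f}, sigma={scale:.3f}, estimated 99th pct={p99:.1f}")
```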

Feasibility of Deep Learning Algorithms for Binary Classification Problems (이진 분류문제에서의 딥러닝 알고리즘의 활용 가능성 평가)

  • Kim, Kitae;Lee, Bomi;Kim, Jong Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.23 no.1
    • /
    • pp.95-108
    • /
    • 2017
  • Recently AlphaGo, Google DeepMind's Go-playing artificial intelligence program, won a landmark match against Lee Sedol. Many people thought a machine could not beat a human at Go because, unlike chess, the number of possible games exceeds the number of atoms in the universe, but the result was the opposite of what people predicted. After the match, artificial intelligence was highlighted as a core technology of the fourth industrial revolution and attracted attention from various application domains, with deep learning, the core technique in the AlphaGo algorithm, drawing particular interest. Deep learning is already being applied to many problems: it performs well in image recognition and, more broadly, on high-dimensional data such as voice, images, and natural language, where existing machine learning techniques struggled. In contrast, it is difficult to find deep learning research on traditional business data and structured data analysis. In this study, we investigated whether deep learning techniques can be used not only for recognizing high-dimensional data but also for binary classification problems in traditional business data analysis, such as customer churn analysis, marketing response prediction, and default prediction, and we compared their performance with that of traditional artificial neural network models. The experimental data is the telemarketing response data of a bank in Portugal; it has input variables such as age, occupation, loan status, and the number of previous telemarketing contacts, and a binary target variable recording whether the customer intends to open an account. To evaluate the applicability of deep learning algorithms and techniques to binary classification, we compared models using CNN, LSTM, and dropout, widely used deep learning algorithms and techniques, with MLP models, the traditional artificial neural network. Since not all network design alternatives can be tested, the experiment used restricted settings for the number of hidden layers, the number of neurons per hidden layer, the number of output filters, and the conditions for applying dropout. The F1 score was used to evaluate how well the models classify the class of interest, instead of overall accuracy. Each deep learning technique was applied as follows. The CNN algorithm reads adjacent values around a specific value and recognizes features, but since business data fields are usually independent, the distance between fields does not matter; we therefore set the CNN filter size to the number of fields, learning the whole record's characteristics at once, and added a hidden layer to make decisions based on the extracted features. For the model with two LSTM layers, the input direction of the second layer was reversed relative to the first to reduce the influence of each field's position. For the dropout technique, neurons in each hidden layer were dropped with probability 0.5. The experimental results show that the model with the highest F1 score was the CNN model with dropout, followed by the MLP model with two hidden layers and dropout. Several findings emerged from the experiment. First, models using dropout make slightly more conservative predictions than those without it and generally classify better. Second, CNN models classify better than MLP models, which is interesting because CNN performed well not only in the fields where its effectiveness is proven but also in binary classification problems, to which it has rarely been applied. Third, the LSTM algorithm seems unsuitable for binary classification because its training time is too long relative to the performance improvement. From these results, we confirm that some deep learning algorithms can be applied to solve business binary classification problems.
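
A hedged sketch of the best-performing setup described above: a 1D CNN whose filter spans all input fields at once, an extra dense layer, dropout at 0.5, and F1 scoring. The field count, layer widths, and random placeholder data are assumptions:

```python
# 1D CNN with a whole-record filter, extra dense layer, and 0.5 dropout.
import numpy as np
from sklearn.metrics import f1_score
from tensorflow.keras import Input, Sequential
from tensorflow.keras.layers import Conv1D, Dense, Dropout, Flatten

n_fields = 16                              # assumed number of input fields
X = np.random.rand(1000, n_fields, 1)      # (samples, fields, 1 channel)
y = np.random.randint(0, 2, 1000)          # 1 = customer opens an account

model = Sequential([
    Input(shape=(n_fields, 1)),
    Conv1D(32, kernel_size=n_fields, activation="relu"),  # filter spans all fields
    Dropout(0.5),
    Flatten(),
    Dense(16, activation="relu"),          # extra hidden layer for decisions
    Dropout(0.5),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X, y, epochs=3, verbose=0)
pred = (model.predict(X, verbose=0) > 0.5).astype(int).ravel()
print("F1:", f1_score(y, pred))
```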

Prediction of Potential Risk Posed by a Military Gunnery Range after Flood Control Reservoir Construction (홍수조절지 건설 후 사격장 주변지역의 위해성예측 사례연구)

  • Ryu, Hye-Rim;Han, Joon-Kyoung;Nam, Kyoung-Phile;Bae, Bum-Han
    • Journal of Soil and Groundwater Environment
    • /
    • v.12 no.1
    • /
    • pp.87-96
    • /
    • 2007
  • A risk assessment was carried out to improve the remediation and management strategy for a contaminated gunnery site near which a flood control reservoir is under construction. Six chemicals, including explosives and heavy metals, suspected of posing a risk to humans through leaching from the site were the target pollutants. A site-specific conceptual site model was constructed from effective, reasonable exposure pathways to avoid overestimating the risk, while conservative default values were adopted, where site-specific values were unavailable, to prevent underestimating it. The risks of the six contaminants were calculated with API's Decision Support System for Exposure and Risk Assessment under several assumptions. In the crater-formed area (Ac), the non-carcinogenic risks (i.e., HI values) of TNT (trinitrotoluene) and Cd were slightly larger than 1, and that of RDX (Royal Demolition Explosive) was over 50; the total non-carcinogenic risk of the whole gunnery range came to a significantly high value of 62.5. The carcinogenic risk of Cd was estimated at about 10^-3 and that of Pb at about 5 × 10^-4, greatly exceeding the generally acceptable carcinogenic risk level of 10^-4 to 10^-6. These results suggest that immediate remediation for both carcinogens and non-carcinogens is required before the reservoir construction. For a more accurate risk assessment, however, more specific estimates of the condition shifts caused by the reservoir construction are needed, and the pollutants' effects on the ecosystem also need to be evaluated.
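
The non-carcinogenic figures above follow standard hazard-quotient arithmetic: each contaminant's HQ is its estimated intake divided by its reference dose, and the hazard index is their sum. A sketch with hypothetical intakes follows (the RfD values are common oral stand-ins, not the paper's inputs):

```python
# Hazard quotients (intake / RfD) and their sum, the hazard index (HI).
intake_mg_kg_day = {"TNT": 6.0e-4, "RDX": 1.6e-1, "Cd": 1.2e-3}  # hypothetical intakes
rfd_mg_kg_day = {"TNT": 5.0e-4, "RDX": 3.0e-3, "Cd": 1.0e-3}     # oral RfD stand-ins

hq = {c: intake_mg_kg_day[c] / rfd_mg_kg_day[c] for c in intake_mg_kg_day}
hi = sum(hq.values())
print({c: round(v, 1) for c, v in hq.items()})  # HQ > 1 flags non-cancer risk
print(f"hazard index HI = {hi:.1f}")
```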