• Title/Summary/Keyword: Default Prediction Model


Corporate Bond Rating Using Various Multiclass Support Vector Machines (다양한 다분류 SVM을 적용한 기업채권평가)

  • Ahn, Hyun-Chul; Kim, Kyoung-Jae
    • Asia Pacific Journal of Information Systems / v.19 no.2 / pp.157-178 / 2009
  • Corporate credit rating is a very important factor in the market for corporate debt. Information concerning corporate operations is often disseminated to market participants through the changes in credit ratings published by professional rating agencies, such as Standard and Poor's (S&P) and Moody's Investor Service. Since these agencies generally charge a large fee for the service, and the periodically provided ratings sometimes do not reflect a company's default risk at the time, it may be advantageous for bond-market participants to be able to classify credit ratings before the agencies actually publish them. As a result, it is very important for companies (especially financial companies) to develop a proper credit rating model. From a technical perspective, credit rating constitutes a typical multiclass classification problem, because rating agencies generally have ten or more rating categories. For example, S&P's ratings range from AAA for the highest-quality bonds to D for the lowest-quality bonds. The professional rating agencies emphasize the importance of analysts' subjective judgments in the determination of credit ratings. In practice, however, a mathematical model that uses the financial variables of companies plays an important role in determining credit ratings, since it is convenient to apply and cost efficient. These financial variables include ratios that represent a company's leverage, liquidity, and profitability. Several statistical and artificial intelligence (AI) techniques have been applied as tools for predicting credit ratings. Among them, artificial neural networks are the most prevalent in finance because of their broad applicability to many business problems and their preeminent ability to adapt. However, artificial neural networks also have many drawbacks, including the difficulty of determining the values of the control parameters and the number of processing elements in each layer, as well as the risk of over-fitting. Of late, because of their robustness and high accuracy, support vector machines (SVMs) have become popular as a tool for generating accurate predictions. An SVM's solution may be globally optimal because SVMs seek to minimize structural risk, whereas artificial neural network models may tend to find locally optimal solutions because they seek to minimize empirical risk. In addition, no parameters need to be tuned in SVMs, barring the upper bound for non-separable cases in linear SVMs. Since SVMs were originally devised for binary classification, however, they are not intrinsically geared toward multiclass classification problems such as credit rating. Thus, researchers have tried to extend the original SVM to multiclass classification. Hitherto, a variety of techniques for extending standard SVMs to multiclass SVMs (MSVMs) has been proposed in the literature, but only a few types of MSVM have been tested in prior studies that apply MSVMs to credit rating. In this study, we examined six different MSVM techniques: (1) One-Against-One, (2) One-Against-All, (3) DAGSVM, (4) ECOC, (5) the method of Weston and Watkins, and (6) the method of Crammer and Singer. In addition, we examined the prediction accuracy of some modified versions of conventional MSVM techniques. To find the most appropriate MSVM technique for corporate bond rating, we applied all of these techniques to a real-world case of credit rating in Korea. The application is corporate bond rating, which is the most frequently studied area of credit rating for specific debt issues or other financial obligations. For our study, the research data were collected from National Information and Credit Evaluation, Inc., a major bond-rating company in Korea. The data set comprises the bond ratings for the year 2002 and various financial variables for 1,295 companies from the manufacturing industry in Korea. We compared the results of these techniques with one another, and with those of traditional methods for credit rating, such as multiple discriminant analysis (MDA), multinomial logistic regression (MLOGIT), and artificial neural networks (ANNs). As a result, we found that DAGSVM with an ordered list was the best approach for predicting bond ratings. In addition, we found that the modified version of the ECOC approach can yield higher prediction accuracy for cases showing clear patterns.
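The sketch below is a minimal illustration of how some of the multiclass SVM schemes examined in this abstract (One-Against-One, One-Against-All, and the Crammer-Singer single-machine formulation) can be assembled with scikit-learn. It is not the authors' implementation: the NICE bond-rating data set is replaced by synthetic data, and the kernel choices and hyperparameters are placeholders; DAGSVM and ECOC variants are not shown because scikit-learn has no off-the-shelf versions of them.

```python
# Minimal sketch of multiclass SVM schemes on synthetic "rating" data.
# Illustration only: the paper's NICE data, features, and hyperparameters
# are not reproduced here.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier
from sklearn.svm import SVC, LinearSVC
from sklearn.metrics import accuracy_score

# Synthetic stand-in: 5 rating classes, 10 financial-ratio-like features.
X, y = make_classification(n_samples=1000, n_features=10, n_informative=6,
                           n_classes=5, n_clusters_per_class=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=0)

models = {
    "One-Against-One": OneVsOneClassifier(SVC(kernel="rbf", C=1.0, gamma="scale")),
    "One-Against-All": OneVsRestClassifier(SVC(kernel="rbf", C=1.0, gamma="scale")),
    # Single-machine formulation of Crammer and Singer (linear kernel only).
    "Crammer-Singer": LinearSVC(multi_class="crammer_singer", max_iter=10000),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: hit ratio = {acc:.3f}")
```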

Feasibility of Deep Learning Algorithms for Binary Classification Problems (이진 분류문제에서의 딥러닝 알고리즘의 활용 가능성 평가)

  • Kim, Kitae; Lee, Bomi; Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.23 no.1 / pp.95-108 / 2017
  • Recently, AlphaGo, the Go (Baduk) artificial intelligence program by Google DeepMind, won a landmark victory against Lee Sedol. Many people thought that a machine would not be able to beat a human at Go because, unlike chess, the number of possible move sequences exceeds the number of atoms in the universe, but the result was the opposite of what people predicted. After the match, artificial intelligence came into focus as a core technology of the fourth industrial revolution and attracted attention from various application domains. In particular, deep learning drew attention as the core artificial intelligence technique used in the AlphaGo algorithm. Deep learning is already being applied to many problems and shows especially good performance in image recognition. It also performs well on high-dimensional data such as voice, images, and natural language, where it was difficult to obtain good performance with existing machine learning techniques. In contrast, however, it is difficult to find deep learning research on traditional business data and structured data analysis. In this study, we tried to find out whether the deep learning techniques studied so far can be used not only for the recognition of high-dimensional data but also for binary classification problems in traditional business data analysis, such as customer churn analysis, marketing response prediction, and default prediction, and we compare the performance of deep learning techniques with that of traditional artificial neural network models. The experimental data are the telemarketing response data of a bank in Portugal. They include input variables such as age, occupation, loan status, and the number of previous telemarketing contacts, and a binary target variable that records whether the customer intends to open an account. To evaluate the applicability of deep learning algorithms and techniques to binary classification problems, we compared the performance of various models using CNN, LSTM, and dropout, which are widely used deep learning algorithms and techniques, with that of MLP models, a traditional artificial neural network architecture. However, since not all network design alternatives can be tested given the nature of artificial neural networks, the experiment was conducted with restricted settings for the number of hidden layers, the number of neurons per hidden layer, the number of filters, and the conditions under which the dropout technique was applied. The F1 score, rather than overall accuracy, was used to evaluate how well the models classify the class of interest. The detailed methods for applying each deep learning technique in the experiment are as follows. The CNN algorithm recognizes features by reading values adjacent to a given value, but the distance between business data fields carries little meaning because each field is usually independent. In this experiment, we therefore set the filter size of the CNN equal to the number of fields so that the characteristics of an entire record are learned at once, and added a hidden layer to make decisions based on the extracted features. For the model with two LSTM layers, the input direction of the second layer is reversed relative to the first layer in order to reduce the influence of each field's position. For the dropout technique, neurons in each hidden layer were dropped with a probability of 0.5. The experimental results show that the model with the highest F1 score was the CNN model with dropout, followed by the MLP model with two hidden layers and dropout. Several findings emerged from the experiments. First, models using dropout make slightly more conservative predictions than those without it and generally show better classification performance. Second, CNN models show better classification performance than MLP models; this is interesting because CNNs performed well on binary classification problems to which they have rarely been applied, as well as in the fields where their effectiveness has already been proven. Third, the LSTM algorithm seems unsuitable for binary classification problems because the training time is too long relative to the performance improvement. From these results, we can confirm that some deep learning algorithms can be applied to solve business binary classification problems.
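The following sketch shows the kind of setup described above: a 1-D CNN whose single filter spans all input fields, followed by a dense hidden layer with dropout (rate 0.5), scored with the F1 measure. It assumes Keras/TensorFlow and uses random placeholder data in place of the Portuguese bank telemarketing set; the filter count, layer sizes, and training settings are illustrative, not the paper's exact configuration.

```python
# Sketch: CNN with a full-width filter plus dropout for tabular binary
# classification, evaluated with F1. Placeholder data stands in for the
# Portuguese bank telemarketing set used in the paper.
import numpy as np
import tensorflow as tf
from sklearn.metrics import f1_score

n_samples, n_fields = 2000, 16                       # illustrative sizes
rng = np.random.default_rng(0)
X = rng.normal(size=(n_samples, n_fields, 1)).astype("float32")
y = (rng.random(n_samples) < 0.2).astype("int32")    # imbalanced binary target

model = tf.keras.Sequential([
    tf.keras.Input(shape=(n_fields, 1)),
    # Filter width equals the number of fields, so each filter reads the
    # whole record at once (as in the paper's CNN setting).
    tf.keras.layers.Conv1D(filters=16, kernel_size=n_fields, activation="relu"),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(32, activation="relu"),     # extra hidden layer
    tf.keras.layers.Dropout(0.5),                     # dropout rate from the paper
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")
model.fit(X, y, epochs=5, batch_size=64, verbose=0)

pred = (model.predict(X, verbose=0).ravel() > 0.5).astype("int32")
print("F1 score:", f1_score(y, pred))
```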

Statistical Analysis of Extreme Values of Financial Ratios (재무비율의 극단치에 대한 통계적 분석)

  • Joo, Jihwan
    • Knowledge Management Research / v.22 no.2 / pp.247-268 / 2021
  • Investors mainly use PER and PBR among financial ratios for valuation and investment decision-making. I analyze these two basic financial ratios from a statistical perspective. Financial ratios contain key accounting numbers that reflect firm fundamentals and are useful for valuation and risk analysis, such as enterprise credit evaluation and default prediction. The distribution of financial data tends to be extremely heavy-tailed; PER and PBR show exceedingly high levels of kurtosis, and their extreme cases often contain significant information on financial risk. In this respect, Extreme Value Theory is required to fit their right tails more precisely. I introduce not only the GPD but also the exGPD. The GPD is the conventionally preferred model in Extreme Value Theory, and the exGPD, the log-transformed distribution of the GPD, has recently been proposed as an alternative to the GPD (Lee and Kim, 2019). First, I conduct a simulation comparing the performance of the two distributions using goodness-of-fit measures and the estimation of the 90th to 99th percentiles. I also conduct an empirical analysis of Information Technology firms in Korea. Finally, the exGPD shows better performance, especially for PBR, suggesting that the exGPD could be an alternative to the GPD for the analysis of financial ratios.
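A minimal peaks-over-threshold sketch of the GPD side of this comparison, using SciPy's genpareto on simulated heavy-tailed data standing in for a PBR sample. The threshold choice and the tail-quantile formula are standard EVT practice rather than the paper's exact procedure, and the exGPD (the log-transformed GPD of Lee and Kim, 2019) is not shown because SciPy has no off-the-shelf implementation of it.

```python
# Peaks-over-threshold sketch: fit a GPD to right-tail exceedances and
# estimate extreme percentiles. Simulated heavy-tailed data stands in
# for the PBR sample analyzed in the paper.
import numpy as np
from scipy.stats import genpareto, pareto

rng = np.random.default_rng(1)
pbr = pareto.rvs(b=2.5, size=5000, random_state=rng)   # heavy-tailed stand-in

u = np.quantile(pbr, 0.90)             # threshold at the 90th percentile
exceedances = pbr[pbr > u] - u
shape, _, scale = genpareto.fit(exceedances, floc=0)   # fit GPD to the tail

# Tail quantile implied by the fitted GPD:
# for p above the threshold probability F(u),
#   x_p = u + GPD^{-1}((p - F_u) / (1 - F_u)).
F_u = np.mean(pbr <= u)
for p in (0.95, 0.99):
    x_p = u + genpareto.ppf((p - F_u) / (1 - F_u), shape, loc=0, scale=scale)
    print(f"estimated {int(p * 100)}th percentile: {x_p:.2f} "
          f"(empirical: {np.quantile(pbr, p):.2f})")
```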

Prediction of Potential Risk Posed by a Military Gunnery Range after Flood Control Reservoir Construction (홍수조절지 건설 후 사격장 주변지역의 위해성예측 사례연구)

  • Ryu, Hye-Rim; Han, Joon-Kyoung; Nam, Kyoung-Phile; Bae, Bum-Han
    • Journal of Soil and Groundwater Environment / v.12 no.1 / pp.87-96 / 2007
  • A risk assessment was carried out in order to improve the remediation and management strategy for a contaminated gunnery site near which a flood control reservoir is under construction. Six chemicals, including explosives and heavy metals, suspected of posing a risk to humans through leaching from the site, were the target pollutants for the assessment. A site-specific conceptual site model was constructed based on effective, reasonable exposure pathways to avoid any overestimation of the risk. Also, conservative default values were adopted to prevent underestimation of the risk when site-specific values were not available. The risks of the six contaminants were calculated with API's Decision Support System for Exposure and Risk Assessment under several assumptions. In the crater-formed area (Ac), the non-carcinogenic risks (i.e., HI values) of TNT (trinitrotoluene) and Cd were slightly larger than 1, and that of RDX (Royal Demolition Explosive) was over 50. The total non-carcinogenic risk of the whole gunnery range was calculated to be a significantly high value of 62.5. The carcinogenic risk of Cd was estimated to be about 10⁻³, while that of Pb was about 5 × 10⁻⁴, both greatly exceeding the generally acceptable carcinogenic risk range of 10⁻⁴ to 10⁻⁶. The risk assessment results suggest that immediate remediation of both carcinogens and non-carcinogens is required before the reservoir construction. However, for a more accurate risk assessment, more specific estimates of the condition changes caused by the construction of the reservoir are required, and the effects of the pollutants on the ecosystem also need to be evaluated.
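As a hypothetical illustration of the arithmetic behind figures like those reported above, the sketch below computes hazard quotients, a hazard index, and carcinogenic risks from the standard relations HQ = intake / RfD, HI = ΣHQ, and risk = intake × slope factor. All intake, reference-dose, and slope-factor values are made-up placeholders, not the study's inputs or the outputs of API's Decision Support System.

```python
# Hypothetical illustration of standard risk-assessment arithmetic:
# hazard quotient HQ = chronic daily intake / reference dose,
# hazard index HI = sum of HQs, carcinogenic risk = intake * slope factor.
# All numbers below are placeholders, not values from the study.

# chemical: (chronic daily intake [mg/kg-day], reference dose [mg/kg-day])
noncarcinogens = {
    "TNT": (6.0e-4, 5.0e-4),
    "RDX": (1.5e-1, 3.0e-3),
    "Cd":  (1.2e-3, 1.0e-3),
}
# chemical: (chronic daily intake [mg/kg-day], slope factor [per mg/kg-day])
carcinogens = {
    "Cd": (2.0e-3, 0.5),
    "Pb": (6.0e-2, 8.5e-3),
}

hazard_quotients = {chem: cdi / rfd for chem, (cdi, rfd) in noncarcinogens.items()}
hazard_index = sum(hazard_quotients.values())
cancer_risks = {chem: cdi * sf for chem, (cdi, sf) in carcinogens.items()}

for chem, hq in hazard_quotients.items():
    print(f"{chem}: HQ = {hq:.2f}")          # HQ > 1 flags potential concern
print(f"Hazard index (HI) = {hazard_index:.2f}")
for chem, risk in cancer_risks.items():
    flag = "above" if risk > 1e-4 else "at or below"
    print(f"{chem}: cancer risk = {risk:.1e} ({flag} the 1e-4 acceptability ceiling)")
```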