• Title/Summary/Keyword: Bankruptcy prediction

Search Result 122, Processing Time 0.032 seconds

Randomized Bagging for Bankruptcy Prediction (랜덤화 배깅을 이용한 재무 부실화 예측)

  • Min, Sung-Hwan
    • Journal of Information Technology Services
    • /
    • v.15 no.1
    • /
    • pp.153-166
    • /
    • 2016
  • Ensemble classification is an approach that combines individually trained classifiers in order to improve prediction accuracy over individual classifiers. Ensemble techniques have been shown to be very effective in improving the generalization ability of the classifier. But base classifiers need to be as accurate and diverse as possible in order to enhance the generalization abilities of an ensemble model. Bagging is one of the most popular ensemble methods. In bagging, the different training data subsets are randomly drawn with replacement from the original training dataset. Base classifiers are trained on the different bootstrap samples. In this study we proposed a new bagging variant ensemble model, Randomized Bagging (RBagging) for improving the standard bagging ensemble model. The proposed model was applied to the bankruptcy prediction problem using a real data set and the results were compared with those of the other models. The experimental results showed that the proposed model outperformed the standard bagging model.

Generating and Validating Synthetic Training Data for Predicting Bankruptcy of Individual Businesses

  • Hong, Dong-Suk;Baik, Cheol
    • Journal of information and communication convergence engineering
    • /
    • v.19 no.4
    • /
    • pp.228-233
    • /
    • 2021
  • In this study, we analyze the credit information (loan, delinquency information, etc.) of individual business owners to generate voluminous training data to establish a bankruptcy prediction model through a partial synthetic training technique. Furthermore, we evaluate the prediction performance of the newly generated data compared to the actual data. When using conditional tabular generative adversarial networks (CTGAN)-based training data generated by the experimental results (a logistic regression task), the recall is improved by 1.75 times compared to that obtained using the actual data. The probability that both the actual and generated data are sampled over an identical distribution is verified to be much higher than 80%. Providing artificial intelligence training data through data synthesis in the fields of credit rating and default risk prediction of individual businesses, which have not been relatively active in research, promotes further in-depth research efforts focused on utilizing such methods.

SOHO Bankruptcy Prediction Using Modified Bagging Predictors (Modified Bagging Predictors를 이용한 SOHO 부도 예측)

  • Kim, Seung-Hyuk;Kim, Jong-Woo
    • Journal of Intelligence and Information Systems
    • /
    • v.13 no.2
    • /
    • pp.15-26
    • /
    • 2007
  • In this study, a SOHO (Small Office Home Office) bankruptcy prediction model is proposed using Modified Bagging Predictors which is modification of traditional Bagging Predictors. There have been several studies on bankruptcy prediction for large and middle size companies. However, little studies have been done for SOHOs. In commercial banks, loan approval processes for SOHOs are usually less structured than those for large and middle size companies, and largely depend on partial information such as credit scores. In this study, we use a real SOHO loan approval data set of a Korean bank. First, decision tree induction techniques and artificial neural networks are applied to the data set, and the results are not satisfactory. Bagging Predictors which has been not previously applied for bankruptcy prediction and Modified Bagging Predictors which is proposed in this paper are applied to the data set. The experimental results show that Modified Bagging Predictors provides better performance than decision tree inductions techniques, artificial neural networks, and Bagging Predictors.

  • PDF

Investigating Dynamic Mutation Process of Issues Using Unstructured Text Analysis (부도예측을 위한 KNN 앙상블 모형의 동시 최적화)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems
    • /
    • v.22 no.1
    • /
    • pp.139-157
    • /
    • 2016
  • Bankruptcy involves considerable costs, so it can have significant effects on a country's economy. Thus, bankruptcy prediction is an important issue. Over the past several decades, many researchers have addressed topics associated with bankruptcy prediction. Early research on bankruptcy prediction employed conventional statistical methods such as univariate analysis, discriminant analysis, multiple regression, and logistic regression. Later on, many studies began utilizing artificial intelligence techniques such as inductive learning, neural networks, and case-based reasoning. Currently, ensemble models are being utilized to enhance the accuracy of bankruptcy prediction. Ensemble classification involves combining multiple classifiers to obtain more accurate predictions than those obtained using individual models. Ensemble learning techniques are known to be very useful for improving the generalization ability of the classifier. Base classifiers in the ensemble must be as accurate and diverse as possible in order to enhance the generalization ability of an ensemble model. Commonly used methods for constructing ensemble classifiers include bagging, boosting, and random subspace. The random subspace method selects a random feature subset for each classifier from the original feature space to diversify the base classifiers of an ensemble. Each ensemble member is trained by a randomly chosen feature subspace from the original feature set, and predictions from each ensemble member are combined by an aggregation method. The k-nearest neighbors (KNN) classifier is robust with respect to variations in the dataset but is very sensitive to changes in the feature space. For this reason, KNN is a good classifier for the random subspace method. The KNN random subspace ensemble model has been shown to be very effective for improving an individual KNN model. The k parameter of KNN base classifiers and selected feature subsets for base classifiers play an important role in determining the performance of the KNN ensemble model. However, few studies have focused on optimizing the k parameter and feature subsets of base classifiers in the ensemble. This study proposed a new ensemble method that improves upon the performance KNN ensemble model by optimizing both k parameters and feature subsets of base classifiers. A genetic algorithm was used to optimize the KNN ensemble model and improve the prediction accuracy of the ensemble model. The proposed model was applied to a bankruptcy prediction problem by using a real dataset from Korean companies. The research data included 1800 externally non-audited firms that filed for bankruptcy (900 cases) or non-bankruptcy (900 cases). Initially, the dataset consisted of 134 financial ratios. Prior to the experiments, 75 financial ratios were selected based on an independent sample t-test of each financial ratio as an input variable and bankruptcy or non-bankruptcy as an output variable. Of these, 24 financial ratios were selected by using a logistic regression backward feature selection method. The complete dataset was separated into two parts: training and validation. The training dataset was further divided into two portions: one for the training model and the other to avoid overfitting. The prediction accuracy against this dataset was used to determine the fitness value in order to avoid overfitting. The validation dataset was used to evaluate the effectiveness of the final model. A 10-fold cross-validation was implemented to compare the performances of the proposed model and other models. To evaluate the effectiveness of the proposed model, the classification accuracy of the proposed model was compared with that of other models. The Q-statistic values and average classification accuracies of base classifiers were investigated. The experimental results showed that the proposed model outperformed other models, such as the single model and random subspace ensemble model.

Bankruptcy Prediction Model with AR process (AR 프로세스를 이용한 도산예측모형)

  • 이군희;지용희
    • Journal of the Korean Operations Research and Management Science Society
    • /
    • v.26 no.1
    • /
    • pp.109-116
    • /
    • 2001
  • The detection of corporate failures is a subject that has been particularly amenable to cross-sectional financial ratio analysis. In most of firms, however, the financial data are available over past years. Because of this, a model utilizing these longitudinal data could provide useful information on the prediction of bankruptcy. To correctly reflect the longitudinal and firm-specific data, the generalized linear model with assuming the first order AR(autoregressive) process is proposed. The method is motivated by the clinical research that several characteristics are measured repeatedly from individual over the time. The model is compared with several other predictive models to evaluate the performance. By using the financial data from manufacturing corporations in the Korea Stock Exchange (KSE) list, we will discuss some experiences learned from the procedure of sampling scheme, variable transformation, imputation, variable selection, and model evaluation. Finally, implications of the model with repeated measurement and future direction of research will be discussed.

  • PDF

Bankruptcy prediction using an improved bagging ensemble (개선된 배깅 앙상블을 활용한 기업부도예측)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems
    • /
    • v.20 no.4
    • /
    • pp.121-139
    • /
    • 2014
  • Predicting corporate failure has been an important topic in accounting and finance. The costs associated with bankruptcy are high, so the accuracy of bankruptcy prediction is greatly important for financial institutions. Lots of researchers have dealt with the topic associated with bankruptcy prediction in the past three decades. The current research attempts to use ensemble models for improving the performance of bankruptcy prediction. Ensemble classification is to combine individually trained classifiers in order to gain more accurate prediction than individual models. Ensemble techniques are shown to be very useful for improving the generalization ability of the classifier. Bagging is the most commonly used methods for constructing ensemble classifiers. In bagging, the different training data subsets are randomly drawn with replacement from the original training dataset. Base classifiers are trained on the different bootstrap samples. Instance selection is to select critical instances while deleting and removing irrelevant and harmful instances from the original set. Instance selection and bagging are quite well known in data mining. However, few studies have dealt with the integration of instance selection and bagging. This study proposes an improved bagging ensemble based on instance selection using genetic algorithms (GA) for improving the performance of SVM. GA is an efficient optimization procedure based on the theory of natural selection and evolution. GA uses the idea of survival of the fittest by progressively accepting better solutions to the problems. GA searches by maintaining a population of solutions from which better solutions are created rather than making incremental changes to a single solution to the problem. The initial solution population is generated randomly and evolves into the next generation by genetic operators such as selection, crossover and mutation. The solutions coded by strings are evaluated by the fitness function. The proposed model consists of two phases: GA based Instance Selection and Instance based Bagging. In the first phase, GA is used to select optimal instance subset that is used as input data of bagging model. In this study, the chromosome is encoded as a form of binary string for the instance subset. In this phase, the population size was set to 100 while maximum number of generations was set to 150. We set the crossover rate and mutation rate to 0.7 and 0.1 respectively. We used the prediction accuracy of model as the fitness function of GA. SVM model is trained on training data set using the selected instance subset. The prediction accuracy of SVM model over test data set is used as fitness value in order to avoid overfitting. In the second phase, we used the optimal instance subset selected in the first phase as input data of bagging model. We used SVM model as base classifier for bagging ensemble. The majority voting scheme was used as a combining method in this study. This study applies the proposed model to the bankruptcy prediction problem using a real data set from Korean companies. The research data used in this study contains 1832 externally non-audited firms which filed for bankruptcy (916 cases) and non-bankruptcy (916 cases). Financial ratios categorized as stability, profitability, growth, activity and cash flow were investigated through literature review and basic statistical methods and we selected 8 financial ratios as the final input variables. We separated the whole data into three subsets as training, test and validation data set. In this study, we compared the proposed model with several comparative models including the simple individual SVM model, the simple bagging model and the instance selection based SVM model. The McNemar tests were used to examine whether the proposed model significantly outperforms the other models. The experimental results show that the proposed model outperforms the other models.

An Empirical Analysis of Boosing of Neural Networks for Bankruptcy Prediction (부스팅 인공신경망학습의 기업부실예측 성과비교)

  • Kim, Myoung-Jong;Kang, Dae-Ki
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.14 no.1
    • /
    • pp.63-69
    • /
    • 2010
  • Ensemble is one of widely used methods for improving the performance of classification and prediction models. Two popular ensemble methods, Bagging and Boosting, have been applied with great success to various machine learning problems using mostly decision trees as base classifiers. This paper performs an empirical comparison of Boosted neural networks and traditional neural networks on bankruptcy prediction tasks. Experimental results on Korean firms indicated that the boosted neural networks showed the improved performance over traditional neural networks.

Support vector machines with optimal instance selection: An application to bankruptcy prediction

  • Ahn Hyun-Chul;Kim Kyoung-Jae;Han In-Goo
    • Proceedings of the Korea Inteligent Information System Society Conference
    • /
    • 2006.06a
    • /
    • pp.167-175
    • /
    • 2006
  • Building accurate corporate bankruptcy prediction models has been one of the most important research issues in finance. Recently, support vector machines (SVMs) are popularly applied to bankruptcy prediction because of its many strong points. However, in order to use SVM, a modeler should determine several factors by heuristics, which hinders from obtaining accurate prediction results by using SVM. As a result, some researchers have tried to optimize these factors, especially the feature subset and kernel parameters of SVM But, there have been no studies that have attempted to determine appropriate instance subset of SVM, although it may improve the performance by eliminating distorted cases. Thus in the study, we propose the simultaneous optimization of the instance selection as well as the parameters of a kernel function of SVM by using genetic algorithms (GAs). Experimental results show that our model outperforms not only conventional SVM, but also prior approaches for optimizing SVM.

  • PDF

Bankruptcy Prdiction Based on Limited Data of Artificial neural Network -in Textiles and Clothing Industries- (한정된 데이타하에서 인공신경망을 이용한 기업도산예측-섬유 및 의류산업을 중심으로-)

  • 피종호;김승권
    • Proceedings of the Korean Operations and Management Science Society Conference
    • /
    • 1996.04a
    • /
    • pp.733-736
    • /
    • 1996
  • Neural Network(NN) is known to be suitable for forecasting corporate bankruptcy because of discriminant capability. Bankruptcy prediciton on NN by now has mostly been studied based on financial indices at specific point of time. However, the financial profile of corporates fluctuates within a certain range with the elapse of time. Besides, we need a lot of data of different bankrupt types in order to apply NN for better bankruptcy prediciton. Therefore, we have decided to focus on textiles and clothing industries for bankruptcy prediction with limited data. One part of the collected data was used for training and calibration, and the other was used for verification. The model makes a learning with extended data from financial indices at specific point of time. The trained model has been tested and we could get a high hitting ratio relatively.

  • PDF

Investigation and Empirical Validation of Industry Uncertainty Risk Factors Impacting on Bankruptcy Risk of the Firm (기업부도위험에 영향을 미치는 산업 불확실성 위험요인의 탐색과 실증 분석)

  • Han, Hyun-Soo;Park, Keun-Young
    • Korean Management Science Review
    • /
    • v.33 no.3
    • /
    • pp.105-117
    • /
    • 2016
  • In this paper, we present empirical testing result to examine the validity of inbound supply and outbound demand risk factors in the sense of early predicting the firm's bankruptcy risk level. The risk factors are drawn from industry uncertainty attributes categorized as uncertainties of input market (inbound supply), and product market (outbound demand). On the basis of input-output table, industry level inbound and outbound sectors are identified to formalize supply chain structures, relevant inbound and outbound uncertainty attributes and corresponding risk factors. Subsequently, publicly available macro-economic indicators are used to appropriately quantify these risk factors. Total 68 industry level bankruptcy risk forecasting results are presented with the average R-square scores of between 53.4% and 37.1% with varying time lag. The findings offers useful insights to incorporate supply chain risk to the body of firm's bankruptcy risk level prediction literature.