• Title/Summary/Keyword: Profitability prediction model

Bankruptcy prediction using an improved bagging ensemble (개선된 배깅 앙상블을 활용한 기업부도예측)

  • Min, Sung-Hwan
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.121-139 / 2014
  • Predicting corporate failure has long been an important topic in accounting and finance. Because the costs associated with bankruptcy are high, the accuracy of bankruptcy prediction is of great importance to financial institutions, and many researchers have studied the problem over the past three decades. This research uses ensemble models to improve the performance of bankruptcy prediction. Ensemble classification combines individually trained classifiers to obtain more accurate predictions than any individual model, and ensemble techniques have been shown to be very useful for improving a classifier's generalization ability. Bagging is one of the most commonly used methods for constructing ensemble classifiers. In bagging, different training subsets are drawn randomly with replacement from the original training data set, and base classifiers are trained on these bootstrap samples. Instance selection keeps critical instances and removes irrelevant or harmful ones from the original set. Both instance selection and bagging are well known in data mining, but few studies have dealt with their integration. This study proposes an improved bagging ensemble based on instance selection using genetic algorithms (GA) to improve the performance of SVM. GA is an efficient optimization procedure based on the theory of natural selection and evolution. It applies the idea of survival of the fittest by progressively accepting better solutions to the problem. Rather than making incremental changes to a single solution, GA searches by maintaining a population of solutions from which better solutions are created. The initial population is generated randomly and evolves into the next generation through genetic operators such as selection, crossover, and mutation. Solutions, encoded as strings, are evaluated by a fitness function.
The proposed model consists of two phases: GA-based instance selection and instance-based bagging. In the first phase, GA selects the optimal instance subset to be used as input data for the bagging model. The chromosome is encoded as a binary string representing the instance subset. The population size was set to 100 and the maximum number of generations to 150; the crossover rate and mutation rate were set to 0.7 and 0.1, respectively. The prediction accuracy of the model serves as the GA fitness function: an SVM model is trained on the selected instance subset of the training data set, and its prediction accuracy on the test data set is used as the fitness value in order to avoid overfitting. In the second phase, the optimal instance subset selected in the first phase is used as input data for the bagging model, with SVM as the base classifier and majority voting as the combining method. This study applies the proposed model to the bankruptcy prediction problem using a real data set from Korean companies. The research data contain 1,832 externally non-audited firms, of which 916 filed for bankruptcy and 916 did not. Financial ratios categorized as stability, profitability, growth, activity, and cash flow were investigated through a literature review and basic statistical methods, and 8 financial ratios were selected as the final input variables. The whole data set was separated into three subsets: training, test, and validation. The proposed model was compared with several comparative models, including a simple individual SVM model, a simple bagging model, and an instance-selection-based SVM model. McNemar tests were used to examine whether the proposed model significantly outperforms the other models. The experimental results show that the proposed model outperforms the other models.
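
The two phases described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: all function names are hypothetical, a trivial threshold rule stands in for the SVM base classifier, the toy data are invented, and the GA search itself is omitted. Only the binary-chromosome decoding, bootstrap sampling, and majority voting are shown.

```python
import random
from collections import Counter

def decode_chromosome(chromosome, instances):
    """Keep the instances whose bit in the binary chromosome is 1."""
    return [inst for bit, inst in zip(chromosome, instances) if bit == 1]

def bootstrap_sample(instances, rng):
    """Draw len(instances) items at random with replacement."""
    return [rng.choice(instances) for _ in instances]

def majority_vote(predictions):
    """Combine per-classifier label lists into one voted label per test case."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

def train_threshold_rule(sample):
    """Stand-in for an SVM (assumption): cut at the midpoint of the class means.

    Assumes class 1 tends to have larger feature values than class 0.
    """
    pos = [x for x, y in sample if y == 1]
    neg = [x for x, y in sample if y == 0]
    if not pos or not neg:  # bootstrap sample contains a single class
        label = 1 if pos else 0
        return lambda x: label
    cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return lambda x: 1 if x >= cut else 0

def bagging_predict(instances, test_points, n_estimators=11, seed=0):
    """Train one base model per bootstrap sample and combine by majority vote."""
    rng = random.Random(seed)
    models = [train_threshold_rule(bootstrap_sample(instances, rng))
              for _ in range(n_estimators)]
    return majority_vote([[m(x) for x in test_points] for m in models])

# Toy data: (feature, label) with label 1 = bankrupt; values are illustrative only.
data = [(0.1, 0), (0.2, 0), (0.3, 0), (0.9, 1), (0.7, 1), (0.8, 1), (5.0, 0)]
chromosome = [1, 1, 1, 1, 1, 1, 0]  # a GA would search for this mask
selected = decode_chromosome(chromosome, data)
print(bagging_predict(selected, [0.15, 0.85]))
```

An odd number of base classifiers avoids ties in the vote for a two-class problem. In a full version, the GA would evaluate each candidate chromosome by the accuracy of a model trained on `decode_chromosome(chromosome, data)`.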

A Methodology of Customer Churn Prediction based on Two-Dimensional Loyalty Segmentation (이차원 고객충성도 세그먼트 기반의 고객이탈예측 방법론)

  • Kim, Hyung Su;Hong, Seung Woo
    • Journal of Intelligence and Information Systems / v.26 no.4 / pp.111-126 / 2020
  • Most industries have recently become aware of the importance of customer lifetime value as they face increasingly competitive environments. As a result, preventing customer churn is becoming a more important business issue than acquiring new customers. Retaining existing customers is far more economical than acquiring new ones; the acquisition cost of a new customer is known to be five to six times higher than the cost of retaining an existing one. Companies that effectively prevent churn and improve retention rates are also known to benefit not only from higher profitability but also from an improved brand image through greater customer satisfaction. Customer churn prediction, previously a sub-area of CRM research, has recently become more important as a big-data-based performance marketing theme thanks to the development of business machine learning technology. Until now, churn prediction research has been most active in sectors such as mobile telecommunications, finance, distribution, and gaming, which are highly competitive and where churn management is urgent. These studies focused on improving the performance of the churn prediction model itself, for example by comparing the performance of various models, exploring features that are effective in forecasting churn, or developing new ensemble techniques. They were limited in practical use because most of them treated the entire customer base as a single group when developing a predictive model. In short, the main purpose of existing research was to improve the predictive model itself, and there has been a relative lack of research on improving the overall churn prediction process.
In practice, customers have different behavioral characteristics because of heterogeneous transaction patterns, and their churn rates differ accordingly, so it is unreasonable to treat the entire customer base as a single group. To predict churn effectively in heterogeneous industries, it is therefore desirable to segment customers by classification criteria such as loyalty and to operate an appropriate churn prediction model for each segment. Some studies have subdivided customers using clustering techniques and applied a churn prediction model to each customer group. Although this process can produce better predictions than a single prediction model for the entire customer population, there is still room for improvement, because clustering is a mechanical, exploratory grouping technique that calculates distances from the inputs and does not reflect an organization's strategic intent, such as loyalty. Assuming that successful churn management is better achieved by improving the overall process than by improving the model alone, this study proposes a segment-based churn prediction process built on two-dimensional customer loyalty (CCP/2DL: Customer Churn Prediction based on Two-Dimensional Loyalty segmentation). CCP/2DL is a series of churn prediction steps that segments customers along two loyalty dimensions, quantitative and qualitative, performs a secondary grouping of the customer segments according to churn patterns, and then independently applies heterogeneous churn prediction models to each churn pattern group. To assess the relative merit of the proposed process, its performance was compared with the two most commonly applied alternatives, the General churn prediction process and the Clustering-based churn prediction process.
The General churn prediction process used in this study is the most common approach, in which a single machine learning model predicts churn for the whole target customer group. The Clustering-based churn prediction process first segments customers using clustering techniques and then builds a churn prediction model for each resulting group. In an evaluation carried out in cooperation with a global NGO, the proposed CCP/2DL outperformed the other methodologies in predicting churn. The proposed process is not only effective in predicting churn but can also serve as a strategic basis for gaining a variety of customer insights and carrying out other related performance marketing activities.
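
The flow of CCP/2DL (two-dimensional loyalty segmentation, secondary grouping by churn pattern, then a separate model per group) can be sketched as a simple routing structure. Everything concrete below is an assumption made for illustration and does not come from the abstract: the 2x2 grid, the 0.5 cut-off, the group names, and the placeholder rules that stand in for trained models.

```python
def loyalty_segment(quantitative, qualitative, cut=0.5):
    """Place a customer on an assumed 2x2 grid of loyalty scores in [0, 1]."""
    return ("high" if quantitative >= cut else "low",
            "high" if qualitative >= cut else "low")

# Assumed secondary grouping: segments with similar churn patterns share a group.
SEGMENT_TO_GROUP = {
    ("high", "high"): "stable",
    ("high", "low"): "at_risk",
    ("low", "high"): "at_risk",
    ("low", "low"): "volatile",
}

# Placeholder per-group models; real ones would be trained separately per group.
GROUP_MODELS = {
    "stable": lambda months_inactive: months_inactive >= 12,
    "at_risk": lambda months_inactive: months_inactive >= 6,
    "volatile": lambda months_inactive: months_inactive >= 3,
}

def predict_churn(quantitative, qualitative, months_inactive):
    """Route a customer to the model of its churn-pattern group."""
    group = SEGMENT_TO_GROUP[loyalty_segment(quantitative, qualitative)]
    return group, GROUP_MODELS[group](months_inactive)

print(predict_churn(0.9, 0.2, 6))
```

The point of the structure is that the same customer activity can yield different predictions depending on the loyalty segment, which a single model over the whole customer base would not capture.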

Probabilistic Location Choice and Markovian Industrial Migration: A Micro-Macro Composition Approach

  • Jeong, Jin-Ho
    • Journal of the Korean Regional Science Association / v.11 no.1 / pp.31-60 / 1995
  • The distribution of economic activity over a mutually exclusive and exhaustive categorical industry-region matrix is modeled as a composition of two random components: the probability-like share distribution of jobs and the dynamic evolution of absolute aggregates. The former describes individual activity location choice by comparing the predicted profitability of the current industry-region pair against that of all other alternatives, based on the available information on industry-specific, region-specific, or activity-specific attributes. The latter describes the time evolution of macro-level aggregates using a dynamic reduced-form model. With the separation of micro choice behavior from the macro dynamic aggregate constraint, the usual independence and identicality assumptions become consistent with the activity share distribution; hence multi-regional industrial migration can be represented by a set of probability evolution equations in a conservative Markovian form. We call this a Micro-Macro Composition Approach because the product of the aggregate prediction and the predicted activity share distribution gives the predicted activity distribution, which explicitly considers the underlying individual choice behavior. The model can be applied to interesting practical problems such as the plant location choice of multinational enterprises, government industrial policy to attract international firms, and the optimal tax-transfer mix to influence activity location choice. We consider the last of these as an example.
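
The composition described in this abstract can be sketched numerically: a share distribution over industry-region cells evolves through a Markov transition matrix, and the predicted activity distribution is the aggregate prediction multiplied by the predicted shares. The two-cell example, the transition probabilities, and the aggregate of 1,000 jobs below are hypothetical values chosen for illustration, and the function names are not from the paper.

```python
def evolve_shares(shares, transition):
    """One Markov step: p_next[j] = sum_i p[i] * P[i][j], with row-stochastic P."""
    n = len(shares)
    return [sum(shares[i] * transition[i][j] for i in range(n)) for j in range(n)]

def compose(aggregate, shares):
    """Predicted activity distribution = aggregate prediction x share distribution."""
    return [aggregate * s for s in shares]

# Two industry-region cells; each row of the transition matrix sums to 1,
# so total probability is conserved from one period to the next.
shares = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.2, 0.8]]

next_shares = evolve_shares(shares, transition)
jobs = compose(1000, next_shares)  # 1000 is an assumed macro-level aggregate
print(next_shares, jobs)
```

Because every row of the transition matrix sums to one, the shares still sum to one after each step, which is what makes the evolution conservative.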

Development of a Resort's Cross-selling Prediction Model and Its Interpretation using SHAP (리조트 교차판매 예측모형 개발 및 SHAP을 이용한 해석)

  • Boram Kang;Hyunchul Ahn
    • The Journal of Bigdata / v.7 no.2 / pp.195-204 / 2022
  • The tourism industry is facing a crisis due to the recent COVID-19 pandemic, and improving profitability is vital to overcoming it. In a situation such as COVID-19, raising the unit price by selling additional products other than guest rooms to customers who have already visited would be a more efficient way to increase profits than adopting an aggressive sales strategy to raise room occupancy. Previous tourism studies have used machine learning techniques for demand forecasting, but few have addressed cross-selling prediction. In addition, although a resort belongs, in a broad sense, to the same accommodation industry as a hotel, no study has focused specifically on the resort industry, which operates on a membership system and offers facilities suitable for both lodging and cooking. Therefore, in this study, we propose a cross-selling prediction model built with various machine learning techniques on an actual resort company's accommodation data. In addition, by applying an explainable artificial intelligence (XAI) technique, we interpret which factors affect cross-selling and confirm how they affect it through empirical analysis.

The Comparative Analysis of Financial Factors that influence on Corporate's Survival and Bankruptcy : Before and After Foreign Exchange Crisis in Korea (기업의 생존과 도산에 영향을 미치는 재무요인에 대한 실증분석 : 우리나라 외환위기 전.후 비교)

  • Bae, Young-Im;Song, Sung-Hwan;Hong, Soon-Ki;Yu, Sung-Yoon
    • IE interfaces / v.21 no.4 / pp.385-393 / 2008
  • Corporate survival or bankruptcy is determined by the interaction of the macroeconomic environment, industry dynamics, and a firm's internal processes. This study examines how the financial factors that influence corporate survival or bankruptcy differed before and after the foreign exchange crisis in Korea. The first empirical study to seek the causes of corporate survival or bankruptcy in financial ratios was conducted by Altman in 1968, and various survival analysis models have been published more recently. This paper uses a Multiple Discriminant Analysis model. We divide the analysis into periods before and after the foreign exchange crisis and randomly sample surviving and bankrupt firms for each period. The independent variables are financial ratios representing growth, profitability, activity, liquidity, and productivity. The paper tests the hypothesis that "There are differences of significant financial factors before and after foreign exchange crisis."

Study on Predicting the Designation of Administrative Issue in the KOSDAQ Market Based on Machine Learning Based on Financial Data (머신러닝 기반 KOSDAQ 시장의 관리종목 지정 예측 연구: 재무적 데이터를 중심으로)

  • Yoon, Yanghyun;Kim, Taekyung;Kim, Suyeong
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.17 no.1 / pp.229-249 / 2022
  • This paper investigates machine learning models for predicting the designation of administrative issues in the KOSDAQ market using various techniques. When a company in the Korean stock market is designated as an administrative issue, the market treats the event itself as negative information, causing losses to the company and its investors. The purpose of this study is to evaluate alternative methods for developing an artificial intelligence service that detects the possibility of designation early from companies' financial ratios and helps investors manage portfolio risk. The independent variables are 21 financial ratios representing profitability, stability, activity, and growth. Financial data were sampled for administrative-issue and non-administrative-issue stocks from 2011 to 2020, the period in which K-IFRS was applied. Logistic regression, decision tree, support vector machine, random forest, and LightGBM are used to predict the designation. According to the results, LightGBM is the best prediction model with 82.73% classification accuracy, and the decision tree has the lowest classification accuracy at 71.94%. A check of the top three variables by importance in the decision-tree-based learning models shows that the financial variables common to each model are ROE (net profit) and capital stock turnover ratio, which are relatively important in the designation of administrative issues. In general, the ensemble learning models showed higher predictive performance than the single learning models.

Risk-based Profit Prediction Model for International Construction Projects (해외건설공사의 리스크 분석에 기초한 수익성 예측모델에 관한 연구)

  • Han, Seung-Heon;Kim, Du-Yon
    • KSCE Journal of Civil and Environmental Engineering Research / v.26 no.4D / pp.635-647 / 2006
  • Korean construction companies first entered international markets in the 1960s and have so far won more than 4,900 projects worth approximately 193 billion dollars. Through the resulting large increase in national employment and income, Korea's construction industry has made an enormous contribution to the domestic economy over the last 40 years. Recently, however, increased risk in international markets and sharpening competition from foreign companies with advantages in advanced technologies and low labor costs have been eroding Korean contractors' market share. According to ENR (Engineering News Record, 1994~2003), 15.1% of the top 225 global contractors are suffering losses in international construction markets. This is largely due to the highly uncertain character of international projects, which are inherently exposed to varied and complicated risks. Furthermore, especially for Korean construction companies, the failure of one international construction project often cannot be offset even by a sufficient number of successful domestic projects. Therefore, both selective screening of candidate projects with a strong possibility of failure and systematic strategies for controlling potential risk factors are considered indispensable in international construction portfolio management. The purpose of this study is first to analyze the causal relationships between profit-influencing variables and project success, and then to develop a profitability forecasting model for international construction projects.

A study on the prediction of Korean NPL market return (한국 NPL시장 수익률 예측에 관한 연구)

  • Lee, Hyeon Su;Jeong, Seung Hwan;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems / v.25 no.2 / pp.123-139 / 2019
  • The Korean NPL market was formed by the government and foreign capital shortly after the 1997 IMF crisis. However, the market has a short history, as bad debt began to increase after the 2009 global financial crisis because of the recession in the real economy. NPLs have become a major investment in recent years, as domestic capital market investors began to enter the NPL market in earnest. Although the domestic NPL market has received considerable attention because of its recent overheating, research on it has been limited, since the history of capital market investment in the domestic NPL market is short. In addition, declining profitability and price fluctuations driven by the real estate business cycle call for decision-making based on more scientific and systematic analysis. In this study, we propose a prediction model that determines whether the benchmark yield will be achieved, using NPL market data in accordance with market demand. To build the model, we used Korean NPL data covering about four years, from December 2013 to December 2017, for a total of 2,291 items. From 11 candidate variables describing the characteristics of the real estate, only those related to the dependent variable were selected as independent variables. Variable selection was performed using one-to-one t-tests, stepwise logistic regression, and a decision tree. Seven independent variables were selected: purchase year, SPC (Special Purpose Company), municipality, appraisal value, purchase cost, OPB (Outstanding Principal Balance), and HP (Holding Period). The dependent variable is a binary variable indicating whether the benchmark rate of return is reached.
A binary dependent variable was chosen because models predicting a binary variable are more accurate than models predicting a continuous variable, and the accuracy of these models is directly related to their effectiveness. In addition, the main concern of a special purpose company is whether or not to purchase the property, so knowing whether a certain level of return will be achieved is enough to make a decision. To ascertain whether 12%, the standard rate of return used in the industry, is a meaningful reference value, we constructed and compared predictive models with the dependent variable calculated at different benchmark values. The average hit ratio was highest, at 64.60%, for the predictive model built with the dependent variable calculated at the 12% standard rate of return. To propose an optimal prediction model based on this dependent variable and the 7 independent variables, we built and compared prediction models using five methodologies: discriminant analysis, logistic regression analysis, decision tree, artificial neural network, and genetic algorithm linear model. For this purpose, 10 sets of training and testing data were extracted using 10-fold cross-validation. After the models were built on these data, the hit ratio of each set was averaged and performance was compared. The average hit ratios of the prediction models built using discriminant analysis, logistic regression, decision tree, artificial neural network, and genetic algorithm linear model were 64.40%, 65.12%, 63.54%, 67.40%, and 60.51%, respectively, confirming that the artificial neural network model performed best. This study shows that using the 7 independent variables with an artificial neural network prediction model is effective for the future NPL market.
The proposed model predicts in advance whether a new item will achieve the 12% return, which will help special purpose companies make investment decisions. Furthermore, we anticipate that the NPL market will become more liquid as transactions proceed at appropriate prices.
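
The evaluation procedure in this abstract (a binary target derived from the 12% benchmark, then an average hit ratio over 10 folds) can be sketched in plain Python. This is an illustration under stated assumptions rather than the authors' code: the function names are hypothetical, the fold assignment is a simple stride split, the toy returns are invented, and a majority-class placeholder stands in for the five model families compared in the study.

```python
def benchmark_label(rate_of_return, benchmark=0.12):
    """Binary dependent variable: 1 if the benchmark rate of return is reached."""
    return 1 if rate_of_return >= benchmark else 0

def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; fold sizes differ by at most one."""
    indices = list(range(n_samples))
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

def hit_ratio(y_true, y_pred):
    """Share of cases where the predicted label equals the actual label."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)

def cross_validated_hit_ratio(X, y, fit, k=10):
    """Average hit ratio over k folds; fit(X_train, y_train) returns a predictor."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(y), k):
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = [model(X[i]) for i in test_idx]
        scores.append(hit_ratio([y[i] for i in test_idx], preds))
    return sum(scores) / len(scores)

def fit_majority(X_train, y_train):
    """Placeholder model (assumption): always predict the most common label."""
    label = 1 if sum(y_train) * 2 >= len(y_train) else 0
    return lambda x: label

# Toy returns, illustrative only: 12 items reach the benchmark, 8 do not.
returns = [0.15] * 12 + [0.08] * 8
y = [benchmark_label(r) for r in returns]
X = list(range(len(y)))  # dummy features; a real model would use the 7 variables
print(cross_validated_hit_ratio(X, y, fit_majority, k=10))
```

Swapping `fit_majority` for a function that trains an actual classifier gives the per-model average hit ratios that the abstract compares.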

The Determinants of New Supply in the Seoul Office Market and their Dynamic Relationship (서울 오피스 신규 공급 결정요인과 동태적 관계분석)

  • Yang, Hye-Seon;Kang, Chang-Deok
    • Journal of Cadastre & Land InformatiX / v.47 no.2 / pp.159-174 / 2017
  • Long-term imbalances between supply and demand in the office market can weaken urban growth, since excess supply leads to office market instability and excess demand weakens the growth of urban industry. Recently, a large volume of new large-scale supply has increased volatility in the Seoul office market. Nevertheless, new office supply in Seoul has not been fully examined. This article therefore focuses on confirming the influences of profitability, replacement cost, and demand on new office supply in Seoul, and on how their relative influences change over time. For these purposes, we analyzed quarterly data on the Seoul office market between 2003 and 2015 using a vector error correction model (VECM). The results show that current new supply was negatively affected by supply one quarter earlier and positively affected by office employment one quarter earlier. The interest rate two quarters earlier had a positive effect, while the cap rate one quarter and two quarters earlier had negative effects. Based on these findings, we suggest that prediction models for Seoul offices should be developed with the influences of profitability, replacement cost, and demand on new office supply taken into account.
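
For readers unfamiliar with the method, the general textbook form of a VECM is shown below. The abstract does not give the authors' exact specification, so the contents of the vector $y_t$ (for example new supply, office employment, interest rate, and cap rate), the lag order $p$, and the constant term $c$ are assumptions here.

```latex
\Delta y_t = \alpha \beta^{\top} y_{t-1}
           + \sum_{i=1}^{p-1} \Gamma_i \, \Delta y_{t-i}
           + c + \varepsilon_t
```

Here $\beta^{\top} y_{t-1}$ captures deviations from the long-run equilibrium relationships, $\alpha$ gives the speed at which each variable adjusts to those deviations, and the $\Gamma_i$ matrices capture short-run effects such as the one-quarter and two-quarter lagged influences reported in the abstract.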

Estimation of genetic correlations and genomic prediction accuracy for reproductive and carcass traits in Hanwoo cows

  • Md Azizul Haque;Asif Iqbal;Mohammad Zahangir Alam;Yun-Mi Lee;Jae-Jung Ha;Jong-Joo Kim
    • Journal of Animal Science and Technology / v.66 no.4 / pp.682-701 / 2024
  • This study estimated the heritabilities (h2) and the genetic and phenotypic correlations between reproductive traits, including calving interval (CI), age at first calving (AFC), gestation length (GL), and number of artificial inseminations per conception (NAIPC), and carcass traits, including carcass weight (CWT), eye muscle area (EMA), backfat thickness (BF), and marbling score (MS), in Korean Hanwoo cows. In addition, the accuracy of genomic predictions of breeding values was evaluated by applying the genomic best linear unbiased prediction (GBLUP) and weighted GBLUP (WGBLUP) methods. The phenotypic data for reproductive and carcass traits were collected from 1,544 Hanwoo cows, and all animals were genotyped using the Illumina Bovine 50K single nucleotide polymorphism (SNP) chip. The genetic parameters were estimated with a multi-trait animal model in the MTG2 program. The estimated h2 for CI, AFC, GL, NAIPC, CWT, EMA, BF, and MS were 0.10, 0.13, 0.17, 0.11, 0.37, 0.35, 0.27, and 0.45, respectively, according to the GBLUP model. The GBLUP accuracy estimates ranged from 0.51 to 0.74, while the WGBLUP accuracy estimates for the traits under study ranged from 0.51 to 0.79. Strong and favorable genetic correlations were observed between GL and NAIPC (0.61), CWT and EMA (0.60), NAIPC and CWT (0.49), AFC and CWT (0.48), CI and GL (0.36), BF and MS (0.35), NAIPC and EMA (0.35), CI and BF (0.30), EMA and MS (0.28), CI and AFC (0.26), AFC and EMA (0.24), and AFC and BF (0.21). The present study identified low to moderate positive genetic correlations between reproductive traits and CWT, suggesting that a heavier body weight may lead to a longer CI, AFC, GL, and NAIPC. The moderately positive genetic correlations of CWT with AFC and NAIPC, combined with phenotypic correlations of nearly zero, suggest that genotype-environment interactions are more likely to be responsible for the phenotypic manifestation of these traits.
As a result, the inclusion of these traits by breeders as selection criteria may present a good opportunity for developing a selection index to increase the response to the selection and identification of candidate animals, which can result in significantly increased profitability of production systems.
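
As a rough companion to the reported parameters, the sketch below shows how heritability relates to variance components and how a genetic correlation can coexist with a near-zero phenotypic correlation. It uses the textbook additive decomposition r_P = h_x h_y r_G + e_x e_y r_E, which ignores genotype-environment interaction and is therefore not the authors' model. The function names are hypothetical, the variance components in the first call are made up, and the phenotypic correlation is set to exactly zero as an assumption; the h2 values (CWT 0.37, AFC 0.13) and the genetic correlation (0.48) are taken from the abstract.

```python
import math

def heritability(var_additive, var_residual):
    """Narrow-sense heritability: additive variance over phenotypic variance."""
    return var_additive / (var_additive + var_residual)

def implied_environmental_correlation(h2_x, h2_y, r_g, r_p):
    """Solve r_p = h_x*h_y*r_g + e_x*e_y*r_e for r_e, where e^2 = 1 - h^2."""
    h = math.sqrt(h2_x * h2_y)
    e = math.sqrt((1 - h2_x) * (1 - h2_y))
    return (r_p - h * r_g) / e

# Made-up variance components that give h2 = 0.37, matching the CWT estimate.
print(heritability(37.0, 63.0))

# CWT and AFC: h2 and r_G from the abstract, r_P assumed to be exactly 0.
r_e = implied_environmental_correlation(0.37, 0.13, 0.48, 0.0)
print(round(r_e, 3))
```

Under these assumptions the implied environmental correlation comes out at roughly -0.14, which shows how non-genetic effects can mask a positive genetic correlation at the phenotypic level.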