• Title/Summary/Keyword: Rating Prediction


Performance Evaluation and Forecasting Model for Retail Institutions (유통업체의 부실예측모형 개선에 관한 연구)

  • Kim, Jung-Uk
    • Journal of Distribution Science
    • /
    • v.12 no.11
    • /
    • pp.77-83
    • /
    • 2014
  • Purpose - The National Agricultural Cooperative Federation of Korea and the National Fisheries Cooperative Federation of Korea (NFFC) operate both financial and retail businesses. Because the cooperatives are public institutions and receive government support, the Financial Supervisory Service in Korea requires their sound management. This is mainly monitored through CAEL, a variant of CAMEL. However, the NFFC's finance and retail business sections are evaluated as a single unit, and the CAEL model lacks the classification needed to evaluate the retail industry. First, the discriminatory power of CAEL is limited: a retail-sector union can receive a high CAEL rating and still default, as has often been reported. A default prediction model is therefore needed to supplement the CAEL model. Because our default prediction model uses a subdivision of indexes and statistical methods, it can serve a preventive function by estimating the retail sector's default probability. Second, the finance and retail business sectors must be evaluated separately, since their businesses have different characteristics. Based on the various management indexes systematically maintained by the NFFC, our model predicts retail default and outperforms the CAEL model in failure prediction because it uses discriminative financial ratios that reflect the retail industry's situation. Research design, data, and methodology - A model to predict retail default was developed using logistic analysis. To build the predictive model, we used the NFFC's retail financial statements, covering 93 unions per year from 2006 to 2012, to select reliable management indexes. We applied statistical power analysis: t-tests, logit analysis, AR (accuracy ratio), and AUROC (area under the receiver operating characteristic curve) analysis.
Finally, through the multivariate logistic model, we show that the model has excellent discriminatory power and a high hit ratio for default prediction, and we evaluate its usefulness. Results - Statistical power analysis using the AR (AUROC) method shows that the short-term logistic model has excellent discriminatory power, at 84.6%. The total model's hit ratio for failure prediction is higher still, at 94%, indicating that the model is temporally stable and useful for evaluating the management status of retail institutions. Conclusions - This model is useful for evaluating the management status of retail union institutions. First, the CAEL evaluation needs to be subdivided: the existing CAEL evaluation is underdeveloped and its discriminatory power falls short. Second, continuous effort is required to develop varied and rational management indexes, including an index that reflects retail industry characteristics. Extending this study will require, first, a complementary default model reflecting differences in institution size and, second, non-financial information in the case of small and medium retail, i.e., a hybrid default model combining financial and non-financial information.
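The abstract's headline numbers are discrimination statistics (an AUROC of 84.6%, plus the AR). As a minimal pure-Python illustration, not the paper's code, AUROC can be computed directly from predicted default probabilities via the Mann-Whitney pairwise comparison:

```python
def auroc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    labels: 1 for default, 0 for non-default; scores: predicted default
    probabilities. Each (default, non-default) pair contributes 1 if the
    default is scored higher, 0.5 on a tie, 0 otherwise.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one default and one non-default")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

The accuracy ratio relates to AUROC as AR = 2·AUROC − 1, so both statistics can be reported from the same pairwise count.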

Machine learning-based corporate default risk prediction model verification and policy recommendation: Focusing on improvement through stacking ensemble model (머신러닝 기반 기업부도위험 예측모델 검증 및 정책적 제언: 스태킹 앙상블 모델을 통한 개선을 중심으로)

  • Eom, Haneul;Kim, Jaeseong;Choi, Sangok
    • Journal of Intelligence and Information Systems
    • /
    • v.26 no.2
    • /
    • pp.105-129
    • /
    • 2020
  • This study uses corporate data from 2012 to 2018, the period when K-IFRS was applied in earnest, to predict default risk. The data comprise 10,545 rows and 160 columns: 38 from the statement of financial position, 26 from the statement of comprehensive income, 11 from the statement of cash flows, and 76 financial-ratio indexes. Unlike most prior studies, which used the default event itself as the learning target, this study calculated default risk from each company's market capitalization and stock price volatility using the Merton model. This resolves the data-imbalance problem caused by the scarcity of default events, a noted limitation of the existing methodology, and captures the differences in default risk that exist even among ordinary companies. Because learning was conducted using only corporate information that is available for unlisted companies, default risk can be derived appropriately for unlisted companies without stock price information. The approach can therefore provide stable default risk assessment for companies that are difficult to evaluate with traditional credit rating models, such as small and medium-sized companies and startups. Although predicting corporate default risk with machine learning has recently been studied actively, most studies make predictions with a single model and thus face model bias. A stable and reliable valuation methodology is required, given that default risk information is used very widely in the market and sensitivity to differences in default risk is high; strict standards are likewise required for the calculation method.
The credit rating method stipulated by the Financial Services Commission in the Financial Investment Regulations calls for evaluation methods, including verification of their adequacy, prepared in consideration of past statistical data and experience with credit ratings and of changes in future market conditions. This study reduces the bias of individual models by using stacking ensemble techniques that synthesize various machine learning models. This captures complex nonlinear relationships between default risk and corporate information while retaining the main advantage of machine learning-based default risk models: short calculation time. To produce the sub-model forecasts used as input to the stacking ensemble model, the training data were divided into seven folds, and each sub-model was trained on the remaining folds to produce out-of-fold forecasts. To compare predictive power, Random Forest, MLP, and CNN models were trained on the full training data, and each model's predictive power was verified on the test set. The analysis showed that the stacking ensemble model exceeded the predictive power of the Random Forest model, the best-performing single model. Next, to check for statistically significant differences between the stacking ensemble model's forecasts and those of each individual model, pairs were constructed between the stacking ensemble model and each individual model. Because the Shapiro-Wilk normality test showed that none of the pairs followed a normal distribution, the nonparametric Wilcoxon rank-sum test was used to check whether the two forecasts in each pair differed significantly. The analysis showed that the stacking ensemble model's forecasts differed significantly from those of the MLP and CNN models.
In addition, this study provides a methodology by which existing credit rating agencies can apply machine learning-based default risk prediction, since traditional credit rating models can also be included as sub-models in calculating the final default probability. The stacking ensemble techniques proposed here can also help designs meet the requirements of the Financial Investment Business Regulations through the combination of various sub-models. We hope this research will be used to increase practical adoption by overcoming the limitations of existing machine learning-based models.
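The seven-way split of the training data described in the abstract is the standard out-of-fold scheme for building stacking meta-features: each observation is scored by a sub-model trained without that observation's fold, so the meta-model never sees leaked predictions. A pure-Python sketch of the scheme, with a deliberately trivial stand-in sub-model (the paper's actual sub-models were Random Forest, MLP, and CNN):

```python
def out_of_fold_predictions(fit, X, y, n_splits=7):
    """Out-of-fold predictions for one sub-model, usable as one column of a
    stacking ensemble's meta-features. fit(X_train, y_train) must return a
    predict function; each row is scored by the model trained on the other
    n_splits - 1 folds, avoiding leakage into the meta-model.
    """
    n = len(X)
    fold_of = [i % n_splits for i in range(n)]  # simple round-robin folds
    preds = [None] * n
    for k in range(n_splits):
        X_tr = [x for x, f in zip(X, fold_of) if f != k]
        y_tr = [t for t, f in zip(y, fold_of) if f != k]
        predict = fit(X_tr, y_tr)
        for i in range(n):
            if fold_of[i] == k:
                preds[i] = predict(X[i])
    return preds

def mean_rate_model(X_tr, y_tr):
    """Toy sub-model: predicts the training-set default rate for every input."""
    rate = sum(y_tr) / len(y_tr)
    return lambda x: rate
```

In a full stacking setup, the columns produced by several such sub-models are concatenated into meta-features on which the final (meta) model is trained.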

A Comparison of Urban Growth Probability Maps using Frequency Ratio and Logistic Regression Methods

  • Park, So-Young;Jin, Cheung-Kil;Kim, Shin-Yup;Jo, Gyung-Cheol;Choi, Chul-Uong
    • Journal of the Korean Institute of Landscape Architecture
    • /
    • v.38 no.5_2
    • /
    • pp.194-205
    • /
    • 2010
  • To predict urban growth according to changes in land cover, probability factors were calculated and mapped. Topographic, geographic, social, and political factors were used as prediction variables for constructing probability maps of urban growth. The urban growth-related factors were elevation, slope, aspect, distance from road, road ratio, distance from the main city, land cover, environmental rating, and legislative rating. Using these factors, probability maps of urban growth were constructed with the frequency ratio (FR) and logistic regression (LR) methods, and the effectiveness of the results was verified by the relative operating characteristic (ROC). The ROC values of the urban growth probability index (UGPI) maps produced by the FR and LR models were 0.937 and 0.940, respectively. The LR map had a slightly higher ROC value than the FR map, but the difference was small, and the two models gave similar results. The FR model is the simplest tool for probability analysis of urban growth, offering a faster and easier calculation process than other available tools, and its results are easy to interpret. In contrast, the LR model can process only a limited amount of input data in the statistical program, and a separate conversion process for input and output data is necessary. In conclusion, although the FR model is the simplest way to analyze the probability of urban growth, the LR model is more appropriate because it allows quantitative analysis.
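The FR side of the comparison reduces to a ratio of proportions per factor class. A small sketch under the usual definition (the paper's exact class scheme and study-area data are not reproduced here):

```python
def frequency_ratio(classes, urban):
    """Frequency ratio per factor class: the share of urbanized cells falling
    in a class divided by the share of all cells in that class. FR > 1 means
    the class is positively associated with urban growth.

    classes: factor class label per grid cell; urban: 1 if the cell converted
    to urban land cover during the study period, else 0.
    """
    total = len(classes)
    total_urban = sum(urban)
    cells_in, urban_in = {}, {}
    for c, u in zip(classes, urban):
        cells_in[c] = cells_in.get(c, 0) + 1
        urban_in[c] = urban_in.get(c, 0) + u
    return {c: (urban_in[c] / total_urban) / (cells_in[c] / total)
            for c in cells_in}
```

A per-cell urban growth probability index is then commonly obtained by summing, over all factors, the FR value of the class the cell falls into.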

Improvement on Similarity Calculation in Collaborative Filtering Recommendation using Demographic Information (인구 통계 정보를 이용한 협업 여과 추천의 유사도 개선 기법)

  • 이용준;이세훈;왕창종
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.9 no.5
    • /
    • pp.521-529
    • /
    • 2003
  • In this paper we present an improved method that uses demographic information to overcome the similarity miscalculation caused by the sparsity problem in collaborative filtering recommendation systems. The similarity between a pair of users is determined only by the ratings given to co-rated items, so items that have not been rated by both users are ignored. To solve this problem, we add virtual neighbors' ratings, derived from the neighbors' demographic information, to improve prediction accuracy. The method is an extension of traditional collaborative filtering based on the Pearson correlation coefficient. We used the GroupLens movie rating data in our experiments and compared the proposed method against Pearson-based collaborative filtering using the mean absolute error (MAE) and receiver operating characteristic (ROC) values. The results show that the proposed method outperforms collaborative filtering based on the Pearson correlation coefficient by about 9% in MAE and 13% in ROC sensitivity.
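The baseline similarity the paper improves on, Pearson correlation over co-rated items only, can be sketched in a few lines. The paper's contribution would augment the rating dictionaries with virtual neighbor ratings derived from demographic information before this computation runs; that augmentation step is not shown here:

```python
def pearson_on_corated(ratings_a, ratings_b):
    """Pearson correlation computed only over co-rated items, the classic
    collaborative-filtering similarity. ratings_* are dicts item -> rating;
    items rated by just one user are ignored, which is exactly the sparsity
    weakness that virtual demographic neighbors are meant to address.
    """
    common = [i for i in ratings_a if i in ratings_b]
    if len(common) < 2:
        return 0.0  # too few co-rated items to correlate
    a = [ratings_a[i] for i in common]
    b = [ratings_b[i] for i in common]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den if den else 0.0
```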

A New Similarity Measure based on Separation of Common Ratings for Collaborative Filtering

  • Lee, Soojung
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.11
    • /
    • pp.149-156
    • /
    • 2021
  • Among the various implementation techniques for recommender systems, collaborative filtering selects nearest neighbors with high similarity based on past rating history and recommends products they prefer; it has been successfully deployed by many commercial sites. Accurate estimation of similarity is an important factor in system performance. Various similarity measures have been developed, mostly by integrating traditional similarity measures with previously developed indices. This study suggests a similarity measure with a novel approach: it separates the common rating area between two users by rating magnitude, estimates similarity for each subarea, and integrates the subarea similarities with weights. This makes it possible to identify similar subareas and reflect them in the final similarity value. Performance evaluation on two open datasets shows that the proposed measure outperforms previous ones in prediction accuracy, rank accuracy, and mean average precision, especially on the dense dataset. The proposed similarity measure is expected to be used in various commercial systems to recommend products better suited to user preferences.
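The separation idea can be sketched as follows. This is an illustrative reconstruction, not the paper's exact formula: the two-band split, the per-band agreement score, and the weights are all assumptions for a 1-5 rating scale.

```python
def separated_similarity(ratings_a, ratings_b, threshold=3.0,
                         rating_range=4.0, w_low=0.5):
    """Illustrative sketch of separating the common rating area: co-rated
    items are split into a low and a high band by rating magnitude, a simple
    mean-absolute-difference agreement score is computed per band, and the
    band scores are combined with weights. Empty bands are skipped and the
    remaining weights are renormalized.
    """
    common = [i for i in ratings_a if i in ratings_b]
    bands = {"low": [], "high": []}
    for i in common:
        band = "low" if ratings_a[i] < threshold else "high"
        bands[band].append(abs(ratings_a[i] - ratings_b[i]))
    scores, weights = [], []
    for band, w in (("low", w_low), ("high", 1.0 - w_low)):
        diffs = bands[band]
        if diffs:
            scores.append(1.0 - (sum(diffs) / len(diffs)) / rating_range)
            weights.append(w)
    if not weights:
        return 0.0
    return sum(s * w for s, w in zip(scores, weights)) / sum(weights)
```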

A Recommendation Model based on Character-level Deep Convolution Neural Network (문자 수준 딥 컨볼루션 신경망 기반 추천 모델)

  • Ji, JiaQi;Chung, Yeongjee
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.23 no.3
    • /
    • pp.237-246
    • /
    • 2019
  • To improve the accuracy of rating prediction, a recommendation model should use not only user-item rating data but also auxiliary item information such as comments, tags, or descriptions. Traditional approaches apply a word-level bag-of-words model to the auxiliary information, but this model cannot exploit it effectively and leads to a shallow understanding. A convolutional neural network (CNN) can capture and extract feature vectors from auxiliary information effectively. This paper therefore proposes character-level deep convolutional neural network based matrix factorization (Char-DCNN-MF), a novel recommendation model that integrates a deep CNN into matrix factorization. Char-DCNN-MF understands auxiliary information more deeply and further improves recommendation performance. Experiments on three different real data sets show that Char-DCNN-MF performs significantly better than the comparative models.
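The matrix-factorization half of such a hybrid model can be sketched without the CNN. In the pure-Python reduction below, a fixed per-item feature vector stands in for the CNN's encoding of the item's text and simply offsets the learned item factors; this is an illustrative simplification, not the Char-DCNN-MF architecture itself.

```python
import random

def train_mf(ratings, n_users, n_items, item_feats, k=2, lr=0.01,
             reg=0.02, epochs=200, seed=0):
    """Minimal SGD matrix factorization where each item's latent factor is
    offset by a fixed feature vector (a stand-in for a learned text encoding).

    ratings: list of (user, item, rating) triples; item_feats: one length-k
    vector per item. Returns user factors P and item factors Q.
    """
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            qi = [Q[i][f] + item_feats[i][f] for f in range(k)]  # factor + feature offset
            err = r - sum(P[u][f] * qi[f] for f in range(k))
            for f in range(k):
                pu = P[u][f]
                P[u][f] += lr * (err * qi[f] - reg * pu)
                Q[i][f] += lr * (err * pu - reg * Q[i][f])
    return P, Q
```

A prediction for (u, i) is the dot product of P[u] with Q[i] plus the item's feature offset, mirroring how the full model would inject CNN output into the item representation.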

Development of Cerebral Amyloid Positivity Predicting Models Using Clinical Indicators (임상적 지표를 이용한 대뇌 아밀로이드 단백 축적 여부 예측모델 개발)

  • Chun, Young Jae;Joo, Soo Hyun
    • Korean Journal of Biological Psychiatry
    • /
    • v.27 no.2
    • /
    • pp.94-100
    • /
    • 2020
  • Objectives Amyloid β positron emission tomography (Aβ PET) is widely used as a diagnostic tool in patients with symptoms of cognitive impairment; however, the examination is expensive. Predicting Aβ PET positivity before patients undergo the examination is therefore valuable. We aimed to retrospectively analyze clinical predictors in patients who underwent Aβ PET and to develop a model predicting Aβ PET positivity. Methods 468 patients with cognitive impairment who underwent Aβ PET were recruited, and their clinical indicators were analyzed retrospectively. We specified the primary outcome as Aβ PET positivity and included as potential predictors age, sex, body mass index, diastolic blood pressure, systolic blood pressure, education, family history of dementia, Mini Mental Status Examination (MMSE), Clinical Dementia Rating (CDR), Clinical Dementia Rating-Sum of Boxes (CDR-SB), hypertension (HTN), diabetes mellitus (DM), and presence of apolipoprotein E (ApoE) E4. We developed three final models of amyloid positivity prediction, for all subjects, mild cognitive impairment (MCI), and Alzheimer's disease (AD) dementia, using multivariate stepwise logistic regression analysis. Receiver operating characteristic (ROC) curve analysis was performed and the area under the curve (AUC) was calculated for each ROC curve. Results 49.6% of patients (n = 232) were Aβ PET negative and 50.4% (n = 236) were Aβ PET positive. In the final model for all subjects, older age, female sex, presence of ApoE E4, and lower MMSE scores were associated with Aβ PET positivity; the AUC value was 0.296. In the final model for MCI subjects (n = 244), older age and presence of ApoE E4 were associated with Aβ PET positivity; the AUC value was 0.725. In the final model for AD subjects (n = 173), lower MMSE scores, presence of ApoE E4, and history of HTN were associated with Aβ PET positivity; the AUC value was 0.681.
Conclusions The cerebral amyloid positivity model, based on commonly available clinical indicators, can be useful for predicting amyloid PET positivity in MCI or AD patients.

The Study for Utilizing Data of Cut-Slope Management System by Using Logistic Regression (로지스틱 회귀분석을 이용한 도로비탈면관리시스템 데이터 활용 검토 연구)

  • Woo, Yonghoon;Kim, Seung-Hyun;Yang, Inchul;Lee, Se-Hyeok
    • The Journal of Engineering Geology
    • /
    • v.30 no.4
    • /
    • pp.649-661
    • /
    • 2020
  • The cut-slope management system (CSMS) has investigated all slopes along roads nationwide to evaluate the risk rating of each slope. Based on this evaluation, maintenance decisions can be made, a procedure that helps establish a consistent and efficient road safety policy. CSMS updates its database of all slopes annually, building it from basic and detailed investigations. The database contains two types of data: objective data, such as a slope's location, height, width, length, and information about the underground and bedrock; and subjective data decided by experts on the basis of the objective data, such as the degree of emergency and risk or the maintenance solution. The purpose of this study is to identify a plan for utilizing these CSMS data. To this end, logistic regression, a basic machine-learning method for constructing prediction models, is used to predict a judgment-type variable (i.e., subjective data) from the objective data. The constructed logistic model predicts accurately, and it can be used to prioritize slopes for detailed investigation. It is also anticipated that the prediction model can filter unusual data by comparing them with the predicted values.
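A minimal gradient-descent logistic regression of the kind that could map objective slope attributes (height, width, length, and so on) to a binary expert judgment flag. This is illustrative only; the paper's feature set, preprocessing, and fitting procedure are not reproduced here:

```python
import math

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Plain batch gradient-descent logistic regression.

    X: list of feature vectors; y: 0/1 labels (e.g., an expert's emergency
    flag). Returns (bias, weights).
    """
    n_feat = len(X[0])
    w = [0.0] * n_feat
    b = 0.0
    for _ in range(epochs):
        gb = 0.0
        gw = [0.0] * n_feat
        for xi, yi in zip(X, y):
            z = b + sum(wj * xj for wj, xj in zip(w, xi))
            e = 1.0 / (1.0 + math.exp(-z)) - yi  # prediction error
            gb += e
            for j in range(n_feat):
                gw[j] += e * xi[j]
        b -= lr * gb / len(X)
        for j in range(n_feat):
            w[j] -= lr * gw[j] / len(X)
    return b, w

def predict_proba(b, w, xi):
    """Predicted probability that the judgment flag is 1 for feature vector xi."""
    return 1.0 / (1.0 + math.exp(-(b + sum(wj * xj for wj, xj in zip(w, xi)))))
```

Ranking slopes by the predicted probability gives one way to prioritize them for detailed investigation, as the abstract suggests.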

Prediction of Forest Fire Danger Rating over the Korean Peninsula with the Digital Forecast Data and Daily Weather Index (DWI) Model (디지털예보자료와 Daily Weather Index (DWI) 모델을 적용한 한반도의 산불발생위험 예측)

  • Won, Myoung-Soo;Lee, Myung-Bo;Lee, Woo-Kyun;Yoon, Suk-Hee
    • Korean Journal of Agricultural and Forest Meteorology
    • /
    • v.14 no.1
    • /
    • pp.1-10
    • /
    • 2012
  • The Digital Forecast of the Korea Meteorological Administration (KMA) provides 5 km gridded weather forecasts over the Korean Peninsula and the surrounding oceanic regions in Korean territory. It provides 12 forecast elements, such as three-hour-interval temperature, sky condition, wind direction, wind speed, relative humidity, wave height, probability of precipitation, 12-hour accumulated rain and snow, and daily minimum and maximum temperatures; these elements are updated every three hours for the next 48 hours. The objective of this study was to construct the Forest Fire Danger Rating System for the Korean Peninsula (FFDRS_KORP) based on the daily weather index (DWI) and to improve its accuracy using the Digital Forecast data. We produced thematic maps of temperature, humidity, and wind speed over the Korean Peninsula to analyze the DWI. To calculate the DWI of the Korean Peninsula, a forest fire occurrence probability model obtained by logistic regression analysis was applied: $[1+\exp\{-(2.494+0.004{\times}T_{max}-0.008{\times}EF)\}]^{-1}$. A verification test against real-time observatory data showed that the Digital Forecast predictions were more accurate than those of the RDAPS data. A comparison of the average forest fire danger rating index (sampled at 233 administrative districts) likewise showed higher relative accuracy with the Digital Forecast than with the RDAPS data; the coefficient of determination of the forest fire danger rating was $R^2=0.854$. The national mean fire danger rating index computed with real-time observatory data (70) differed by 0.5 from that computed with the Digital Forecast (70.5).
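The quoted logistic model can be applied directly. The coefficients come straight from the abstract; treating $T_{max}$ as daily maximum temperature and $EF$ as an (effective) humidity term is an interpretation beyond what the abstract states:

```python
import math

def fire_occurrence_probability(t_max, ef):
    """Daily forest-fire occurrence probability from the logistic model
    quoted in the abstract: p = [1 + exp(-(2.494 + 0.004*Tmax - 0.008*EF))]^-1.
    Higher maximum temperature raises the probability; a higher EF
    (humidity) term lowers it.
    """
    return 1.0 / (1.0 + math.exp(-(2.494 + 0.004 * t_max - 0.008 * ef)))
```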

Conflict of Interests and Analysts' Forecast (이해상충과 애널리스트 예측)

  • Park, Chang-Gyun;Youn, Taehoon
    • KDI Journal of Economic Policy
    • /
    • v.31 no.1
    • /
    • pp.239-276
    • /
    • 2009
  • The paper investigates the possible relationship between earnings predictions by security analysts and the special ownership ties linking the securities companies those analysts belong to and the firms under analysis. Security analysts are best known for their role as information producers in stock markets where imperfect information is prevalent and transaction costs are high. In such a market, changes in the fundamental value of a company are not spontaneously reflected in the stock price, and security analysts actively produce and distribute the information crucial for the price mechanism to operate efficiently. Securing the fairness and accuracy of the information they provide is therefore very important both for efficiency of resource allocation and for protection of investors who are excluded from the special relationship. Evidence of systematic distortion of information through such ties would naturally call for regulatory intervention. However, one cannot presuppose distorted information merely from common ownership between the appraiser and the appraisee. Reputation is cherished by securities firms and analysts as an indispensable intangible asset in the industry, and the incentive to maintain a good reputation by providing accurate earnings predictions may outweigh the incentive to offer favorable ratings or stock recommendations for firms affiliated through common ownership. This study shares the theme of the existing literature on the effect of conflicts of interest on the accuracy of analysts' predictions; it focuses, however, on the potential conflict of interest originating from the Korea-specific ownership structure of large conglomerates. Utilizing an extensive database of analysts' reports provided by WiseFn(R) in Korea, we perform an empirical analysis of the potential relationship between earnings predictions and common ownership.
We first analyzed the prediction bias index, which measures how optimistic or friendly an analyst's prediction is relative to realized earnings. There is no statistically significant relationship between prediction bias and common ownership. This is rather surprising, since the frequency of positive prediction bias is higher when such ownership ties exist. Next, we analyzed the prediction accuracy index, which measures how accurate an analyst's prediction is relative to realized earnings regardless of sign. Again, there is no significant association between the accuracy of earnings predictions and the special relationship. We interpret these results as implying that market discipline based on reputation effects is working in the Korean stock market, in the sense that securities companies do not seem to be influenced by an incentive to offer distorted information on affiliated firms. While many existing studies confirm a relationship between an analyst's ability and the accuracy of the analyst's predictions, these factors cannot be controlled for in the above analysis due to the lack of relevant data. As an indirect check on whether such a relationship might have distorted the results, we performed an additional, identical analysis on a sub-sample consisting only of reports by the best analysts. The result confirms the earlier conclusion: the common ownership structure does not affect the accuracy or bias of analysts' earnings predictions.
