• Title/Summary/Keyword: Mathematical

Search Result 31,122, Processing Time 0.043 seconds

Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems
    • /
    • v.18 no.2
    • /
    • pp.29-45
    • /
    • 2012
  • Bond rating is regarded as an important event for measuring financial risk of companies and for determining the investment returns of investors. As a result, it has been a popular research topic for researchers to predict companies' credit ratings by applying statistical and machine learning techniques. The statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have been traditionally used in bond rating. However, one major drawback is that it should be based on strict assumptions. Such strict assumptions include linearity, normality, independence among predictor variables and pre-existing functional forms relating the criterion variablesand the predictor variables. Those strict assumptions of traditional statistics have limited their application to the real world. Machine learning techniques also used in bond rating prediction models include decision trees (DT), neural networks (NN), and Support Vector Machine (SVM). Especially, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that can maximize the margin between two categories. SVM is simple enough to be analyzed mathematical, and leads to high performance in practical applications. SVM implements the structuralrisk minimization principle and searches to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum and thus, overfitting is unlikely to occur with SVM. In addition, SVM does not require too many data sample for training since it builds prediction models by only using some representative sample near the boundaries called support vectors. A number of experimental researches have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can be potential causes for degrading SVM's performance. First, SVM is originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification such as One-Against-One, One-Against-All have been proposed, but they do not improve the performance in multi-class classification problem as much as SVM for binary-class classification. Second, approximation algorithms (e.g. decomposition methods, sequential minimal optimization algorithm) could be used for effective multi-class computation to reduce computation time, but it could deteriorate classification performance. Third, the difficulty in multi-class prediction problems is in data imbalance problem that can occur when the number of instances in one class greatly outnumbers the number of instances in the other class. Such data sets often cause a default classifier to be built due to skewed boundary and thus the reduction in the classification accuracy of such a classifier. SVM ensemble learning is one of machine learning methods to cope with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing weight on the misclassified observations through iterations. The observations that are incorrectly predicted by previous classifiers are chosen more often than examples that are correctly predicted. Thus Boosting attempts to produce new classifiers that are better able to predict examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of the misclassified observations of the minority class. This paper proposes a multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve multiclass prediction problem. Since MGM-Boost introduces the notion of geometric mean into AdaBoost, it can perform learning process considering the geometric mean-based accuracy and errors of multiclass. This study applies MGM-Boost to the real-world bond rating case for Korean companies to examine the feasibility of MGM-Boost. 10-fold cross validations for threetimes with different random seeds are performed in order to ensure that the comparison among three different classifiers does not happen by chance. For each of 10-fold cross validation, the entire data set is first partitioned into tenequal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, cross-validated folds have been tested independently of each algorithm. Through these steps, we have obtained the results for classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows the higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%)in terms of geometric mean-based prediction accuracy. T-test is used to examine whether the performance of each classifiers for 30 folds is significantly different. The results indicate that performance of MGM-Boost is significantly different from AdaBoost and SVM classifiers at 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-classproblems such as bond rating.

Robo-Advisor Algorithm with Intelligent View Model (지능형 전망모형을 결합한 로보어드바이저 알고리즘)

  • Kim, Sunwoong
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.2
    • /
    • pp.39-55
    • /
    • 2019
  • Recently banks and large financial institutions have introduced lots of Robo-Advisor products. Robo-Advisor is a Robot to produce the optimal asset allocation portfolio for investors by using the financial engineering algorithms without any human intervention. Since the first introduction in Wall Street in 2008, the market size has grown to 60 billion dollars and is expected to expand to 2,000 billion dollars by 2020. Since Robo-Advisor algorithms suggest asset allocation output to investors, mathematical or statistical asset allocation strategies are applied. Mean variance optimization model developed by Markowitz is the typical asset allocation model. The model is a simple but quite intuitive portfolio strategy. For example, assets are allocated in order to minimize the risk on the portfolio while maximizing the expected return on the portfolio using optimization techniques. Despite its theoretical background, both academics and practitioners find that the standard mean variance optimization portfolio is very sensitive to the expected returns calculated by past price data. Corner solutions are often found to be allocated only to a few assets. The Black-Litterman Optimization model overcomes these problems by choosing a neutral Capital Asset Pricing Model equilibrium point. Implied equilibrium returns of each asset are derived from equilibrium market portfolio through reverse optimization. The Black-Litterman model uses a Bayesian approach to combine the subjective views on the price forecast of one or more assets with implied equilibrium returns, resulting a new estimates of risk and expected returns. These new estimates can produce optimal portfolio by the well-known Markowitz mean-variance optimization algorithm. If the investor does not have any views on his asset classes, the Black-Litterman optimization model produce the same portfolio as the market portfolio. What if the subjective views are incorrect? A survey on reports of stocks performance recommended by securities analysts show very poor results. Therefore the incorrect views combined with implied equilibrium returns may produce very poor portfolio output to the Black-Litterman model users. This paper suggests an objective investor views model based on Support Vector Machines(SVM), which have showed good performance results in stock price forecasting. SVM is a discriminative classifier defined by a separating hyper plane. The linear, radial basis and polynomial kernel functions are used to learn the hyper planes. Input variables for the SVM are returns, standard deviations, Stochastics %K and price parity degree for each asset class. SVM output returns expected stock price movements and their probabilities, which are used as input variables in the intelligent views model. The stock price movements are categorized by three phases; down, neutral and up. The expected stock returns make P matrix and their probability results are used in Q matrix. Implied equilibrium returns vector is combined with the intelligent views matrix, resulting the Black-Litterman optimal portfolio. For comparisons, Markowitz mean-variance optimization model and risk parity model are used. The value weighted market portfolio and equal weighted market portfolio are used as benchmark indexes. We collect the 8 KOSPI 200 sector indexes from January 2008 to December 2018 including 132 monthly index values. Training period is from 2008 to 2015 and testing period is from 2016 to 2018. Our suggested intelligent view model combined with implied equilibrium returns produced the optimal Black-Litterman portfolio. The out of sample period portfolio showed better performance compared with the well-known Markowitz mean-variance optimization portfolio, risk parity portfolio and market portfolio. The total return from 3 year-period Black-Litterman portfolio records 6.4%, which is the highest value. The maximum draw down is -20.8%, which is also the lowest value. Sharpe Ratio shows the highest value, 0.17. It measures the return to risk ratio. Overall, our suggested view model shows the possibility of replacing subjective analysts's views with objective view model for practitioners to apply the Robo-Advisor asset allocation algorithms in the real trading fields.