• Title/Summary/Keyword: decomposition method


Ensemble Learning with Support Vector Machines for Bond Rating (회사채 신용등급 예측을 위한 SVM 앙상블학습)

  • Kim, Myoung-Jong
    • Journal of Intelligence and Information Systems / v.18 no.2 / pp.29-45 / 2012
  • Bond rating is regarded as an important event for measuring the financial risk of companies and for determining the investment returns of investors. As a result, predicting companies' credit ratings with statistical and machine learning techniques has been a popular research topic. Statistical techniques, including multiple regression, multiple discriminant analysis (MDA), logistic models (LOGIT), and probit analysis, have traditionally been used in bond rating. However, one major drawback is that they rest on strict assumptions: linearity, normality, independence among predictor variables, and pre-existing functional forms relating the criterion variables and the predictor variables. These strict assumptions have limited the application of traditional statistics to the real world. Machine learning techniques used in bond rating prediction models include decision trees (DT), neural networks (NN), and the Support Vector Machine (SVM). In particular, SVM is recognized as a new and promising classification and regression analysis method. SVM learns a separating hyperplane that maximizes the margin between two categories. SVM is simple enough to be analyzed mathematically and achieves high performance in practical applications. SVM implements the structural risk minimization principle and seeks to minimize an upper bound of the generalization error. In addition, the solution of SVM may be a global optimum, and thus overfitting is unlikely to occur with SVM. Furthermore, SVM does not require many data samples for training, since it builds prediction models using only the representative samples near the boundaries, called support vectors. A number of experimental studies have indicated that SVM has been successfully applied in a variety of pattern recognition fields. However, there are three major drawbacks that can degrade SVM's performance. 
First, SVM was originally proposed for solving binary-class classification problems. Methods for combining SVMs for multi-class classification, such as One-Against-One and One-Against-All, have been proposed, but they do not improve performance on multi-class classification problems as much as SVM does for binary-class classification. Second, approximation algorithms (e.g., decomposition methods, the sequential minimal optimization algorithm) can be used to reduce computation time in multi-class problems, but they can deteriorate classification performance. Third, multi-class prediction problems are prone to data imbalance, which occurs when the number of instances in one class greatly outnumbers the number of instances in another class. Such data sets often cause a default classifier to be built because of a skewed boundary, which reduces the classification accuracy of that classifier. SVM ensemble learning is one of the machine learning methods for coping with the above drawbacks. Ensemble learning is a method for improving the performance of classification and prediction algorithms. AdaBoost is one of the most widely used ensemble learning techniques. It constructs a composite classifier by sequentially training classifiers while increasing the weight on misclassified observations through iterations. The observations that are incorrectly predicted by previous classifiers are chosen more often than those that are correctly predicted. Thus boosting attempts to produce new classifiers that are better able to predict examples for which the current ensemble's performance is poor. In this way, it can reinforce the training of the misclassified observations of the minority class. This paper proposes multiclass Geometric Mean-based Boosting (MGM-Boost) to resolve the multiclass prediction problem. 
Since MGM-Boost introduces the notion of the geometric mean into AdaBoost, its learning process can take into account the geometric mean-based accuracy and errors across multiple classes. This study applies MGM-Boost to a real-world bond rating case for Korean companies to examine its feasibility. 10-fold cross-validation is performed three times with different random seeds in order to ensure that the comparison among the three classifiers does not happen by chance. For each 10-fold cross-validation, the entire data set is first partitioned into ten equal-sized sets, and then each set is in turn used as the test set while the classifier trains on the other nine sets. That is, cross-validated folds have been tested independently of each algorithm. Through these steps, we have obtained results for the classifiers on each of the 30 experiments. In the comparison of arithmetic mean-based prediction accuracy between individual classifiers, MGM-Boost (52.95%) shows higher prediction accuracy than both AdaBoost (51.69%) and SVM (49.47%). MGM-Boost (28.12%) also shows higher prediction accuracy than AdaBoost (24.65%) and SVM (15.42%) in terms of geometric mean-based prediction accuracy. A t-test is used to examine whether the performance of each classifier over the 30 folds is significantly different. The results indicate that the performance of MGM-Boost is significantly different from that of the AdaBoost and SVM classifiers at the 1% level. These results mean that MGM-Boost can provide robust and stable solutions to multi-class problems such as bond rating.
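
The gap between the arithmetic and geometric mean-based accuracies reported above can be illustrated with a minimal sketch. It assumes that the geometric mean-based accuracy is the geometric mean of the per-class recalls, a common definition in imbalanced learning that the abstract does not spell out. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import prod


def per_class_recall(y_true, y_pred):
    """Map each true label to the fraction of its instances predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def arithmetic_accuracy(y_true, y_pred):
    """Overall fraction of correct predictions; dominated by the majority class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def geometric_mean_accuracy(y_true, y_pred):
    """Geometric mean of per-class recalls (assumed definition, see lead-in)."""
    recalls = list(per_class_recall(y_true, y_pred).values())
    return prod(recalls) ** (1 / len(recalls))


# Hypothetical imbalanced toy data: eight instances of class "A", two of class "B".
y_true = ["A"] * 8 + ["B"] * 2
y_pred = ["A"] * 8 + ["A", "B"]  # one of the two minority instances is missed

acc = arithmetic_accuracy(y_true, y_pred)
gmean = geometric_mean_accuracy(y_true, y_pred)
print(f"arithmetic: {acc:.3f}, geometric: {gmean:.3f}")
```

Under this definition, a classifier that ignores a minority class entirely scores zero on the geometric mean regardless of its overall accuracy, which is one plausible reading of why the geometric mean-based figures sit well below the arithmetic ones.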

Studies on the effect of phthalimido methyl-O,O-dimethyl-phosphorodithioate (Imidan) and its possible metabolites on the growth of rice plant (Phthalimido methyl-O,O-dimethyl phosphorodithioate (Imidan)과 그의 대사물질(代謝物質)이 수도(水稻) 생육(生育)에 미치는 영향(影響)에 관(關)한 연구(硏究))

  • Lee, Sung-Hwan; Lee, Dong-Suk; Lee, Jae-Koo
    • Applied Biological Chemistry / v.7 / pp.105-117 / 1966
  • This experiment was conducted to investigate the effect of phthalimido-methyl-O,O-dimethyl-phosphorodithioate (Imidan), known as an acaricide, and its possible metabolic products on plant growth when sprayed on the leaves of rice plants. The results are summarized as follows. 1) As possible metabolic products of Imidan, the following compounds were synthesized or recrystallized for the present experiment: a) N-Hydroxymethyl phthalimide b) Phthalimide c) Phthalamidic acid d) Phthalic acid e) Anthranilic acid f) p-Amino benzoic acid g) p-Hydroxy benzoic acid h) Benzoic acid. 2) Among the above materials, a), c), d), e), and Imidan were each dissolved in a buffer solution at 10 and 20 p.p.m. and tested with the wheat coleoptile straight growth method. According to the results, Imidan inhibited the growth of the coleoptile at both 10 and 20 p.p.m., whereas the others showed much better growth than the control, especially phthalamidic acid at 10 p.p.m. It appears that Imidan itself inhibits coleoptile growth, whereas the metabolites derived from Imidan through various metabolic processes, including hydrolysis in plant tissues, show growth-regulating activity. (refer: Table 1, Fig. 1) 3) 20, 100 and 200 p.p.m. solutions of Imidan emulsion in xylene were prepared. The lengths of the shoots and roots of rice seeds germinated on the respective media were measured after 12 days. The data showed that, compared with the xylene control, the root was much more elongated in Imidan 20 p.p.m. and the shoot in Imidan 100 p.p.m. An interesting finding was that xylene, used as the solvent, tended to seriously inhibit the root growth of rice seeds. (refer: Table 2, 5) 4) Emulsions of the control, Imidan, N-hydroxy methyl phthalimide, anthranilic acid, and phthalimide at concentrations of 10, 25, 50 and 100 p.p.m., respectively, were sprayed twice on potted rice plants. 
After a certain period of time, the lengths of the rice culms were measured, showing that plots treated with Imidan and N-hydroxy methyl phthalimide exhibited much more growth than the control and the others. 5) Leaves and stems of rice plants were sampled and extracted with dried acetone at intervals of 3, 5, 7, and 14 days after treatment with Imidan 250 p.p.m. emulsion. The acetone extract was purified by a prechromatographic purification method with acetonitrile and paper-chromatographed to detect the following metabolic products: Imidan (Rf: 0.97-0.98), N-hydroxy-methyl phthalimide (Rf: 0.87), phthalimide (Rf: 0.86-0.87), phthalamidic acid (Rf: 0.13-0.14), phthalic acid (Rf: 0.02-0.03), benzoic acid (Rf: 0.42-0.43), p-amino benzoic acid or p-hydroxy benzoic acid (Rf: 0.08-0.09), and unidentified compounds (Rf: 0.73, 0.59, 0.33, 0.23, 0.07). In addition, in the early stages (3 and 5 days), nonhydrolyzed Imidan and its first hydrolytic product, N-hydroxymethyl phthalimide, were detected in relatively large amounts, whereas in the later stages (7 and 14 days), owing to further decomposition, these two materials were reduced in amount and phthalic, phthalamidic, benzoic, and p-hydroxy benzoic or p-amino benzoic acids were detected in considerably large amounts. It is therefore believed that most of the Imidan applied to the leaves of rice plants may be decomposed within about 14 days. In light of the above observations, it is considered that Imidan itself is not involved in plant growth-regulating activity, whereas various phthaloyl derivatives produced in the course of metabolism (namely, enzymic action) in plant tissues may have such an effect.
