Optimal Selection of Classifier Ensemble Using Genetic Algorithms


  • Received : 2010.11.22
  • Accepted : 2010.12.07
  • Published : 2010.12.31

Abstract

Ensemble learning is a method for improving the performance of classification and prediction algorithms. It builds a highly accurate classifier on the training set by constructing and combining an ensemble of weak classifiers, each of which needs only to be moderately accurate on the training set. Ensemble learning has received considerable attention in the machine learning and artificial intelligence fields because of its remarkable performance improvements and its flexible integration with traditional learning algorithms such as decision trees (DT), neural networks (NN), and support vector machines (SVM). In this body of research, DT ensemble studies have consistently demonstrated impressive improvements in the generalization behavior of DT, whereas NN and SVM ensembles have not shown comparable gains. Recently, several studies have reported that an ensemble's performance can degrade when its classifiers are highly correlated: the resulting multicollinearity undermines the benefit of combining them. These studies have also proposed differentiated learning strategies to cope with this degradation. Hansen and Salamon (1990) argued that containing diverse classifiers is both necessary and sufficient for improving an ensemble's performance. Breiman (1996) found that ensemble learning can increase the performance of unstable learning algorithms, but shows little improvement for stable ones. Unstable learning algorithms such as decision tree learners are sensitive to changes in the training data, so small changes in the training data can yield large changes in the generated classifiers; ensembles of unstable learners can therefore guarantee some diversity among the classifiers. By contrast, stable learning algorithms such as NN and SVM generate similar classifiers despite small changes in the training data, so the correlation among the resulting classifiers is very high. This high correlation causes the multicollinearity problem that degrades ensemble performance. Kim (2009) compared the bankruptcy-prediction performance of traditional algorithms such as NN, DT, and SVM on Korean firms, reporting that the stable NN and SVM predict better than the unstable DT, whereas in ensemble learning the DT ensemble improves more than the NN and SVM ensembles. Further analysis with the variance inflation factor (VIF) empirically shows that the ensembles' performance degradation is due to multicollinearity, and the study proposes ensemble optimization as a remedy.

This paper proposes a hybrid system for coverage optimization of an NN ensemble (CO-NN) in order to improve NN ensemble performance. Coverage optimization chooses a sub-ensemble from the original ensemble so as to guarantee the diversity of its classifiers. CO-NN uses a genetic algorithm (GA), which has been widely applied to optimization problems, to solve the coverage optimization problem. The GA chromosomes are encoded as binary strings in which each bit indicates whether an individual classifier is included. The fitness function is defined as the maximization of error reduction, and a constraint on the variance inflation factor (VIF), one of the most widely used measures of multicollinearity, is added to ensure classifier diversity by removing high correlation among the classifiers. We implement the system with Microsoft Excel and the GA software package Evolver.

Experiments on company failure prediction show that CO-NN stably enhances the performance of NN ensembles by choosing classifiers with the ensemble's correlations taken into account. Classifiers with a potential multicollinearity problem are removed by the coverage optimization process, and CO-NN consequently outperforms a single NN classifier and the NN ensemble at the 1% significance level, and the DT ensemble at the 5% significance level. Several research issues remain. First, a decision optimization process for finding an optimal combination function should be considered in future work. Second, more advanced studies should introduce learning strategies for coping with data noise.
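
The mechanism the abstract describes can be made concrete with a short sketch. The following Python code is an illustrative reconstruction, not the authors' implementation (the paper used Microsoft Excel with the Evolver GA package): a binary chromosome selects a sub-ensemble, fitness is the accuracy of the selected classifiers' majority vote, and any candidate whose classifiers exceed a VIF cap is rejected. The majority-vote combiner, the VIF threshold of 10, and the mutation-only GA are assumptions made for brevity.

```python
import numpy as np

def vif(outputs):
    """Variance inflation factor of each column of `outputs`
    (rows = samples, columns = base-classifier outputs)."""
    n, k = outputs.shape
    vifs = np.empty(k)
    for j in range(k):
        y = outputs[:, j]
        # Regress classifier j's outputs on the remaining classifiers.
        X = np.column_stack([np.ones(n), np.delete(outputs, j, axis=1)])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        r2 = 1.0 - (resid @ resid) / max(((y - y.mean()) ** 2).sum(), 1e-12)
        vifs[j] = 1.0 / max(1.0 - r2, 1e-12)  # VIF_j = 1 / (1 - R_j^2)
    return vifs

def fitness(mask, outputs, labels, vif_cap=10.0):
    """Accuracy of the majority vote over the classifiers selected by the
    binary `mask`; sub-ensembles violating the VIF constraint score 0."""
    if mask.sum() < 2:
        return 0.0
    sub = outputs[:, mask.astype(bool)]
    if vif(sub).max() > vif_cap:  # VIF constraint (threshold assumed)
        return 0.0
    votes = (sub.mean(axis=1) >= 0.5).astype(int)  # majority vote
    return float((votes == labels).mean())

def coverage_optimize(outputs, labels, pop_size=30, gens=50, seed=0):
    """Mutation-only GA over binary chromosomes; each bit switches one
    base classifier in or out of the sub-ensemble."""
    rng = np.random.default_rng(seed)
    k = outputs.shape[1]
    pop = rng.integers(0, 2, size=(pop_size, k))
    for _ in range(gens):
        scores = np.array([fitness(m, outputs, labels) for m in pop])
        elite = pop[np.argsort(scores)[-pop_size // 2:]]  # keep best half
        kids = elite[rng.integers(0, len(elite), pop_size - len(elite))]
        # Bit-flip mutation with rate 1/k.
        kids = np.where(rng.random(kids.shape) < 1.0 / k, 1 - kids, kids)
        pop = np.vstack([elite, kids])
    scores = np.array([fitness(m, outputs, labels) for m in pop])
    return pop[scores.argmax()]
```

With `outputs` holding each trained NN's 0/1 predictions on a validation set (rows = samples, columns = classifiers) and `labels` the true classes, `coverage_optimize(outputs, labels)` returns the bit mask of the selected sub-ensemble.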

Ensemble learning is a machine learning technique proposed to improve the performance of classification and prediction algorithms. However, it has been pointed out that when the base classifiers lack diversity, the multicollinearity problem makes the improvement marginal and can even degrade performance. To secure the diversity of the base classifiers and strengthen the performance gains of ensemble learning, this study proposes a coverage optimization technique based on genetic algorithms. Applying the proposed optimization technique to a neural network ensemble for corporate bankruptcy prediction secured the diversity of the base classifiers and significantly improved the ensemble's performance.

Keywords

References

  1. Alfaro, E., M. Gamez and N. García, "Multiclass corporate failure prediction by AdaBoost.M1", International Advances in Economic Research, Vol.13(2007), 301-312.
  2. Alfaro, E., N. García, M. Gamez and D. Elizondo, "Bankruptcy forecasting : an empirical comparison of AdaBoost and neural networks", Decision Support Systems, Vol.45(2008), 110-122. https://doi.org/10.1016/j.dss.2007.12.002
  3. Bauer, E. and R. Kohavi, "An empirical comparison of voting classification algorithms : Bagging, boosting, and variants", Machine Learning, Vol.36(1999), 105-139. https://doi.org/10.1023/A:1007515423169
  4. Breiman, L., "Bagging predictors", Machine Learning, Vol.24, No.2(1996), 123-140.
  5. Buciu, I., C. Kotropoulos and I. Pitas, "Combining support vector machines for accurate face detection", Proc. ICIP, (2001), 1054-1057.
  6. Dong, Y. S. and K. S. Han, "A comparison of several ensemble methods for text categorization", IEEE International Conference on Services Computing, 2004.
  7. Drucker, H. and C. Cortes, "Boosting decision trees", Advances in Neural Information Processing Systems, Vol.8(1996).
  8. Evgeniou, T., L. Perez-Breva, M. Pontil and T. Poggio, "Bounds on the generalization performance of kernel machine ensembles", Proc. ICML, (2000), 271-278.
  9. Fawcett, T., "An introduction to ROC analysis", Pattern Recognition Letters, Vol.27(2006), 861-874. https://doi.org/10.1016/j.patrec.2005.10.010
  10. Freund, Y. and R. E. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting", Journal of Computer and System Sciences, Vol.55, No.1(1997), 119-139. https://doi.org/10.1006/jcss.1997.1504
  11. Hansen, L. and P. Salamon, "Neural network ensembles", IEEE Trans. PAMI, Vol.12(1990), 993-1001. https://doi.org/10.1109/34.58871
  12. Ho, T. K., "Multiple classifier combination : lessons and next steps", in Hybrid Methods in Pattern Recognition (Ed. by H. Bunke and A. Kandel), World Scientific, 2002.
  13. Kim, M. J., "A Performance Comparison of Ensembles in Bankruptcy Prediction", Entrue Journal of Information Technology, Vol.8, No.2(2009), 41-49.
  14. Kim, M. J. and D. G. Kang, "An ensemble with neural networks for bankruptcy prediction", Expert Systems with Applications, Vol.37(2010), 3373-3379. https://doi.org/10.1016/j.eswa.2009.10.012
  15. Kim, Y. W. and I. S. Oh, "Classifier ensemble selection using hybrid genetic algorithms", Pattern Recognition Letters, Vol.29, No.6(2008), 796-802. https://doi.org/10.1016/j.patrec.2007.12.013
  16. Maclin, R. and D. Opitz, "An empirical evaluation of bagging and boosting", Proceedings of the Fourteenth National Conference on Artificial Intelligence, (1997), 546-551.
  17. Maia, T. T., A. P. Braga and A. F. Carvalho, "Hybrid classification algorithms based on boosting and support vector machines", Kybernetes, Vol.37, No.9(2008), 1469-1491. https://doi.org/10.1108/03684920810907814
  18. Oliveira, L. S., R. Sabourin, F. Bortolozzi and C. Y. Suen, "Feature selection for ensembles : a hierarchical multi-objective genetic algorithm approach", ICDAR, 2003.
  19. Quinlan, J. R., "Bagging, boosting and C4.5", Proceedings of the Thirteenth National Conference on Artificial Intelligence, (1996), 725-730.
  20. Valentini, G., M. Muselli and F. Ruffino, "Bagged ensembles of SVMs for gene expression data analysis", The IEEE-INNS-ENNS International Joint Conference on Neural Networks, (2003), 1844-1849.
  21. Zhou, Z. H., J. X. Wu, and W. Tang, "Ensembling neural networks: many could better than all", Artificial Intelligence, Vol.137 (2002), 239-263. https://doi.org/10.1016/S0004-3702(02)00190-X