Ensemble Learning of Regional Experts

  • 이병우 (Dept. of Computer Science and Engineering, Sogang University);
  • 양지훈 (Dept. of Computer Science and Engineering, Sogang University);
  • 김선호 (Dept. of Computer Science and Engineering, Sogang University)
  • Published: 2009.02.15

Abstract

We present a new ensemble learning method that employs a set of regional experts, each of which learns to handle a subset of the training data. We split the training data and train an expert for each of several regions of the feature space. When classifying a data point, we take a weighted vote among the experts whose regions contain that point. We used ten datasets to compare the performance of the new ensemble method with that of single classifiers as well as other ensemble methods such as Bagging and Adaboost, using SMO, Naive Bayes, and C4.5 as base learning algorithms. We found that our method performs comparably to Adaboost and Bagging when the base learner is C4.5, and outperforms the benchmark methods in the remaining cases.

In this paper we present a new ensemble method based on regional experts. The method splits the training data and trains each expert on a different region of the feature space. When a new data point is classified, the experts responsible for the region containing that point take a weighted vote. Using ten datasets from the UCI Machine Learning Repository, we compared its accuracy with that of single classifiers, Bagging, and Adaboost, using SVM, Naive Bayes, and C4.5 as the learning algorithms. The regional-expert ensemble performed comparably to Bagging and Adaboost when C4.5 was the learning algorithm, and outperformed the remaining classifiers.
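The abstracts above describe the regional-expert ensemble only at a high level. The sketch below is one possible reading of it, with several details assumed rather than taken from the paper: the feature space is partitioned with k-means, each expert is a decision tree standing in for C4.5, a query is routed to the experts of its closest regions, and each expert's vote is weighted by its accuracy on its own region. The class name RegionalExpertEnsemble and the hyperparameters n_regions and n_votes are illustrative only.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier  # stand-in for C4.5

class RegionalExpertEnsemble:
    """Minimal sketch of an ensemble of regional experts (assumptions noted above)."""

    def __init__(self, n_regions=5, n_votes=2):
        self.n_regions = n_regions  # number of regions to carve the feature space into (assumed)
        self.n_votes = n_votes      # number of nearby experts consulted per query (assumed)

    def fit(self, X, y):
        # Split the training data into regions of the feature space (k-means is an assumption).
        self.regions_ = KMeans(n_clusters=self.n_regions, n_init=10, random_state=0).fit(X)
        labels = self.regions_.labels_
        self.experts_, self.weights_ = [], []
        for r in range(self.n_regions):
            Xr, yr = X[labels == r], y[labels == r]
            expert = DecisionTreeClassifier().fit(Xr, yr)
            self.experts_.append(expert)
            # Weight each expert by its accuracy on its own region (assumed weighting scheme).
            self.weights_.append(expert.score(Xr, yr))
        return self

    def predict(self, X):
        dist = self.regions_.transform(X)  # distance from each query point to every region centre
        preds = []
        for i in range(len(X)):
            votes = {}
            # Weighted vote among the experts whose regions are closest to the point.
            for r in np.argsort(dist[i])[:self.n_votes]:
                label = self.experts_[r].predict(X[i:i + 1])[0]
                votes[label] = votes.get(label, 0.0) + self.weights_[r]
            preds.append(max(votes, key=votes.get))
        return np.array(preds)

Under these assumptions the ensemble is used like any classifier, e.g. RegionalExpertEnsemble(n_regions=10, n_votes=3).fit(X_train, y_train).predict(X_test); swapping the decision tree for an SMO-trained SVM or Naive Bayes would mirror the other base learners mentioned in the abstract.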

Keywords

References

  1. T. G. Dietterich, "Ensemble methods in machine learning," LNCS, Vol.1857, pp. 1-15, 2000
  2. E. Bauer and R. Kohavi, "An empirical comparison of voting classification algorithms: bagging, boosting, and variants," Machine Learning, Vol.36, No.1-2, pp. 105-142, 1999 https://doi.org/10.1023/A:1007515423169
  3. T. G. Dietterich, "An experimental comparison of three methods for constructing ensembles of decision trees: bagging, boosting, and randomization," Machine Learning, Vol.40, No.2, pp. 139-157, 2000 https://doi.org/10.1023/A:1007607513941
  4. D. Opitz and R. Maclin, "Popular ensemble methods: an empirical study," Journal of Artificial Intelligence Research, Vol.11, pp. 169-198, 1999
  5. L. Breiman, "Bagging predictors," Machine Learning, Vol.24, No.2, pp. 123-140, 1996 https://doi.org/10.1023/A:1018054314350
  6. Y. Freund and R. E. Schapire, "Experiments with a new boosting algorithm," Proc. 13th International Conf. on Machine Learning, pp. 148-156, 1996
  7. L. Hansen and P. Salamon, "Neural network ensembles," IEEE Trans. PAMI, Vol.12, pp. 993-1001, 1990 https://doi.org/10.1109/34.58871
  8. R. O. Duda, P. E. Hart and D. G. Stork, Pattern Classification, 2nd ed., Wiley-Interscience, 2000
  9. G. Valentini, M. Muselli and F. Ruffino, "Bagged Ensembles of SVMs for Gene Expression Data Analysis," The IEEE-INNS-ENNS International Joint Conference on Neural Networks, pp. 1844-1849, 2003 https://doi.org/10.1109/IJCNN.2003.1223688
  10. I. Buciu, C. Kotropoulos and I. Pitas, "Combining support vector machines for accurate face detection," Proc. ICIP, pp. 1054-1057, 2001
  11. T. Evgeniou, L. Perez-Breva, M. Pontil and T. Poggio, "Bounds on the generalization performance of kernel machine ensembles," Proc. ICML, pp. 271-278, 2000
  12. J. Quinlan, C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993
  13. J. Quinlan, "Induction of Decision Trees," Machine Learning, Vol.1, No.1, pp. 81-106, 1986 https://doi.org/10.1023/A:1022643204877
  14. J. Platt, "Fast training of support vector machines using sequential minimal optimization," in Advances in Kernel Methods, B. Scholkopf, C. Burges and A. Smola (eds.), The MIT Press, pp. 185-208, 1999
  15. I. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques, 2nd ed., Morgan Kaufmann, San Francisco, 2005
  16. C. Blake and C. Merz, UCI Repository of Machine Learning Databases, http://www.ics.uci.edu/~mlearn/MLRepository.html, 1998
  17. Y. Freund and R. Schapire, "A decision-theoretic generalization of on-line learning and an application to boosting," Journal of Computer and System Sciences, Vol.55, pp. 119-139, 1997 https://doi.org/10.1006/jcss.1997.1504