
Ensemble Learning of Region Based Classifiers


  • 최성하 (Department of Computer Science, Graduate School, Sogang University) ;
  • 이병우 (Department of Computer Science, Graduate School, Sogang University) ;
  • 양지훈 (Department of Computer Science, Sogang University)
  • Published: 2007.08.31

Abstract

In machine learning, ensemble classifiers, which combine a set of individual classifiers, have been shown to achieve higher accuracy than single classifiers. We propose a new ensemble learning method that employs a set of region-based classifiers. Since the distribution of the data can differ across regions of the feature space, we split the training data into regions, build a classifier for each region, and combine the classifiers with a weighted vote. To evaluate the proposed method, we compared its performance with that of individual classifiers as well as existing ensemble methods such as bagging and boosting on 11 data sets from the UCI Machine Learning Repository. The results show that our method improves accuracy, particularly when the base learner is Naive Bayes or SVM.
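
To make the idea concrete, here is a minimal Python sketch of a region-based ensemble, assuming scikit-learn. The class name RegionEnsemble, the use of k-means to split the feature space into regions, and the inverse-distance voting weights are illustrative assumptions for this sketch, not the paper's exact formulation.

```python
# Minimal sketch of a region-based ensemble (assumptions: k-means regions,
# inverse-distance voting weights; not the paper's exact method).
import numpy as np
from sklearn.cluster import KMeans
from sklearn.naive_bayes import GaussianNB
from sklearn.base import clone

class RegionEnsemble:
    def __init__(self, base_estimator=GaussianNB(), n_regions=5):
        self.base_estimator = base_estimator
        self.n_regions = n_regions

    def fit(self, X, y):
        # Split the training data into regions of the feature space.
        self.kmeans_ = KMeans(n_clusters=self.n_regions, n_init=10).fit(X)
        self.classes_ = np.unique(y)
        self.classifiers_ = []
        for r in range(self.n_regions):
            mask = self.kmeans_.labels_ == r
            clf = clone(self.base_estimator)
            # Train one base classifier on each region's data.
            clf.fit(X[mask], y[mask])
            self.classifiers_.append(clf)
        return self

    def predict(self, X):
        # Weight each regional classifier's vote by the inverse of the
        # distance from the query point to that region's centroid.
        dists = self.kmeans_.transform(X)          # (n_samples, n_regions)
        weights = 1.0 / (dists + 1e-9)
        votes = np.zeros((X.shape[0], len(self.classes_)))
        for r, clf in enumerate(self.classifiers_):
            preds = clf.predict(X)
            for ci, c in enumerate(self.classes_):
                votes[:, ci] += weights[:, r] * (preds == c)
        return self.classes_[np.argmax(votes, axis=1)]
```

As a usage sketch, RegionEnsemble(GaussianNB(), n_regions=5).fit(X_train, y_train).predict(X_test) would mirror the paper's setup of training region-based Naive Bayes classifiers and combining them by a region-weighted vote; swapping in an SVM base learner follows the same pattern.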


References

  1. Bauer, E. & Kohavi, R., 'An Empirical Comparison of Voting Classification Algorithms: Bagging, Boosting, and Variants', Machine Learning, 36(1-2), pp. 105-142, 1999 https://doi.org/10.1023/A:1007515423169
  2. Blake, C. & Merz, C., UCI Repository of Machine Learning Databases, http://www.ics.uci.edu/~mlearn/MLRepository.html, 1998
  3. Breiman, L., 'Bias, Variance, and Arcing Classifiers', Technical Report 460, Statistics Department, UC Berkeley, 1996
  4. Breiman, L., 'Bagging Predictors', Machine Learning, 24(2), pp. 123-140, 1996 https://doi.org/10.1023/A:1018054314350
  5. Dietterich, T., 'An Experimental Comparison of Three Methods for Constructing Ensembles of Decision Trees: Bagging, Boosting, and Randomization', Machine Learning, 40(2), pp. 139-157, 2000 https://doi.org/10.1023/A:1007607513941
  6. Dietterich, T., 'Ensemble Methods in Machine Learning', In J. Kittler and F. Roli (Eds.), First International Workshop on Multiple Classifier Systems, Lecture Notes in Computer Science, pp. 1-15, 2000 https://doi.org/10.1007/3-540-45014-9_1
  7. Dietterich, T., 'Ensemble Learning', In The Handbook of Brain Theory and Neural Networks, Second edition, The MIT Press, pp. 405-408, 2002
  8. Freund, Y. & Schapire, R., 'Experiments with a new boosting algorithm', In Proc. of the Thirteenth International Conference on Machine Learning, pp. 148-156, 1996
  9. Freund, Y. & Schapire, R., 'A Decision-Theoretic Generalization of On-Line Learning and an Application to Boosting', Journal of Computer and System Science, 55, pp. 119-139, 1997 https://doi.org/10.1006/jcss.1997.1504
  10. Friedman, J., Hastie, T. & Tibshirani, R., 'Additive Logistic Regression: a Statistical View of Boosting', Annals of Statistics, 28(2), pp. 337-374, 2000 https://doi.org/10.1214/aos/1016218223
  11. Hansen, L. & Salamon, P., 'Neural Network Ensembles', IEEE Transaction on Pattern Analysis and Machine Intelligence, 12, pp. 993-1001, 1990 https://doi.org/10.1109/34.58871
  12. Kuncheva, L. & Whitaker, C., 'Measures of Diversity in Classifier Ensembles', Machine Learning, 51, pp. 181-207, 2003 https://doi.org/10.1023/A:1022859003006
  13. Opitz, D. & Maclin, R., 'Popular Ensemble Methods: An Empirical Study', Journal of Artificial Intelligence Research, 11, pp. 169-198, 1999 https://doi.org/10.1613/jair.614
  14. Platt, J., 'Fast Training of Support Vector Machines Using Sequential Minimal Optimization', In Advances in Kernel Methods: Support Vector Learning, chapter 12, pp. 185-208, The MIT Press, 1999
  15. Quinlan, J., 'Induction of Decision Trees', Machine Learning, 1(1), pp. 81-106, 1986 https://doi.org/10.1023/A:1022643204877
  16. Quinlan, J., C4.5: Programs for Machine Learning, Morgan Kaufmann, 1993
  17. Quinlan, J., 'Bagging, Boosting, and C4.5', In Proc. of the Thirteenth National Conference on Artificial Intelligence, pp. 725-730, 1996
  18. Witten, I. & Frank, E., Data Mining: Practical Machine Learning Tools and Techniques, Second edition, Morgan Kaufmann, 2005