Solving Multi-class Problem using Support Vector Machines

Support Vector Machines을 이용한 다중 클래스 문제 해결

  • Jaepil Ko (고재필), Department of Computer Engineering, Kumoh National Institute of Technology
  • Published : 2005.12.01

Abstract

The Support Vector Machine (SVM) is well known as a representative kernel-based learner. Grounded in statistical learning theory, SVMs show good generalization performance and have been applied to a variety of pattern recognition problems. However, an SVM is fundamentally a two-class classifier, so a binary SVM cannot be applied directly to a multi-class problem. One-Per-Class (OPC) and All-Pairs have been used to apply SVMs to face recognition, which is a multi-class problem. Both are output coding methods, a general approach to solving a multi-class problem with multiple binary classifiers: the multi-class problem is decomposed into a set of binary problems, and the outputs of the binary classifiers trained on those problems are then combined into a final decision. In this paper, we introduce output coding as an approach for extending the binary SVM to a multi-class SVM, and we propose new output coding schemes based on Error-Correcting Output Codes (ECOC), the dominant theoretical foundation of output coding methods. Through face recognition experiments, we report empirical results on the properties of output coding methods, including the proposed ones.

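Support Vector Machines (SVM) have recently attracted attention in machine learning as a representative learner based on kernel machines. Built on statistical learning theory, SVM shows excellent generalization performance and is being applied to a variety of pattern recognition problems. However, because SVM is a binary classifier, it cannot be applied directly to general multi-class problems. One-Per-Class and All-Pairs are the representative methods for bringing SVM to face recognition, which is one such multi-class problem. Both belong to a general approach called output coding, which splits a multi-class problem into several binary problems and then combines their results to reach a final decision. This paper describes the output coding methodology as a way of extending the binary SVM classifier to a multi-class classifier. It also proposes new output coding methods built on ECOC (Error-Correcting Output Codes), the representative theoretical basis of output coding, and uses face recognition experiments to compare and analyze the characteristics of output coding methods when SVM is the base classifier.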
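
The decompose-then-combine idea behind output coding can be illustrated with a minimal Python sketch. It is not the paper's implementation and does not reproduce the proposed coding schemes. The function names are hypothetical, the binary outputs are hard-coded stand-ins for trained SVM decisions, and the conventions are assumptions: code matrix entries are +1, -1, or 0 (class not used by that binary problem), and decoding uses the generalized Hamming distance in which a 0 entry counts as half a disagreement. The exhaustive code shown for contrast is the classic ECOC construction of Dietterich and Bakiri.

```python
from itertools import combinations, product


def one_per_class_matrix(n_classes):
    """One column per class: +1 for that class, -1 for all others."""
    return [[1 if c == j else -1 for j in range(n_classes)]
            for c in range(n_classes)]


def all_pairs_matrix(n_classes):
    """One column per class pair (a, b): +1 for a, -1 for b, 0 elsewhere."""
    pairs = list(combinations(range(n_classes), 2))
    return [[1 if c == a else -1 if c == b else 0 for (a, b) in pairs]
            for c in range(n_classes)]


def exhaustive_matrix(n_classes):
    """Every distinct two-way split of the classes (class 0 fixed at +1)."""
    columns = [(1,) + rest
               for rest in product((1, -1), repeat=n_classes - 1)
               if -1 in rest]
    return [[col[c] for col in columns] for c in range(n_classes)]


def distance(codeword, outputs):
    """Generalized Hamming distance; a 0 entry contributes 0.5."""
    return sum((1 - bit * out) / 2 for bit, out in zip(codeword, outputs))


def decode(code_matrix, outputs):
    """Pick the class whose codeword is closest to the binary outputs."""
    return min(range(len(code_matrix)),
               key=lambda c: distance(code_matrix[c], outputs))


def min_row_distance(code_matrix):
    """Smallest distance between any two codewords (row separation)."""
    return min(distance(r1, r2) for r1, r2 in combinations(code_matrix, 2))


# Hard-coded +1/-1 outputs standing in for four binary SVM decisions.
opc = one_per_class_matrix(4)
print(decode(opc, [-1, -1, 1, -1]))      # 2

# All-Pairs: six pairwise outputs in which class 2 wins all of its contests.
pairs = all_pairs_matrix(4)
print(decode(pairs, [1, -1, 1, -1, -1, 1]))  # 2

# Exhaustive code: rows differ in 4 of 7 columns, so one flipped
# binary output still decodes to the right class.
ecoc = exhaustive_matrix(4)
noisy = list(ecoc[2])
noisy[0] = -noisy[0]
print(decode(ecoc, noisy))               # 2
```

In this sketch the One-Per-Class rows are only two positions apart, so a single wrong binary output can already produce a tie, whereas the wider row separation of the exhaustive code absorbs one such error. That trade-off between the number of binary classifiers and the separation of codewords is what error-correcting output codes are designed around.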
