Feature Selection for Multi-Class Genre Classification using Gaussian Mixture Model

Feature Vector Selection Algorithm for Multi-Class Classification Using a Gaussian Mixture Model

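  • 문선국 (Digital Signal Processing Laboratory, Department of Electrical and Electronic Engineering, Yonsei University)
  • 최택성 (Digital Signal Processing Laboratory, Department of Electrical and Electronic Engineering, Yonsei University)
  • 박영철 (Division of Computer and Telecommunications Engineering, Yonsei University)
  • 윤대희 (Digital Signal Processing Laboratory, Department of Electrical and Electronic Engineering, Yonsei University)
  • Published: 2007.10.31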

Abstract

In this paper, we propose a feature selection algorithm for multi-class genre classification. The algorithm introduces a GMM separation score, based on the Gaussian mixture model (GMM), to measure the separability between two genres. We also improve the feature subset selection algorithm based on sequential forward selection for the multi-class case: instead of using the separability measured over all genres as the criterion, each sequential selection step uses the separability of the worst-separated genres. To assess the performance of the proposed algorithm, we extracted various features that represent characteristics such as timbre, rhythm, and pitch. We then compared the classification performance of a GMM classifier and a k-NN classifier on features selected by the conventional algorithm and by the proposed algorithm. The proposed algorithm improved classification accuracy by up to 10 percent, especially in classification experiments with low-dimensional feature vectors.

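This paper proposes a feature vector selection algorithm for multiple classes in a content-based music genre classification system. When measuring separability, the proposed algorithm computes a GMM separation score based on the Gaussian mixture model (GMM), which improves the accuracy of the estimated probability distributions and of the estimated separability. It also improves the sequential forward selection method by choosing the next feature vector with respect to the genres that the feature vectors selected so far fail to separate well, which improves multi-class classification performance. To verify the performance of the proposed algorithm, various parameters representing characteristics of the audio signal, such as timbre, rhythm, and pitch, were extracted from audio signals. Feature vectors were then selected with the proposed algorithm and with the conventional algorithm, and classification performance was evaluated using a GMM classifier and a k-NN classifier. The proposed algorithm improved classification performance by about 3% to 8% over the conventional algorithm, and in classification experiments with low-dimensional feature vectors in particular it improved classification accuracy by 5% to 10%.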
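The worst-case selection criterion described in the abstract can be illustrated with a minimal sketch. It shows a greedy sequential forward selection in which each step adds the feature that maximizes the separability of the hardest-to-separate genre pair, rather than a separability total over all pairs. Everything named here is an assumption for illustration: the abstract does not give the formula of the GMM separation score, so `pair_score` is a hypothetical stand-in that models each class with a single diagonal Gaussian, and `worst_pair_sfs`, the genre labels, and the toy values are invented.

```python
from itertools import combinations
from statistics import fmean, pvariance


def pair_score(rows_a, rows_b, cols):
    """Stand-in separability of two classes on feature columns `cols`.

    NOT the paper's GMM separation score (its formula is not in the
    abstract). Each class is treated as one diagonal Gaussian: the score
    sums, per feature, the squared mean gap over the pooled variance.
    """
    total = 0.0
    for c in cols:
        a = [row[c] for row in rows_a]
        b = [row[c] for row in rows_b]
        pooled_var = 0.5 * (pvariance(a) + pvariance(b)) + 1e-12
        total += (fmean(a) - fmean(b)) ** 2 / pooled_var
    return total


def worst_pair_sfs(data, n_select, score=pair_score):
    """Greedy forward selection that maximizes the worst class-pair score.

    `data` maps each class label to a list of equal-length feature rows.
    Returns the selected feature indices in selection order.
    """
    pairs = list(combinations(sorted(data), 2))
    n_features = len(next(iter(data.values()))[0])
    selected, remaining = [], list(range(n_features))
    while remaining and len(selected) < n_select:
        def worst(f):
            # Criterion: score of the hardest pair if feature f is added.
            cols = selected + [f]
            return min(score(data[a], data[b], cols) for a, b in pairs)

        best = max(remaining, key=worst)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (made-up values): three classes, three features.
spread = [-1.0, -0.5, 0.0, 0.5, 1.0]
means = {"rock": (0, 0, 0), "jazz": (10, 2, 0), "pop": (11, 4, 0)}
toy = {g: [[m + s for m in mu] for s in spread] for g, mu in means.items()}
print(worst_pair_sfs(toy, 2))  # prints [1, 0]
```

In this toy setup feature 0 separates "rock" from the other two classes very strongly but barely separates "jazz" from "pop", so a criterion that sums scores over all pairs would pick it first. The worst-pair criterion picks feature 1 first because it gives the weakest pair the largest margin. A GMM-based score could be substituted through the `score` argument.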
