Musical Genre Classification System Based on Multiple-Octave Bands

  • Byun, Karam (Department of Information and Communication Engineering, Sejong University) ;
  • Kim, Moo Young (Department of Information and Communication Engineering, Sejong University)
  • Received : 2013.11.04
  • Reviewed : 2013.11.27
  • Published : 2013.12.25

Abstract

Musical genre classification draws on many types of feature vectors. Widely used short-term features include the mel-frequency cepstral coefficient (MFCC), decorrelated filter bank (DFB), and octave-based spectral contrast (OSC), and their long-term variations are used alongside them. In this paper, OSC features are extracted not only from single-octave bands but also from multiple-octave bands, so that the correlation between octave bands is captured as well. As the baseline, we take the genre classification system that placed fourth in the mixed-genre classification task of the 2012 Music Information Retrieval Evaluation eXchange (MIREX). Adding the multiple-octave-band OSC features to this baseline improves classification accuracy by 0.40 and 3.15 percentage points on the GTZAN and Ballroom databases, respectively.
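The idea of computing spectral contrast over both single and merged octave bands can be illustrated with a minimal sketch. The abstract does not give the band edges, the peak/valley neighborhood fraction, or how adjacent bands are combined, so everything below is an assumption for illustration: the function names (`band_contrast`, `octave_edges`, `osc_features`, `multi_octave_osc`), the parameters `alpha`, `n_bands`, and `spans`, and the choice to merge adjacent bands in a sliding window are hypothetical and are not taken from the paper.

```python
import math

def band_contrast(mags, alpha=0.2, eps=1e-10):
    """Return (valley, contrast) in the log domain for one sub-band.

    Peak and valley are the means of the largest and smallest
    alpha-fraction of magnitudes (alpha is an assumed value).
    """
    ordered = sorted(mags)
    k = max(1, int(round(alpha * len(ordered))))
    valley = math.log(sum(ordered[:k]) / k + eps)
    peak = math.log(sum(ordered[-k:]) / k + eps)
    return valley, peak - valley

def octave_edges(n_bins, n_bands):
    """Bin edges whose upper limit doubles from band to band.

    Assumes n_bins >= 2 ** (n_bands - 1) so that no band is empty.
    """
    return [0] + [n_bins // 2 ** (n_bands - 1 - b) for b in range(n_bands)]

def osc_features(spectrum, n_bands=4, span=1, alpha=0.2):
    """Contrast features over windows of `span` adjacent octave bands.

    span=1 gives ordinary single-octave features; span=2 merges each
    pair of neighboring bands (an assumed way to form multiple-octave bands).
    """
    edges = octave_edges(len(spectrum), n_bands)
    feats = []
    for b in range(n_bands - span + 1):
        feats.extend(band_contrast(spectrum[edges[b]:edges[b + span]], alpha))
    return feats

def multi_octave_osc(spectrum, n_bands=4, spans=(1, 2), alpha=0.2):
    """Concatenate single-octave and multiple-octave contrast features."""
    feats = []
    for span in spans:
        feats.extend(osc_features(spectrum, n_bands, span, alpha))
    return feats
```

Here `spectrum` is the magnitude spectrum of one short-term frame. With the assumed defaults, a 64-bin frame yields 8 single-octave values plus 6 values from merged neighboring bands. A merged band sees peaks and valleys that straddle an octave boundary, which is one way inter-band structure could enter the feature vector.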
