Automatic Emotion Classification of Music Signals Using MDCT-Driven Timbre and Tempo Features

  • Kim, Hyoung-Gook (Samsung Advanced Institute of Technology Computing Lab)
  • Eom, Ki-Wan (Samsung Advanced Institute of Technology Computing Lab)
  • Published : 2006.06.01

Abstract

This paper proposes an effective method for classifying the emotion of music from its acoustic signal. Two feature sets, timbre and tempo, are extracted directly from the modified discrete cosine transform (MDCT) coefficients output by a partial MP3 (MPEG-1 Layer 3) decoder. The tempo feature extraction is based on long-term modulation spectrum analysis. To combine these two feature sets, which have different time resolutions, in one integrated system, a two-layer classifier based on the AdaBoost algorithm is used. The first layer employs the MDCT-driven timbre features. Adding the MDCT-driven tempo feature in the second layer improves the classification precision dramatically.
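
The abstract does not spell out how the tempo feature is computed, so the following is only a minimal sketch of the general idea behind long-term modulation spectrum analysis, not the authors' implementation. It assumes the partial decoder yields one list of MDCT coefficients per granule, that a granule covers 576 samples (about 76.6 granules per second at 44.1 kHz), and that the per-granule energy is a usable envelope. The function names `granule_energies`, `modulation_spectrum`, and `estimate_tempo` are hypothetical.

```python
import math


def granule_energies(mdct_granules):
    """Energy envelope: sum of squared MDCT coefficients per granule (assumed layout)."""
    return [sum(c * c for c in granule) for granule in mdct_granules]


def modulation_spectrum(envelope, frame_rate, bpm_candidates):
    """Magnitude of the envelope's Fourier transform at each candidate tempo.

    envelope       -- one energy value per granule
    frame_rate     -- granules per second (assumption: sample_rate / 576)
    bpm_candidates -- tempi in beats per minute at which to evaluate the spectrum
    """
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [x - mean for x in envelope]  # remove the DC component
    magnitudes = []
    for bpm in bpm_candidates:
        omega = 2.0 * math.pi * (bpm / 60.0) / frame_rate  # radians per granule
        re = sum(x * math.cos(omega * t) for t, x in enumerate(centered))
        im = sum(x * math.sin(omega * t) for t, x in enumerate(centered))
        magnitudes.append(math.hypot(re, im) / n)
    return magnitudes


def estimate_tempo(envelope, frame_rate, bpm_candidates):
    """Return the candidate tempo with the strongest modulation energy."""
    magnitudes = modulation_spectrum(envelope, frame_rate, bpm_candidates)
    best = max(range(len(magnitudes)), key=magnitudes.__getitem__)
    return bpm_candidates[best]
```

Calling `estimate_tempo(granule_energies(granules), 44100 / 576, list(range(60, 181)))` on several seconds of granules would return the dominant modulation rate in beats per minute. The analysis window has to span many beats, which is why such a tempo feature has a much coarser time resolution than frame-level timbre features.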
