Speech Emotion Recognition Based on GMM Using FFT and MFB Spectral Entropy

FFT와 MFB Spectral Entropy를 이용한 GMM 기반의 감정인식

  • Lee, Woo-Seok (이우석) (School of Information and Communication Engineering, Sungkyunkwan University) ;
  • Roh, Yong-Wan (노용완) (School of Information and Communication Engineering, Sungkyunkwan University) ;
  • Hong, Kwang-Seok (홍광석) (School of Information and Communication Engineering, Sungkyunkwan University)
  • Published: 2008.04.25

Abstract

This paper proposes a Gaussian Mixture Model (GMM)-based speech emotion recognition method using four feature parameters: 1) Fast Fourier Transform (FFT) spectral entropy, 2) delta FFT spectral entropy, 3) Mel-frequency Filter Bank (MFB) spectral entropy, and 4) delta MFB spectral entropy. The speech database covers four emotions: anger, sadness, happiness, and neutrality. We perform speech emotion recognition experiments for each pre-defined emotion and for each gender. The experimental results show that the proposed emotion recognition using FFT spectral entropy and MFB spectral entropy performs better than existing GMM-based emotion recognition using energy, Zero Crossing Rate (ZCR), Linear Prediction Coefficient (LPC), and pitch parameters. We attained a maximum recognition rate of 75.1% when using MFB spectral entropy and delta MFB spectral entropy.
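
The abstract names the four feature parameters but does not give their formulas. The following is a minimal sketch of one common way such features could be computed, not the authors' implementation. It assumes that spectral entropy is the Shannon entropy of the normalized power spectrum of each frame, that MFB spectral entropy applies the same entropy to triangular mel filter bank energies, and that the delta features are simple frame-to-frame differences. The function names, the frame length (400 samples), hop (160 samples), FFT size (512), and filter count (24) are all illustrative assumptions.

```python
import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping Hamming-windowed frames (sizes are assumed)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)

def entropy(energies, eps=1e-12):
    """Shannon entropy (bits) of each row after normalizing it to sum to 1."""
    p = energies / (energies.sum(axis=-1, keepdims=True) + eps)
    return -(p * np.log2(p + eps)).sum(axis=-1)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale (a standard construction)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    fb = np.zeros((n_filters, len(freqs)))
    for i in range(n_filters):
        lo, mid, hi = hz_pts[i], hz_pts[i + 1], hz_pts[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def entropy_features(x, sr=16000, n_fft=512, n_filters=24):
    """Return an (n_frames, 4) array: FFT entropy, its delta, MFB entropy, its delta."""
    frames = frame_signal(x)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2
    fft_ent = entropy(power)
    mfb_ent = entropy(power @ mel_filterbank(n_filters, n_fft, sr).T)
    # Assumed delta: first-order difference between consecutive frames (0 for the first frame).
    delta = lambda v: np.diff(v, prepend=v[0])
    return np.stack([fft_ent, delta(fft_ent), mfb_ent, delta(mfb_ent)], axis=1)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    noise = rng.standard_normal(16000)                       # flat spectrum
    tone = np.sin(2 * np.pi * 1000 * np.arange(16000) / 16000)  # peaked spectrum
    print(entropy_features(noise)[:, 0].mean(), entropy_features(tone)[:, 0].mean())
```

Under these assumptions, a signal with a flat spectrum (such as white noise) yields a higher FFT spectral entropy than a signal with a sharply peaked spectrum (such as a pure tone). The resulting per-frame feature vectors could then be used to train one GMM per emotion, with classification by maximum likelihood.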

Keywords