
Comparison of feature parameters for emotion recognition using speech signal  

Won-Gu Kim (School of Electronics and Information Engineering, Kunsan National University)
Abstract
This paper compares feature parameters for emotion recognition from the speech signal. A corpus of emotional speech, recorded and labeled by emotion through subjective evaluation, was used to build statistical feature vectors such as the mean, standard deviation, and maximum of pitch and energy, as well as phonetic features such as MFCC parameters. To evaluate these feature parameters, a speaker- and context-independent emotion recognition system was constructed for the experiments. Pitch and energy parameters and their derivatives served as prosodic information, while MFCC parameters and their derivatives served as phonetic information. Experimental results with a vector-quantization-based emotion recognition system showed that the system using MFCC parameters and their derivatives outperformed the one using pitch and energy parameters.
Keywords
MFCC
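The abstract's pipeline, statistical prosodic feature vectors fed to a vector-quantization (VQ) classifier with one codebook per emotion, can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's implementation: the paper's exact contour extraction, derivative features, codebook sizes, and training procedure are not specified here, and all function names are hypothetical.

```python
import numpy as np


def prosodic_features(pitch, energy):
    """Statistical feature vector from per-frame pitch and energy contours:
    mean, standard deviation, and maximum of each (as named in the abstract).
    Derivative features used in the paper are omitted for brevity."""
    feats = []
    for contour in (pitch, energy):
        contour = np.asarray(contour, dtype=float)
        feats.extend([contour.mean(), contour.std(), contour.max()])
    return np.array(feats)


def train_codebooks(features_by_emotion, codebook_size=4, iters=20, seed=0):
    """Train one VQ codebook per emotion with plain k-means (Lloyd's
    algorithm); the paper's actual codebook design may differ."""
    rng = np.random.default_rng(seed)
    codebooks = {}
    for emotion, X in features_by_emotion.items():
        X = np.asarray(X, dtype=float)
        # Initialize centers from randomly chosen training vectors.
        idx = rng.choice(len(X), size=min(codebook_size, len(X)), replace=False)
        centers = X[idx]
        for _ in range(iters):
            # Assign each vector to its nearest center, then update centers.
            d = np.linalg.norm(X[:, None] - centers[None], axis=2)
            assign = d.argmin(axis=1)
            for k in range(len(centers)):
                if np.any(assign == k):
                    centers[k] = X[assign == k].mean(axis=0)
        codebooks[emotion] = centers
    return codebooks


def classify(codebooks, x):
    """Label a feature vector with the emotion whose codebook gives the
    smallest quantization distortion (minimum distance to any codeword)."""
    x = np.asarray(x, dtype=float)
    return min(codebooks,
               key=lambda e: np.linalg.norm(codebooks[e] - x, axis=1).min())
```

In this scheme, swapping the prosodic vectors for per-frame MFCC vectors (and their derivatives) changes only the input to `train_codebooks`, which is what makes the abstract's head-to-head comparison of parameter sets straightforward.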