Recognition of Emotion and Emotional Speech Based on Prosodic Processing

Kim, Sung-Ill (Division of Electronic and Electrical Engineering, College of Engineering, Kyungnam University)

Publication Information

The Journal of the Acoustical Society of Korea, v.23, no.3E, 2004, pp. 85-90

Abstract

This paper presents two new approaches: one addresses recognition of emotional speech, that is, speech uttered in states such as anger, happiness, normal, sadness, or surprise; the other addresses recognition of emotion in speech. For the proposed speech recognition system, which handles human speech with emotional states, a total of nine prosodic features were first extracted and then passed to a prosodic identifier. In the evaluation, the recognition rates on emotional speech obtained with the proposed method were higher than those of the existing speech recognizer. For emotion recognition, on the other hand, four prosodic parameters, namely pitch, energy, and their derivatives, were proposed and then used to train discrete duration continuous hidden Markov models (DDCHMM) for recognition. In this approach, the emotional models were adapted to a specific speaker's speech using maximum a posteriori (MAP) estimation. In the evaluation, the recognition rates on vocal emotions increased gradually as the number of adaptation samples increased.
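
As a rough illustration of the MAP adaptation step mentioned in the abstract, the sketch below applies the standard MAP update for a single Gaussian mean to a four-dimensional prosodic feature vector (pitch, energy, and their derivatives). It is not the paper's implementation: the abstract does not say which DDCHMM parameters are adapted, and the function name `map_adapt_mean`, the prior weight `tau`, and all numeric values are assumptions chosen only for illustration.

```python
from typing import List, Sequence


def map_adapt_mean(prior_mean: Sequence[float],
                   samples: Sequence[Sequence[float]],
                   tau: float = 10.0) -> List[float]:
    """MAP estimate of a Gaussian mean under a conjugate Gaussian prior.

    mu_map = (tau * mu_prior + sum_i x_i) / (tau + n)

    tau (hypothetical value) controls how strongly the prior mean is
    trusted relative to the n adaptation samples.
    """
    n = len(samples)
    dim = len(prior_mean)
    sums = [0.0] * dim
    for x in samples:
        for d in range(dim):
            sums[d] += x[d]
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dim)]


# Hypothetical speaker-independent mean for one emotion model:
# [pitch (Hz), energy (dB), delta pitch, delta energy]
prior = [200.0, 60.0, 0.0, 0.0]

# Hypothetical frame from the target speaker, repeated to mimic
# a growing number of adaptation samples.
speaker_frame = [240.0, 66.0, 1.0, 0.5]

for n in (0, 10, 90):
    adapted = map_adapt_mean(prior, [speaker_frame] * n, tau=10.0)
    print(n, adapted)
```

With no adaptation samples the estimate equals the prior mean, and as the sample count grows it moves toward the speaker's own statistics. That behavior is at least consistent with the abstract's report that recognition rates rose with the number of adaptation samples, though the sketch itself makes no claim about accuracy.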

Keywords

Prosody; Emotional speech; Speech recognition; Emotion recognition; HMM
