http://dx.doi.org/10.5391/JKIIS.2005.15.6.655

Robust Speech Recognition Parameters for Emotional Variation  

Kim Weon-Goo (School of Electronic and Information Engineering, Kunsan National University)
Publication Information
Journal of the Korean Institute of Intelligent Systems, v.15, no.6, 2005, pp. 655-660
Abstract
This paper studies feature parameters that are less affected by emotional variation, with the aim of developing robust speech recognition technology. To this end, the effect of emotional variation on a speech recognition system, and feature parameters robust to that variation, were studied using a speech database containing various emotions. LPC cepstral coefficients, mel-cepstral coefficients, root-cepstral coefficients, PLP coefficients, and RASTA mel-cepstral coefficients were used as feature parameters, and the CMS and SBR methods were used as signal bias removal techniques. Experimental results showed that the HMM-based speaker-independent word recognizer using RASTA mel-cepstral coefficients and their derivatives, with CMS for signal bias removal, performed best, with a word error rate of $7.05\%$. This corresponds to a word error reduction of about $52\%$ compared with the baseline system using mel-cepstral coefficients.
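As a rough illustration of the bias removal step named in the abstract, the sketch below assumes that CMS stands for cepstral mean subtraction, its usual meaning in the speech recognition literature: the per-utterance mean of each cepstral coefficient is subtracted from every frame. The function names, array shapes, and synthetic data are hypothetical and are not taken from the paper. The second helper only works out the baseline error rate implied by the two figures quoted above, on the assumption that the $52\%$ figure is a relative reduction.

```python
import numpy as np


def cepstral_mean_subtraction(cepstra: np.ndarray) -> np.ndarray:
    """Subtract the per-utterance mean from each cepstral coefficient.

    cepstra is assumed to have shape (num_frames, num_coeffs).
    """
    return cepstra - cepstra.mean(axis=0, keepdims=True)


def implied_baseline_wer(wer: float, relative_reduction: float) -> float:
    """Baseline error rate implied by an error rate and a relative reduction."""
    return wer / (1.0 - relative_reduction)


# Hypothetical utterance: 200 frames of 12 cepstral coefficients,
# shifted by a constant bias that stands in for a fixed channel effect.
rng = np.random.default_rng(0)
clean = rng.normal(size=(200, 12))
bias = rng.normal(size=(1, 12))
normalized = cepstral_mean_subtraction(clean + bias)

# Baseline implied by the abstract's figures (7.05% after a 52% relative reduction).
baseline = implied_baseline_wer(7.05, 0.52)
```

Because the bias is constant over the utterance, `normalized` equals the result of applying the same function to `clean` alone, which is the property that makes mean subtraction useful against fixed convolutional distortion. Under the stated assumption, `baseline` comes out at roughly 14.7; since the abstract gives the reduction only as "about" $52\%$, that value is indicative at best.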
Keywords
HMM; MFCC