
Utilizing Korean Ending Boundary Tones for Accurately Recognizing Emotions in Utterances  

Jang In-Chang (Department of Industrial Systems and Information Engineering, Korea University)
Lee Tae-Seung (CAD/CAM Research Center, Korea Institute of Science and Technology)
Park Mikyoung (CAD/CAM Research Center, Korea Institute of Science and Technology)
Kim Tae-Soo (CAD/CAM Research Center, Korea Institute of Science and Technology)
Jang Dong-Sik (Department of Industrial Systems and Information Engineering, Korea University)
Abstract
Autonomous machines interacting with humans should be able to perceive states of emotion and attitude conveyed through implicit messages in order to obtain voluntary cooperation from their clients. Voice is the easiest and most natural medium for exchanging human messages. Automatic systems that understand states of emotion and attitude have utilized features based on the pitch and energy of uttered sentences. The performance of existing emotion recognition systems can be further improved with the support of linguistic knowledge that specific tonal sections in a sentence are related to states of emotion and attitude. In this paper, we attempt to improve the emotion recognition rate by incorporating such linguistic knowledge about Korean ending boundary tones into an automatic system implemented using pitch-related features and multilayer perceptrons. Experiments on a Korean emotional speech database confirm an improvement of $4\%$.
Keywords
emotion recognition; human and computer interaction; intonation phonology; ending boundary tones; multilayer perceptrons;
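The abstract's approach rests on extracting pitch-related features from the utterance-final region, where Korean ending boundary tones are realized, before passing them to a classifier. A minimal sketch of that front end is shown below, assuming an autocorrelation pitch detector and a hypothetical 0.4 s tail window; the thresholds, window sizes, and feature choices are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def estimate_f0(frame, sr, fmin=75.0, fmax=400.0):
    """Estimate fundamental frequency of one frame via autocorrelation.
    Returns 0.0 for frames judged unvoiced (threshold is an assumption)."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    if ac[0] <= 0:
        return 0.0
    ac = ac / ac[0]                      # normalize so ac[0] == 1
    lo, hi = int(sr / fmax), min(int(sr / fmin), len(ac) - 1)
    lag = lo + int(np.argmax(ac[lo:hi])) # strongest lag in the F0 range
    return sr / lag if ac[lag] > 0.3 else 0.0

def boundary_tone_features(signal, sr, tail_sec=0.4, frame_sec=0.03):
    """Pitch statistics over the utterance-final section, where ending
    boundary tones occur: mean, spread, rise/fall, and range of F0."""
    tail = signal[-int(tail_sec * sr):]
    n = int(frame_sec * sr)
    f0 = [estimate_f0(tail[i:i + n], sr) for i in range(0, len(tail) - n, n)]
    f0 = np.array([f for f in f0 if f > 0])  # keep voiced frames only
    if f0.size == 0:
        return np.zeros(4)
    return np.array([f0.mean(), f0.std(), f0[-1] - f0[0], f0.max() - f0.min()])

# toy check: a steady 200 Hz tone should give a ~200 Hz mean and flat contour
sr = 16000
t = np.arange(int(0.6 * sr)) / sr
feats = boundary_tone_features(np.sin(2 * np.pi * 200 * t), sr)
```

The resulting four-dimensional vector is the kind of input a small multilayer perceptron, as used in the paper, could consume; a rising final contour (positive third feature) would, for instance, distinguish question-like boundary tones from falling declarative ones.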