Utilizing Korean Ending Boundary Tones for Accurately Recognizing Emotions in Utterances

  • 장인창 (Department of Industrial Information Systems Engineering, Korea University) ;
  • 이태승 (CAD/CAM Research Center, Korea Institute of Science and Technology) ;
  • 박미경 (CAD/CAM Research Center, Korea Institute of Science and Technology) ;
  • 김태수 (CAD/CAM Research Center, Korea Institute of Science and Technology) ;
  • 장동식 (Department of Industrial Information Systems Engineering, Korea University)
  • Published: 2005.06.01

Abstract

Autonomous machines that interact with humans should be able to perceive states of emotion and attitude conveyed through implicit signals, in order to obtain voluntary cooperation from their clients. Voice is the easiest and most natural means for humans to exchange information. Automatic systems capable of understanding states of emotion and attitude have so far relied on features based on the pitch and energy of uttered sentences. The performance of such existing emotion recognition systems can be further improved with the support of linguistic knowledge that specific tonal sections of a sentence are related to the states of emotion and attitude. In this paper, we improve the emotion recognition rate by applying such linguistic knowledge of Korean ending boundary tones to an automatic system implemented with pitch-related features and multilayer perceptrons. In an experiment on a Korean emotional speech database, an improvement of 4% in recognition rate is confirmed.
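The pipeline the abstract describes — pitch-related features extracted from the sentence-final (ending boundary tone) region, then passed to a classifier — can be sketched as follows. This is an illustrative sketch, not the authors' implementation: the autocorrelation pitch detector (in the spirit of ref. 13), the frame length, the tail fraction taken as the boundary-tone span, and the three-feature summary are all assumptions for demonstration.

```python
import numpy as np

def autocorr_pitch(frame, sr, fmin=75.0, fmax=400.0):
    """Estimate F0 of one frame by finding the autocorrelation peak
    within the plausible pitch-period range (cf. ref. 13)."""
    frame = frame - frame.mean()
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min = int(sr / fmax)          # shortest plausible period
    lag_max = int(sr / fmin)          # longest plausible period
    lag = lag_min + np.argmax(ac[lag_min:lag_max])
    return sr / lag

def ending_tone_features(signal, sr, frame_len=0.03, tail_frac=0.2):
    """Summarize the pitch contour over the sentence-final region
    (assumed here to be the last `tail_frac` of voiced frames):
    mean F0, linear slope per frame, and F0 range."""
    n = int(frame_len * sr)
    frames = [signal[i:i + n] for i in range(0, len(signal) - n, n)]
    f0 = np.array([autocorr_pitch(f, sr) for f in frames])
    tail = f0[int(len(f0) * (1 - tail_frac)):]   # ending boundary tone span
    slope = np.polyfit(np.arange(len(tail)), tail, 1)[0]
    return np.array([tail.mean(), slope, tail.max() - tail.min()])

# Synthetic "utterance" with a rising ending tone: F0 glides 120 -> 220 Hz.
sr = 16000
t = np.arange(sr) / sr
phase = 2 * np.pi * np.cumsum(120 + 100 * t) / sr
y = np.sin(phase)
feats = ending_tone_features(y, sr)   # positive slope indicates a rise
```

In a full system, such per-utterance feature vectors (alongside the global pitch and energy statistics the abstract mentions) would be fed to a multilayer perceptron trained on labeled emotional speech.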

References

  1. Cowie, R., Douglas-Cowie, E., Tsapatsoulis, N., Votsis, G., Kollias, S., Fellenz, W. and Taylor, J. G., 'Emotion Recognition in Human-Computer Interaction,' IEEE Signal Processing Magazine, Vol. 18, No. 1, pp. 32-80, Jan 2001 https://doi.org/10.1109/79.911197
  2. Gauvain, J. and Lamel, L., 'Large-Vocabulary Continuous Speech Recognition: Advances and Applications,' Proceedings of the IEEE, Vol. 88, No. 8, pp. 1181-1200, Aug 2000
  3. Yoshimura, T., Hayamizu, S., Ohmura, H., and Tanaka, K., 'Pitch Pattern Clustering of User Utterances in Human-Machine Dialogue,' Proceedings of the International Conference on Spoken Language, Vol. 2, pp. 837-840, Oct 1996
  4. Dellaert, F., Polzin, T., and Waibel, A., 'Recognizing Emotion in Speech,' Proceedings of the International Conference on Spoken Language, Vol. 3, pp. 1970-1973, Oct 1996
  5. Bhatti, M. W., Wang, Y., and Guan, L., 'A Neural Network Approach for Human Emotion Recognition in Speech,' Proceedings of the 2004 International Symposium on Circuits and Systems, Vol. 2, pp. 181-184, May 2004
  6. Schuller, B., Rigoll, G. and Lang, M., 'Hidden Markov Model-Based Speech Emotion Recognition,' Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 2, pp. 1-4, Apr 2003
  7. Schuller, B., Rigoll, G. and Lang, M., 'Speech Emotion Recognition Combining Acoustic Features and Linguistic Information in a Hybrid Support Vector Machine-Belief Network Architecture,' Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Vol. 1, pp. 577-580, May 2004
  8. O'Connor, J. D. and Arnold G. F., Intonation of Colloquial English, Longmans, 1961
  9. Jun, S., K-ToBI Labelling Conventions, Ver. 3.1, http://www.linguistics.ucla.edu/people/jun/ktobi/K-tobi.html, 2000
  10. Pierrehumbert, J. and Hirschberg, J., 'The Meaning of Intonation Contours in the Interpretation of Discourse,' Intentions in Communication, MIT Press, pp. 271-323, 1990
  11. 이호영, 'The Intonation System of Korean' (한국어의 억양체계), Eoneohag (언어학), No. 13, pp. 129-151, Dec 1991
  12. Rabiner, L. and Sambur, M., 'An Algorithm for Determining the Endpoints of Isolated Utterances,' Bell System Technical Journal, Vol. 54, pp. 297-315, Feb 1975
  13. Krubsack, D. A. and Niederjohn, R. J., 'An Autocorrelation Pitch Detector and Voicing Decision with Confidence Measures Developed for Noise-Corrupted Speech,' IEEE Transactions on Signal Processing, Vol. 39, No. 2, pp. 319-329, Feb 1991 https://doi.org/10.1109/78.80814
  14. Bengio, Y., Neural Networks for Speech and Sequence Recognition, International Thomson Computer Press, 1995.