Adaptive Speech Emotion Recognition Framework Using Prompted Labeling Technique

  • Received : 2014.10.01
  • Accepted : 2014.11.25
  • Published : 2015.02.15

Abstract

Traditional speech emotion recognition techniques recognize emotions using a general training model built from the voices of many different people. Such a model cannot accurately account for an individual user's speech characteristics, so recognition accuracy varies widely from person to person. This paper proposes an adaptive speech emotion recognition framework that uses a prompted labeling technique to collect immediate feedback from the user, builds a personalized recognition model from that feedback, and applies the model to each user in a mobile device environment. In three comparative experiments, the proposed framework outperformed traditional techniques. It can be applied to healthcare, emotion monitoring, and personalized services.

Existing speech-based emotion recognition techniques build a general-purpose training model from data collected from many users and recognize emotions against that model. Because such modeling cannot accurately capture an individual user's vocal characteristics, recognition accuracy varies widely from person to person. This paper proposes an adaptive speech-based emotion recognition framework for the smartphone environment that uses a prompted labeling technique to obtain immediate emotion feedback from the user, then builds and applies a new personalized model from that feedback. Experiments demonstrate that the proposed adaptive technique achieves substantially higher accuracy than the conventional general-purpose model.
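The adaptation loop the abstract describes could be sketched roughly as follows. This is a minimal illustration only: the paper does not specify its classifier or features, so the `PersonalEmotionModel` class, the nearest-centroid classifier, and the plain feature vectors (standing in for MFCC features) are all assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a prompted-labeling adaptation loop: the device
# predicts an emotion, prompts the user to confirm or correct it, and each
# user-labeled sample refines a per-user model. Feature extraction (e.g. MFCC)
# is abstracted as a plain feature vector; the nearest-centroid classifier is
# an illustrative stand-in for whatever model the framework actually uses.
from collections import defaultdict

class PersonalEmotionModel:
    def __init__(self):
        self.sums = defaultdict(lambda: None)   # per-emotion feature sums
        self.counts = defaultdict(int)          # per-emotion sample counts

    def update(self, features, label):
        """Fold one user-labeled sample into that emotion's centroid."""
        if self.sums[label] is None:
            self.sums[label] = list(features)
        else:
            self.sums[label] = [s + f for s, f in zip(self.sums[label], features)]
        self.counts[label] += 1

    def predict(self, features):
        """Return the emotion whose centroid is closest (squared Euclidean)."""
        best, best_d = None, float("inf")
        for label, s in self.sums.items():
            centroid = [v / self.counts[label] for v in s]
            d = sum((a - b) ** 2 for a, b in zip(centroid, features))
            if d < best_d:
                best, best_d = label, d
        return best

# Prompted-labeling loop: the user's confirmed answers adapt the model.
model = PersonalEmotionModel()
model.update([0.9, 0.1], "happy")   # user-confirmed sample
model.update([0.1, 0.8], "sad")     # user-corrected sample
print(model.predict([0.85, 0.2]))   # -> happy
```

The key design point, per the abstract, is that labels come from the user's own immediate feedback rather than from a generic corpus, so the centroids drift toward that individual's vocal characteristics over time.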

Acknowledgement

Supported by: National Research Foundation of Korea (NRF)

References

  1. A. B. Kandali, A. Routray, T. K. Basu, "Emotion recognition from Assamese speeches using MFCC features and GMM classifier," Proc. of the TENCON 2008, pp. 1-5, 2008. (in India)
  2. Z. Xiao, E. Dellandrea, L. Chen, W. Dou, "Recognition of emotions in speech by a hierarchical approach," Proc. of the 3rd International Conference on Affective Computing and Intelligent Interaction, pp. 1-8, 2009. (in the Netherlands)
  3. D. Morrison, R. Wang, L. C. De Silva, "Ensemble methods for spoken emotion recognition in call-centres," Speech Communication, Vol. 49, Issue 2, pp. 98-112, 2007. https://doi.org/10.1016/j.specom.2006.11.004
  4. Tauhidur Rahman, Carlos Busso, "A Personalized Emotion Recognition System Using an Unsupervised Feature Adaptation Scheme," Proc. of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2012, pp. 5117-5120, 2012. (in Japan)
  5. SangMin Ahn, MinCheol Whang, DongKeun Kim, JongHwa Kim, SangIn Park, "Real-time emotion recognition technology using individualization process," Korean Journal of the Science of Emotion and Sensibility, Vol. 15, No. 1, pp. 133-140, 2012.
  6. Jae Hun Bang, Sungyoung Lee, "Call Speech Emotion Recognition for Emotion based Services," Journal of KIISE, Software and Applications, Vol. 41, No. 3, pp. 208-213, 2014.
  7. Jae Hun Bang, Sungyoung Lee, Taechung Jung, "Speech Emotion Recognition Framework on Smartphone Environment," Proc. of the 39th KIPS, Vol. 20, Issue 1, pp. 254-256, 2013.
  8. A. Klautau (2005, Nov. 22), "The MFCC," [Online]. Available: http://www.cic.unb.br/-lamar/te073/Aulas/mfcc.pdf (downloaded 2012, Nov. 10)
  9. Wikipedia (2015, Jan. 7), "k-means clustering," [Online]. Available: http://en.wikipedia.org/wiki/K-means_clustering (downloaded 2012, Nov. 10)