Feature Vector Processing for Speech Emotion Recognition in Noisy Environments

잡음 환경에서의 음성 감정 인식을 위한 특징 벡터 처리

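  • 박정식 (Department of Computer Science, Korea Advanced Institute of Science and Technology)
  • 오영환 (Department of Computer Science, Korea Advanced Institute of Science and Technology)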
  • Received : 2009.11.01
  • Accepted : 2010.01.16
  • Published : 2010.03

Abstract

This paper proposes an efficient feature vector processing technique that makes a Speech Emotion Recognition (SER) system robust against a variety of noises. In the proposed approach, emotional feature vectors are extracted from speech processed by comb filtering, and these vectors are then used to construct robust models based on feature vector classification. We modify conventional comb filtering by using speech presence probability to minimize the drawbacks caused by incorrect pitch estimation under background noise. The modified comb filtering can correctly enhance the harmonics, which are an important factor in SER. The feature vector classification technique categorizes feature vectors as either discriminative or non-discriminative based on a log-likelihood criterion. This method successfully selects the discriminative vectors while preserving correct emotional characteristics, so robust emotion models can be constructed using only these discriminative vectors. In SER experiments on an emotional speech corpus contaminated by various noises, our approach outperformed the baseline system.
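
The abstract does not state the exact log-likelihood criterion, so the following is only a minimal sketch of one plausible reading. It assumes each emotion is modeled by a single diagonal-covariance Gaussian and treats a training vector as discriminative when its log-likelihood under its own emotion model exceeds the best competing model by more than a margin. The function names, the Gaussian models, and the `margin` parameter are all hypothetical and are not taken from the paper.

```python
import numpy as np

def diag_gauss_loglik(x, mean, var):
    """Log-likelihood of each row of x under a diagonal-covariance Gaussian."""
    x = np.atleast_2d(x)
    return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var, axis=1)

def select_discriminative(vectors, label, models, margin=0.0):
    """Return a boolean mask marking the vectors assumed to be discriminative.

    vectors: (n, d) feature vectors that all carry the emotion index `label`.
    models:  list of (mean, var) pairs, one hypothetical model per emotion.
    margin:  assumed threshold on the log-likelihood difference.
    """
    # Score every vector against every emotion model: shape (n, n_emotions).
    scores = np.stack(
        [diag_gauss_loglik(vectors, m, v) for m, v in models], axis=1
    )
    own = scores[:, label]
    best_other = np.delete(scores, label, axis=1).max(axis=1)
    # Keep a vector only if its own model beats every competitor by the margin.
    return own - best_other > margin

# Toy example with two emotions in a 2-D feature space.
models = [
    (np.array([0.0, 0.0]), np.array([1.0, 1.0])),
    (np.array([4.0, 4.0]), np.array([1.0, 1.0])),
]
vectors = np.array([[0.0, 0.0], [4.0, 4.0], [2.0, 2.0]])
mask = select_discriminative(vectors, label=0, models=models)
```

In this toy setup only the first vector is kept for emotion 0: the second is closer to the competing model, and the third is equidistant from both, so neither would contribute to the retrained emotion model under this assumed rule.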

Keywords