ARMA Filtering of Speech Features Using Energy Based Weights

Autoregressive Moving Average Filtering of Speech Features Using Energy-Based Weights (translated from the Korean title)

  • Ban, Sung-Min (School of Electrical Engineering, Pusan National University)
  • Kim, Hyung-Soon (School of Electrical Engineering, Pusan National University)
  • Received : 2012.01.03
  • Accepted : 2012.02.06
  • Published : 2012.02.29

Abstract

This paper proposes a robust feature compensation method for handling environmental mismatch. The method applies energy-based weights, set according to the degree of speech presence, to Mean subtraction, Variance normalization, and ARMA filtering (MVA) processing. The weights are further smoothed by moving average and maximum filters. The algorithm is evaluated on the AURORA 2 task and on a distant-talking experiment using a robot platform. Compared with MVA processing, it reduces the error rate by 14.4% on the AURORA 2 task and by 44.9% in the distant-talking experiment.
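
The abstract does not give the formulas, so the following is only a minimal sketch of how such a pipeline might look, not the authors' implementation. It assumes per-frame log energy is min-max normalized into weights in [0, 1], that the weights are smoothed by a moving average followed by a maximum filter, and that they enter only the mean and variance statistics. The ARMA step uses the commonly used MVA form, y_t = (y_{t-M} + ... + y_{t-1} + x_t + ... + x_{t+M}) / (2M + 1), left unweighted here because the abstract does not say how the weights enter it. The function names, the window length `smooth`, and the filter order `order` are hypothetical.

```python
import numpy as np

def energy_weights(log_energy, smooth=5):
    """Map per-frame log energy to [0, 1] weights, then smooth them.

    Assumption: min-max normalization stands in for the paper's
    (unspecified) speech-presence measure. `smooth` must be odd.
    """
    e = np.asarray(log_energy, dtype=float)
    span = e.max() - e.min()
    w = (e - e.min()) / span if span > 0 else np.ones_like(e)
    pad = smooth // 2
    # Moving-average filter over a window of `smooth` frames.
    padded = np.pad(w, pad, mode="edge")
    w_ma = np.convolve(padded, np.ones(smooth) / smooth, mode="valid")
    # Maximum filter over the same window length.
    padded = np.pad(w_ma, pad, mode="edge")
    return np.array([padded[t:t + smooth].max() for t in range(len(w_ma))])

def weighted_mva(features, weights, order=2):
    """Weighted mean/variance normalization followed by ARMA smoothing.

    features: array of shape (T, D); weights: array of shape (T,).
    Assumption: weights affect only the mean and variance estimates.
    """
    x = np.asarray(features, dtype=float)
    w = np.asarray(weights, dtype=float)[:, None]
    mean = (w * x).sum(axis=0) / w.sum()
    var = (w * (x - mean) ** 2).sum(axis=0) / w.sum()
    z = (x - mean) / np.sqrt(var + 1e-8)
    # ARMA filter: average of `order` past outputs and the current
    # plus `order` future inputs. Edge frames are left unfiltered.
    y = z.copy()
    for t in range(order, len(z) - order):
        past = y[t - order:t].sum(axis=0)
        future = z[t:t + order + 1].sum(axis=0)
        y[t] = (past + future) / (2 * order + 1)
    return y
```

With uniform weights the statistics reduce to the ordinary utterance mean and variance, so this sketch falls back to plain MVA processing in that case.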
