
Method for Spectral Enhancement by Binary Mask for Speech Recognition Enhancement Under Noise Environment  

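Choi, Gab-Keun (Department of Computer Engineering, Graduate School, Kwangwoon University)
Kim, Soon-Hyob (Department of Computer Engineering, Graduate School, Kwangwoon University)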
Abstract
The main obstacle to the practical use of speech recognition is distortion caused by ambient and channel noise. Ambient noise in particular degrades recognition performance and restricts where a recognizer can be used. Speech recognition based on DSR (Distributed Speech Recognition) has the same problem. Various noise-cancelling algorithms have been applied to address it, but at low SNR, inaccurate noise estimation leads to spectral loss and residual noise, which lower the recognition rate. This paper proposes a speech enhancement method that uses MMSE-STSA for noise cancelling and an ideal binary mask to compensate for the damaged spectrum. In experiments in noisy environments (SNR from 15 dB down to 0 dB), the proposed method produced better spectral results and recognition performance.
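The abstract does not say how the mask is computed or applied. For orientation only, the sketch below shows the textbook definition of an ideal binary mask: a time-frequency unit is kept when its local SNR exceeds a threshold. The function name, the `lc_db` threshold, and the toy values are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Return 1.0 where the local SNR (dB) exceeds lc_db, else 0.0.

    Hypothetical helper: inputs are power spectrograms of the same shape
    (frequency bins x frames). Computing an *ideal* mask requires the
    speech and noise to be known separately.
    """
    eps = np.finfo(float).eps  # guards against division by zero and log(0)
    local_snr_db = 10.0 * np.log10((speech_power + eps) / (noise_power + eps))
    return (local_snr_db > lc_db).astype(float)

# Toy 2 x 3 power spectrograms (rows: frequency bins, columns: frames).
speech = np.array([[4.0, 0.1, 9.0],
                   [0.2, 5.0, 0.5]])
noise = np.ones_like(speech)

mask = ideal_binary_mask(speech, noise)
# Keep only the speech-dominated units of the noisy mixture.
masked = mask * (speech + noise)
```

With the default threshold of 0 dB, a unit is retained exactly when the speech power exceeds the noise power in that unit.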
Keywords
Spectrum Enhancement; Noisy Speech Recognition
References
1 ETSI standard document, Speech Processing, Transmission and Quality aspects (STQ); Distributed speech recognition; Front-end feature extraction algorithm; Compression algorithms, ETSI ES 201 108 v1.1.1 (2000-02), Feb. 2000.
2 R. Flynn and E. Jones, "Robust distributed speech recognition using speech enhancement," IEEE Transactions on Consumer Electronics, vol. 54, no. 3, pp. 1267-1273, Aug. 2008.
3 Y. Ephraim and D. Malah, "Speech enhancement using a minimum mean square error short-time spectral amplitude estimator," IEEE Trans. Acoust., Speech, Signal Process., vol. 32, pp. 1109-1121, 1984.
4 A. S. Bregman, Auditory Scene Analysis. Cambridge, MA: MIT Press, 1990.
5 N. Roman, D. L. Wang, and G. J. Brown, "Speech segregation based on sound localization," Journal of the Acoustical Society of America, vol. 114, no. 4, pp. 2236-2252, 2003.
6 R. Lyon, "A computational model of filtering, detection, and compression in the cochlea," in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '82), vol. 7, pp. 1282-1285, 1982.
7 A. Varga and H. J. M. Steeneken, "Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems," Speech Communication, vol. 12, no. 3, pp. 247-251, July 1993.