http://dx.doi.org/10.7776/ASK.2008.27.4.191

A Noise Robust Speech Recognition Method Using Model Compensation Based on Speech Enhancement  

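Shen, Guang-Hu (Department of Information and Communication Engineering, Yeungnam University)
Jung, Ho-Youl (Department of Information and Communication Engineering, Yeungnam University)
Chung, Hyun-Yeol (Department of Information and Communication Engineering, Yeungnam University)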
Abstract
In this paper, we propose MWF-PMC, a noise processing method for speech recognition in noisy environments. It enhances the input speech with Mel-warped Wiener Filtering (MWF) at the pre-processing stage and compensates the recognition model with Parallel Model Combination (PMC) at the post-processing stage. PMC uses the residual noise extracted from the silence regions of the enhanced speech to compensate the clean speech model, so the method is expected to improve recognition performance in noisy environments. For the recognition experiments, we down-sampled the KLE PBW (Phoneme Balanced Words) 452-word speech data to 8 kHz and created noisy speech at five SNR levels (0 dB, 5 dB, 10 dB, 15 dB and 20 dB) by adding Subway, Car and Exhibition noise to the clean speech. The recognition results confirm the effectiveness of the proposed MWF-PMC method, which achieved better overall recognition performance than the existing combined methods.
Keywords
Speech recognition; Speech enhancement; Mel-warped Wiener filtering; Model compensation; PMC
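Two steps named in the abstract can be illustrated in a few lines: creating noisy speech at a fixed SNR, and combining a clean-speech model mean with a noise model mean. The following is a minimal sketch, not the authors' implementation. It assumes power-based SNR scaling over the whole utterance, and it uses the log-add approximation of PMC applied to means in the log-spectral domain with unit gain. The function names `power`, `mix_at_snr` and `log_add` are hypothetical, and the paper's actual procedure (for example, how the residual noise model is estimated) may differ.

```python
import math
import random


def power(x):
    """Mean power of a signal given as a list of floats."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals snr_db, then add.

    Returns (noisy_signal, scaled_noise). Assumes `noise` has non-zero power.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    # SNR(dB) = 10 * log10(P_speech / P_noise), solved for the noise power.
    target_noise_power = power(speech) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled)]
    return noisy, scaled


def log_add(speech_log_mean, noise_log_mean):
    """Log-add approximation (assumed here): add the means in the linear
    spectral domain and map the sum back to the log-spectral domain.
    Variances are left unchanged under this approximation."""
    return [math.log(math.exp(s) + math.exp(n))
            for s, n in zip(speech_log_mean, noise_log_mean)]


if __name__ == "__main__":
    random.seed(0)
    # Synthetic stand-ins for one second of 8 kHz clean speech and noise.
    speech = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
    noise = [random.gauss(0.0, 1.0) for _ in range(8000)]
    for snr_db in (0, 5, 10, 15, 20):
        noisy, scaled = mix_at_snr(speech, noise, snr_db)
        measured = 10 * math.log10(power(speech) / power(scaled))
        print(snr_db, round(measured, 6))
    # Compensated log-spectral means for a toy two-bin model.
    print(log_add([2.0, 1.0], [0.5, 1.5]))
```

In a full PMC system the model parameters are usually stored as cepstra, so the means would first be mapped to the log-spectral domain with an inverse DCT and mapped back afterwards. That step is omitted here to keep the sketch short.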