http://dx.doi.org/10.13064/KSSS.2014.6.2.029

Noise Robust Speech Recognition Based on Noisy Speech Acoustic Model Adaptation  

Chung, Yongjoo (Keimyung University)
Publication Information
Phonetics and Speech Sciences / v.6, no.2, 2014, pp. 29-34
Abstract
In Vector Taylor Series (VTS)-based noisy speech recognition methods, Hidden Markov Models (HMMs) are usually trained with clean speech. However, better performance is expected when the HMMs are trained with noisy speech. In a previous study, we found that Minimum Mean Square Error (MMSE) estimation of the training noisy speech in the log-spectrum domain produces improved recognition results; however, because that algorithm operated in the log-spectrum domain, it could not be used for HMM adaptation. In this paper, we modify the previous algorithm to derive a novel mathematical relation between the test and training noisy speech in the cepstrum domain, and the mean and covariance of the noisy speech HMM trained by Multi-condition TRaining (MTR) are adapted accordingly. In noisy speech recognition experiments on the Aurora 2 database, the proposed method achieved a 10.6% relative improvement in Word Error Rate (WER) over the MTR method, whereas the previous MMSE estimation of the training noisy speech achieved a 4.3% relative improvement, which shows the superiority of the proposed method.
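The relative improvement figures quoted above follow the usual convention of expressing the WER reduction as a fraction of the baseline WER. The sketch below illustrates that arithmetic only. The function name `relative_wer_improvement` and the WER values are hypothetical and are not taken from the paper, which does not report absolute WERs in this abstract.

```python
def relative_wer_improvement(baseline_wer: float, new_wer: float) -> float:
    """Return the relative WER improvement in percent.

    Both arguments are word error rates in percent. The result is the
    WER reduction expressed as a percentage of the baseline WER.
    """
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (baseline_wer - new_wer) / baseline_wer


# Hypothetical values for illustration only (not results from the paper):
# a baseline WER of 20.0% reduced to 18.0%.
example = relative_wer_improvement(20.0, 18.0)
print(f"relative improvement: {example:.1f}%")
```

Under this convention, a relative improvement is always smaller in absolute terms than it may first appear: a 10% relative gain on a 20% baseline WER corresponds to only 2 percentage points.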
Keywords
noisy speech recognition; model adaptation; VTS; HMM