http://dx.doi.org/10.13064/KSSS.2014.6.4.127

HMM-based missing feature reconstruction for robust speech recognition in additive noise environments  

Cho, Ji-Won (Sogang University)
Park, Hyung-Min (Sogang University)
Publication Information
Phonetics and Speech Sciences, v.6, no.4, 2014, pp. 127-132
Abstract
This paper describes a robust speech recognition technique that reconstructs spectral components mismatched with the training environment. The conventional cluster-based reconstruction method compensates for unreliable components using the reliable components of the same spectral vector, under the assumption that training spectral vectors follow an independent, identically distributed Gaussian-mixture process. The presented method instead exploits the temporal dependency of speech by introducing a hidden-Markov-model prior whose internal state transitions are plausible for the observed sequence of spectral vectors. Experimental results indicate that the described method provides temporally consistent reconstruction and, on average, further improves recognition performance over the conventional method.
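The contrast between the two priors can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes diagonal-covariance Gaussians (so the estimate of an unreliable component under one cluster or state reduces to that cluster's mean), a boolean reliability mask that is already known, and no upper-bound constraint from the noisy observation. All function and parameter names are hypothetical.

```python
import numpy as np

def reliable_log_likelihood(X, mask, means, variances):
    """log p(reliable part of x_t | component k), diagonal Gaussians; shape (T, K)."""
    diff = X[:, None, :] - means[None, :, :]                      # (T, K, D)
    ll = -0.5 * (np.log(2 * np.pi * variances)[None] + diff ** 2 / variances[None])
    # Unreliable dimensions are marginalized out, i.e. they contribute nothing.
    return np.where(mask[:, None, :], ll, 0.0).sum(axis=2)

def gmm_posteriors(loglik, weights):
    """Frame-independent posteriors P(k | reliable x_t): the i.i.d. cluster prior."""
    log_post = loglik + np.log(weights)[None, :]
    post = np.exp(log_post - log_post.max(axis=1, keepdims=True))
    return post / post.sum(axis=1, keepdims=True)

def hmm_posteriors(loglik, initial, transition):
    """State posteriors P(s_t = k | all reliable data) via scaled forward-backward."""
    T, K = loglik.shape
    # Per-frame rescaling of the emissions cancels when gamma is normalized.
    b = np.exp(loglik - loglik.max(axis=1, keepdims=True))
    alpha = np.zeros((T, K))
    beta = np.ones((T, K))
    alpha[0] = initial * b[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ transition) * b[t]
        alpha[t] /= alpha[t].sum()
    for t in range(T - 2, -1, -1):
        beta[t] = transition @ (b[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    gamma = alpha * beta
    return gamma / gamma.sum(axis=1, keepdims=True)

def reconstruct(X, mask, posteriors, means):
    """Keep reliable components; replace unreliable ones by the posterior-weighted mean."""
    estimate = posteriors @ means                                 # (T, D)
    return np.where(mask, X, estimate)
```

With the i.i.d. prior, a frame whose components are all unreliable falls back to the global mixture mean. With the HMM prior, neighbouring reliable frames pull the estimate towards the state they support, which is the kind of temporal consistency the abstract refers to.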
Keywords
missing feature reconstruction; robust speech recognition; cluster-based reconstruction; hidden Markov model