A Study on Performance Improvement Method for the Multi-Model Speech Recognition System in the DSR Environment  

Jang, Hyun-Baek (Heesung Electronics Co., Ltd.)
Chung, Yong-Joo (Department of Electronics Engineering, Keimyung University)
Publication Information
Journal of the Institute of Convergence Signal Processing / v.11, no.2, 2010, pp. 137-142
Abstract
Although the multi-model speech recognizer has been shown to be quite successful in noisy speech recognition, those results were obtained with general speech front-ends that do not incorporate noise adaptation techniques. In this paper, to evaluate the multi-model based speech recognizer more accurately, we adopt the AFE (Advanced Front-End), a noise-robust speech front-end proposed by ETSI for the noisy DSR (Distributed Speech Recognition) environment. For performance comparison, we use MTR, which is known to give good results in the DSR environment. We also modify the structure of the multi-model based speech recognizer to improve recognition performance. First, the N reference HMMs most similar to the input noisy speech are used together as the acoustic models for recognition, which makes the recognizer less sensitive to errors in reference HMM selection and to variability in the noise signal. Second, each reference HMM is trained on data at multiple SNR levels to improve the robustness of the acoustic models. Experiments on the Aurora 2 database show that the modified multi-model based speech recognizer achieves better recognition rates than the previous method.
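The N-best selection step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the similarity measure or how the N selected models are combined, so everything below is an assumption made for illustration: each reference HMM set is summarized by a hypothetical noise feature vector, similarity is plain Euclidean distance, and the final result is the highest-scoring hypothesis among the selected models. The names `select_n_best`, `recognize`, and `score` are hypothetical and do not come from the paper.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple


def select_n_best(noise_estimate: Sequence[float],
                  reference_noise: Dict[str, Sequence[float]],
                  n: int) -> List[str]:
    """Return the names of the n reference models closest to the noise estimate.

    Assumption: closeness is Euclidean distance between noise feature
    vectors; the paper may use a different similarity measure.
    """
    ranked = sorted(
        reference_noise,
        key=lambda name: math.dist(noise_estimate, reference_noise[name]),
    )
    return ranked[:n]


def recognize(features: Sequence[Sequence[float]],
              selected: List[str],
              score: Callable[[str, Sequence[Sequence[float]]], Tuple[str, float]],
              ) -> Tuple[str, float]:
    """Decode with every selected model and keep the best-scoring hypothesis.

    Assumption: `score` is a hypothetical decoder callback returning
    (hypothesis, log_likelihood) for one reference model.
    """
    return max((score(name, features) for name in selected),
               key=lambda result: result[1])
```

Using several candidates instead of the single closest model means that a wrong first choice in the selection step can still be recovered at decoding time, which is the motivation the abstract gives for using N reference HMMs.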
Keywords
Speech Recognition; Multi-model Speech Recognizer; Distributed Speech Recognition