http://dx.doi.org/10.7776/ASK.2013.32.3.252

Noise Robust Speech Recognition Based on Parallel Model Combination Adaptation Using Frequency-Variant  

Choi, Sook-Nam (Department of Information and Communication, Yeongnam University)
Chung, Hyun-Yeol (Department of Information and Communication, Yeongnam University)
Abstract
A typical speech recognition system performs well in a quiet environment, but its performance declines sharply in real environments where noise is present. To implement a speech recognizer that is robust across different acoustic conditions, this study proposes Parallel Model Combination adaptation using frequency-variant based on environment-awareness (FV-PMC). The method uses the variation in frequency to acquire information about the acoustic environment and applies it to update the recognition model, thereby improving recognition performance. FV-PMC performs recognition with a model generated as follows: i) the average frequency variant among pre-classified noise groups is calculated in advance and set as a threshold value; ii) when speech containing unknown noise is input, the frequency variant among the noise groups is recalculated; iii) speech whose value exceeds the threshold of a given group is regarded as containing the noise of that group; and iv) the speech that includes this noise group is used to generate the model. When noises were classified with the proposed FV-PMC, the average classification accuracy was 56%. In the speech recognition experiments, the average recognition rate was 79.05% for Set A, 79.43% for Set B, and 83.37% for Set C. The overall mean recognition rate was 80.62%, an improvement of 5.69 percentage points over the 74.93% obtained by the existing Parallel Model Combination with a clean model, which indicates that the proposed method is effective.
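The threshold-based group selection in steps i) to iii) can be illustrated with a minimal sketch. The abstract does not define how the frequency variant is computed, so the code below substitutes a hypothetical stand-in (the variance, across frequency bins, of the frame-averaged log-magnitude spectrum). The group names, the `tilted` test-signal helper, the tie-breaking rule, and the fallback to a clean model are likewise assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def frequency_variant(spectra):
    """Hypothetical stand-in for the paper's frequency variant.

    spectra: 2-D array (frames x frequency bins) of magnitude spectra.
    Returns the variance across bins of the frame-averaged log spectrum.
    """
    log_mean = np.log(np.abs(spectra) + 1e-10).mean(axis=0)
    return float(log_mean.var())


def train_thresholds(groups):
    """Step i: average frequency variant per pre-classified noise group."""
    return {
        name: float(np.mean([frequency_variant(s) for s in samples]))
        for name, samples in groups.items()
    }


def select_noise_group(spectra, thresholds, fallback="clean"):
    """Steps ii-iii: pick a group whose threshold the input exceeds.

    Assumption: if several thresholds are exceeded, the group with the
    highest exceeded threshold wins; if none is exceeded, fall back to
    the clean model.
    """
    fv = frequency_variant(spectra)
    exceeded = {name: t for name, t in thresholds.items() if fv > t}
    if not exceeded:
        return fallback
    return max(exceeded, key=exceeded.get)


def tilted(scale, frames=4, bins=8):
    """Synthetic magnitude spectrum whose log spectrum is scale * bin index."""
    return np.exp(scale * np.tile(np.arange(bins, dtype=float), (frames, 1)))


# Hypothetical noise groups built from synthetic spectra.
thresholds = train_thresholds({
    "group_a": [tilted(0.5)],
    "group_b": [tilted(1.0)],
})
print(select_noise_group(tilted(0.8), thresholds))
print(select_noise_group(tilted(1.2), thresholds))
print(select_noise_group(tilted(0.0), thresholds))
```

In step iv), the selected group would then determine which noise model is combined with the clean speech model; that combination step is not sketched here because the abstract gives no details of it.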
Keywords
Parallel model combination; Gaussian mixture model; Frequency-variant; Environment-awareness; Noise model; FV-PMC;