Estimation and Weighting of Sub-band Reliability for Multi-band Speech Recognition  

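조훈영 (Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology)
지상문 (School of Information Science, Kyungsung University)
오영환 (Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology)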
Abstract
Multi-band speech recognition, motivated by Fletcher's model of human speech recognition (HSR), has recently been studied intensively. As an automatic speech recognition (ASR) technique, it splits the frequency domain into several sub-bands and recognizes each sub-band independently. The sub-band likelihood scores are then weighted according to the reliability of each sub-band and recombined to make a final decision. This approach is known to be robust in noisy environments. When the noise is stationary, the sub-band SNR can be estimated from noise information in non-speech intervals. When the noise is non-stationary, however, estimating the sub-band SNR is not feasible. This paper proposes inverse sub-band distance (ISD) weighting, in which a distance for each sub-band is calculated by stochastic matching of the input feature vectors against hidden Markov models, and the inverse of that distance is used as the sub-band weight. Experiments with 1500-1800 Hz band-limited white noise and with classical guitar sound showed that the proposed method represents sub-band reliability effectively and improves recognition performance under both stationary and non-stationary band-limited noise.
Keywords
Multi-band speech recognition; Sub-band reliability; Sub-band distance; Non-stationary noise; Band-limited noise;
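
The recombination step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the sub-band distance or of the recombination rule, so the code below makes several assumptions: the per-band distances are taken as given inputs, the weights are normalized to sum to one, and the final score is a weighted sum of per-band log-likelihoods. The function names and all numbers are hypothetical and chosen only for illustration.

```python
def isd_weights(distances):
    """Map per-band distances to weights proportional to 1/distance.

    Normalizing the weights to sum to 1 is an assumption of this sketch.
    """
    eps = 1e-12  # guard against division by zero (assumption)
    inverse = [1.0 / max(d, eps) for d in distances]
    total = sum(inverse)
    return [v / total for v in inverse]


def recombine(band_log_likelihoods, weights):
    """Weighted sum of per-band log-likelihoods for each candidate model.

    band_log_likelihoods[b][m] is the score of model m in sub-band b.
    """
    n_models = len(band_log_likelihoods[0])
    return [
        sum(w * band[m] for w, band in zip(weights, band_log_likelihoods))
        for m in range(n_models)
    ]


def recognize(band_log_likelihoods, distances):
    """Return the index of the best model, plus the weights and scores."""
    weights = isd_weights(distances)
    scores = recombine(band_log_likelihoods, weights)
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, weights, scores


# Hypothetical scores: three sub-bands, two candidate models.
band_ll = [
    [-10.0, -12.0],  # band 0 (clean) prefers model 0
    [-11.0, -13.0],  # band 1 (clean) prefers model 0
    [-30.0, -14.0],  # band 2 (corrupted) prefers model 1
]
distances = [1.0, 1.0, 8.0]  # hypothetical: band 2 matches the models poorly

best, weights, scores = recognize(band_ll, distances)

# Baseline for comparison: equal weights across bands.
equal = recombine(band_ll, [1.0 / 3.0] * 3)
equal_best = max(range(len(equal)), key=equal.__getitem__)
```

With these made-up numbers, the corrupted band dominates the equal-weight decision, while the inverse-distance weights reduce its influence so that the two clean bands decide the outcome.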