
Study on the Improvement of Speech Recognizer by Using Time Scale Modification  

Ki-Seung Lee (Department of Electronic Engineering, College of Information and Communication, Konkuk University)
Abstract
In this paper, a method is proposed for compensating for the performance degradation of automatic speech recognition (ASR) that is mainly caused by speaking rate variation. Before the method is presented, the performance of an HMM-based ASR system is first analyzed quantitatively as a function of speaking rate. This analysis showed that significant performance degradation often occurs for rapidly spoken speech signals. A quantitative measure that represents speaking rate is then introduced. Time scale modification (TSM) is employed to compensate for the speaking rate difference between input speech signals and training speech signals. Finally, a method for compensating for the performance degradation caused by speaking rate variation is proposed, in which TSM is applied selectively according to speaking rate. ASR experiments on 10-digit mobile phone numbers confirmed that the error rate was reduced by 15.5% when the proposed method was applied to speech signals with a high speaking rate.
Keywords
Automatic speech recognition; Time scale modification; Speaking rate
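
The selective use of TSM described in the abstract might be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: the abstract does not specify the speaking rate measure, the decision threshold, or the TSM algorithm, so `speaking_rate`, `rate_threshold`, `target_rate`, and both function names are hypothetical, and a plain windowed overlap-add (OLA) stretch stands in for whatever TSM method the paper actually uses.

```python
import numpy as np


def ola_time_stretch(x, stretch, frame_len=400, synth_hop=100):
    """Plain overlap-add TSM (an assumed stand-in for the paper's TSM).

    The output is roughly `stretch` times as long as `x`; stretch > 1
    slows the speech down.
    """
    x = np.asarray(x, dtype=float)
    if len(x) < frame_len:
        return x.copy()
    # Frames are read every `analysis_hop` samples and written every
    # `synth_hop` samples, which changes the duration by `stretch`.
    analysis_hop = synth_hop / stretch
    window = np.hanning(frame_len)
    n_frames = int((len(x) - frame_len) / analysis_hop) + 1
    out = np.zeros((n_frames - 1) * synth_hop + frame_len)
    norm = np.zeros_like(out)
    for i in range(n_frames):
        a = int(i * analysis_hop)  # read position in the input
        s = i * synth_hop          # write position in the output
        out[s:s + frame_len] += window * x[a:a + frame_len]
        norm[s:s + frame_len] += window
    # Divide by the accumulated window to keep the amplitude level.
    return out / np.maximum(norm, 1e-8)


def selective_tsm(x, speaking_rate, rate_threshold, target_rate):
    """Apply TSM only when the utterance is judged to be fast.

    All three rate arguments are hypothetical and share one unit
    (for example, syllables per second).
    """
    if speaking_rate <= rate_threshold:
        return np.asarray(x, dtype=float)  # normal rate: leave unchanged
    return ola_time_stretch(x, stretch=speaking_rate / target_rate)
```

For example, with a hypothetical threshold of 5.0 and a target rate of 4.0, an utterance measured at 6.0 would be lengthened by a factor of about 1.5 before recognition, while one measured at 4.0 would be passed through unchanged. Plain OLA does not align frames by waveform similarity, so it can introduce audible artifacts on voiced speech; it is used here only to keep the sketch short.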