[KSCI] Korea Science Citation Index Service

Study on the Improvement of Speech Recognizer by Using Time Scale Modification

Ki-Seung Lee (Department of Electronic Engineering, College of Information and Communication, Konkuk University)

Publication Information

The Journal of the Acoustical Society of Korea / v.23, no.6, 2004, pp. 462-472

Abstract

This paper proposes a method for compensating for the performance degradation of automatic speech recognition (ASR) that is mainly caused by speaking rate variation. First, the performance of an HMM-based ASR system is analyzed quantitatively as a function of speaking rate. This analysis shows that significant performance degradation often occurs for rapidly spoken speech. A quantitative measure that represents speaking rate is then introduced. Time scale modification (TSM) is employed to compensate for the speaking rate difference between input speech signals and training speech signals. Finally, a method for compensating for the performance degradation caused by speaking rate variation is proposed, in which TSM is applied selectively according to speaking rate. ASR experiments on 10-digit mobile phone numbers confirm that the error rate was reduced by 15.5% when the proposed method was applied to speech signals with a high speaking rate.
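The selective use of TSM described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract specifies neither the speaking-rate measure nor the TSM algorithm, so the rate is taken here as an externally supplied number (for example, syllables per second), the function names, the `threshold` value, and the frame sizes are hypothetical, and a plain windowed overlap-add stretch stands in for whichever TSM algorithm the paper actually uses.

```python
import numpy as np


def ola_time_stretch(signal, factor, frame_len=400, synth_hop=100):
    """Lengthen (factor > 1) or shorten (factor < 1) a signal by plain overlap-add.

    This is a deliberately simple stand-in for a real TSM algorithm; it does
    not align frames, so it can introduce audible artifacts.
    """
    signal = np.asarray(signal, dtype=float)
    if factor <= 0:
        raise ValueError("factor must be positive")
    if len(signal) < frame_len:
        return signal.copy()

    # Frames are read every analysis_hop samples and written every synth_hop.
    analysis_hop = synth_hop / factor
    n_frames = int((len(signal) - frame_len) / analysis_hop) + 1
    window = np.hanning(frame_len)

    out_len = (n_frames - 1) * synth_hop + frame_len
    out = np.zeros(out_len)
    norm = np.zeros(out_len)
    for i in range(n_frames):
        a = int(round(i * analysis_hop))  # read position in the input
        s = i * synth_hop                 # write position in the output
        out[s:s + frame_len] += window * signal[a:a + frame_len]
        norm[s:s + frame_len] += window

    norm[norm < 1e-8] = 1.0  # avoid division by zero at the edges
    return out / norm


def selectively_slow_down(signal, rate, reference_rate, threshold=1.2):
    """Apply TSM only when the input is clearly faster than the training speech.

    rate and reference_rate are hypothetical speaking-rate values for the
    input and the training data; threshold is an assumed cutoff.
    """
    ratio = rate / reference_rate
    if ratio <= threshold:
        return np.asarray(signal, dtype=float)  # normal or slow speech: unchanged
    return ola_time_stretch(signal, ratio)      # fast speech: stretch toward the reference
```

In this sketch, an utterance whose rate is 1.25 times the reference is stretched to roughly 1.25 times its original length before being passed to the recognizer, while utterances at or below the threshold are left untouched.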

Keywords

Automatic speech recognition; Time scale modification; Speaking rate

