
A Single-End-Point DTW Algorithm for Keyword Spotting  

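최용선 (Department of BioSystems and Brain Science Research Center, Korea Advanced Institute of Science and Technology)
오상훈 (Information, Communication and Radio Engineering, Mokwon University)
이수영 (Department of BioSystems and Brain Science Research Center, Korea Advanced Institute of Science and Technology)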
Abstract
To implement real-time hardware for keyword spotting, we propose a Single-End-Point DTW (SEP-DTW) algorithm that is simple and computationally inexpensive. The SEP-DTW algorithm needs only a single end point, which enables efficient applications, and it requires a small amount of computation because the global search area is divided into successive local search areas. We also adopt new local constraints and a new distance measure to improve the performance of the SEP-DTW algorithm. In addition, we normalize the feature vectors so that they have the same variance in each frequency bin and each frame has the same energy level. To construct several reference patterns for each keyword, we apply a clustering algorithm to all training patterns and take the mean vector of each cluster as a reference pattern. To detect a keyword in an input speech stream, we measure the distances between the reference patterns and the input pattern and decide whether the distances are smaller than a predefined threshold. Isolated speech recognition and keyword spotting experiments verify that the proposed algorithm performs better than other methods.
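The detection step described above (compare the input pattern with several reference patterns and test the distance against a threshold) can be illustrated with a minimal sketch. It uses ordinary dynamic time warping with fixed start and end points and a Euclidean frame distance, not the SEP-DTW search, local constraints, or distance measure of the paper, which the abstract does not specify. The function names, the length normalization, and the threshold value are assumptions made only for illustration.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two feature frames (assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(ref, inp):
    """Classic DTW cost between two frame sequences, both end points fixed.

    This is standard DTW, not SEP-DTW. The result is divided by the total
    number of frames so that patterns of different lengths are comparable
    (an assumed normalization).
    """
    n, m = len(ref), len(inp)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(ref[i - 1], inp[j - 1])
            # Symmetric local steps: vertical, horizontal, diagonal.
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m] / (n + m)


def detect_keyword(references, inp, threshold):
    """Return (detected, best_distance) over all reference patterns."""
    best = min(dtw_distance(ref, inp) for ref in references)
    return best <= threshold, best


# Toy one-dimensional "features"; real frames would be spectral vectors.
references = [[[0.0], [1.0], [2.0]]]
stretched = [[0.0], [0.0], [1.0], [2.0]]   # same shape, slower start
unrelated = [[5.0], [5.0], [5.0]]

hit, hit_dist = detect_keyword(references, stretched, threshold=0.5)
miss, miss_dist = detect_keyword(references, unrelated, threshold=0.5)
```

In this sketch the time-stretched input aligns to the reference at zero cost and is accepted, while the unrelated input exceeds the threshold and is rejected. With several reference patterns per keyword, as in the clustering step above, the smallest distance decides.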
Keywords
DTW