A Single-End-Point DTW Algorithm for Keyword Spotting


대한전자공학회논문지SP (Journal of the Institute of Electronics Engineers of Korea SP)

Vol. 41, No. 3, pp. 209-219, 2004, pISSN 1229-6384

대한전자공학회 (The Institute of Electronics and Information Engineers)

Korean title: 핵심어 검출을 위한 단일 끝점 DTW알고리즘

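Yong-Sun Choi (최용선; Department of BioSystems and Brain Science Research Center, Korea Advanced Institute of Science and Technology);
Sang-Hoon Oh (오상훈; Information, Communication and Radio Engineering, Mokwon University);
Soo-Young Lee (이수영; Department of BioSystems and Brain Science Research Center, Korea Advanced Institute of Science and Technology)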

Published: 2004-05-01


Abstract

To implement keyword spotting in real-time hardware, we propose a Single-End-Point DTW (SEP-DTW) algorithm with a simple structure and low computational cost. Whereas conventional DTW requires both end points, SEP-DTW needs only one, which makes it convenient to use. Its computational cost is very small because the global path is formed by a succession of local searches. To improve the performance of SEP-DTW, we adopt new slope weights for the local constraints and a new distance measure. We also normalize the feature vectors so that the data have the same standard deviation in every dimension (frequency bin) and every frame has the same energy. To construct several reference patterns for each keyword, we apply a clustering algorithm to the training patterns and take the mean pattern of each cluster as a reference pattern. A keyword is detected when the distance between a reference pattern and the feature vectors of the input speech is smaller than a predefined threshold. In isolated-word recognition and keyword spotting experiments, the proposed algorithm performed better than the other methods compared.
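The abstract does not give the slope weights, the distance measure, or the local step pattern, so the following is only a rough sketch of the general idea, not the authors' SEP-DTW. It assumes a Euclidean frame distance, unweighted steps in which each reference frame consumes zero, one, or two input frames, and reference patterns that have already been built (for example as cluster means). All function names, the `dim_std` parameter, and the step set are hypothetical choices made for illustration.

```python
import math

def normalize_frames(frames, dim_std):
    """Divide each dimension by its standard deviation (dim_std is assumed
    to be precomputed from training data), then scale each frame to unit energy."""
    out = []
    for frame in frames:
        scaled = [x / s for x, s in zip(frame, dim_std)]
        energy = math.sqrt(sum(x * x for x in scaled))
        out.append([x / energy for x in scaled] if energy > 0 else scaled)
    return out

def frame_distance(a, b):
    """Euclidean distance between two frames (an assumed measure,
    not the distance measure proposed in the paper)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def sep_dtw_distance(reference, stream, end):
    """Trace a path backwards from the single known end point stream[end].
    Each step moves one frame back in the reference and picks the locally
    cheapest of three input positions, so the global path is a chain of
    local searches. Returns the average frame distance along the path."""
    i, j = len(reference) - 1, end
    total = frame_distance(reference[i], stream[j])
    while i > 0:
        i -= 1
        # Assumed step set: back one input frame, stay, or back two frames.
        choices = [k for k in (j - 1, j, j - 2) if k >= 0]
        j = min(choices, key=lambda k: frame_distance(reference[i], stream[k]))
        total += frame_distance(reference[i], stream[j])
    return total / len(reference)

def spot_keyword(references, stream, threshold):
    """Return every input frame index at which some reference pattern
    ends with an average distance below the threshold."""
    hits = []
    for end in range(len(stream)):
        best = min(sep_dtw_distance(ref, stream, end) for ref in references)
        if best < threshold:
            hits.append(end)
    return hits

if __name__ == "__main__":
    reference = [[1.0, 0.0], [0.0, 1.0]]
    stream = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
    print(spot_keyword([reference], stream, threshold=0.1))
```

Because the path is chosen greedily, the work per candidate end point grows with the reference length only, instead of with the area of a full DTW grid. The price is that a locally cheapest step can miss the globally best path, which is presumably where the paper's slope weights and distance measure matter.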

Keywords

DTW

