
A Modified Viterbi Algorithm for Word Boundary Detection Error Compensation  

Chung, Hoon (ETRI)
Chung, Ik-Joo (Kangwon National University)
Abstract
In this paper, we propose a modified Viterbi algorithm that compensates for endpoint detection errors during the decoding phase of an isolated word recognition task. Because the conventional Viterbi algorithm explores only the search space whose boundaries are fixed to the endpoints that the endpoint detector assigns to the segmented utterance, recognition performance depends heavily on the accuracy of endpoint detection: inaccurately segmented word boundaries lead directly to recognition errors. To mitigate this degradation, we describe an unconstrained search over word boundaries and present an algorithm that explores the resulting search space efficiently. The proposed algorithm was evaluated on an isolated word recognition task under a variety of simulated endpoint detection error cases. It reduced the word error rate (WER) considerably, from 84.4% to 10.6%, at the cost of only a small increase in computation.
Keywords
Noise-robust recognition; Word boundary detection; Viterbi decoding
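
The contrast the abstract draws between fixed and unconstrained word boundaries can be illustrated with a small sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes one common way to relax the boundaries: a left-to-right word model is wrapped in a leading and a trailing background state, so the decoder itself chooses where the word starts and ends. The function names, the background log-likelihoods `bg_ll`, and the omission of transition probabilities are all illustrative assumptions.

```python
NEG_INF = float("-inf")


def viterbi_left_to_right(emissions, start_states, end_states):
    """Best log score of a left-to-right path through `emissions`.

    emissions[t][s] is the log-likelihood of frame t in state s.
    Each step either stays in a state or advances by exactly one state.
    Transition probabilities are omitted (assumption, for brevity).
    """
    n_states = len(emissions[0])
    prev = [emissions[0][s] if s in start_states else NEG_INF
            for s in range(n_states)]
    for frame in emissions[1:]:
        cur = []
        for s in range(n_states):
            best_prev = prev[s] if s == 0 else max(prev[s], prev[s - 1])
            cur.append(best_prev + frame[s])
        prev = cur
    return max(prev[s] for s in end_states)


def fixed_boundary_score(word_ll):
    """Conventional decoding: the word must span the whole segment."""
    last = len(word_ll[0]) - 1
    return viterbi_left_to_right(word_ll, {0}, {last})


def relaxed_boundary_score(word_ll, bg_ll):
    """Hypothetical relaxed decoding: frames before and after the word
    may be absorbed by a background state instead of the word model."""
    emissions = [[bg] + list(row) + [bg] for row, bg in zip(word_ll, bg_ll)]
    n = len(emissions[0])
    # Start in the leading background or the first word state;
    # end in the last word state or the trailing background.
    return viterbi_left_to_right(emissions, {0, 1}, {n - 2, n - 1})
```

In this sketch the relaxed search adds only two states per word model, so its cost grows only slightly with the number of frames. It covers segments that contain extra non-speech frames. A segment that clips part of the word would additionally need the decoder to see audio beyond the detected endpoints.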