A Modified Viterbi Algorithm for Word Boundary Detection Error Compensation

단어 경계 검출 오류 보정을 위한 수정된 비터비 알고리즘

  • Published: 2007.03.31

Abstract

In this paper, we propose a modified Viterbi algorithm that compensates for endpoint detection errors during the decoding phase of an isolated word recognition task. Because the conventional Viterbi algorithm explores only the search space whose boundaries are fixed to the endpoints produced by the endpoint detector, recognition performance depends heavily on the accuracy of endpoint detection: inaccurately segmented word boundaries lead directly to recognition errors. To mitigate this degradation, we describe an unconstrained search over word boundaries and present an algorithm that explores the resulting search space efficiently. The proposed algorithm was evaluated on an isolated word recognition task under a variety of simulated endpoint detection error cases. It reduced the word error rate (WER) from 84.4% to 10.6% while requiring only slightly more computation.
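
The abstract does not spell out the search procedure, so the following is only a minimal sketch of the general idea of relaxing boundary constraints, not the authors' algorithm. It assumes a left-to-right HMM with precomputed per-frame log emission scores `log_b`, and it scores frames outside the hypothesized word with a hypothetical background log-likelihood `log_bg` so that hypotheses with different boundaries remain comparable. All names and numbers are illustrative assumptions.

```python
import math

NEG_INF = float("-inf")


def viterbi_fixed(log_b, log_self, log_next):
    """Conventional Viterbi for a left-to-right HMM.

    The path must be in state 0 at the first frame and in the last
    state at the final frame, i.e. the word boundaries are fixed.
    """
    T, N = len(log_b), len(log_b[0])
    score = [NEG_INF] * N
    score[0] = log_b[0][0]
    for t in range(1, T):
        new = [NEG_INF] * N
        for j in range(N):
            stay = score[j] + log_self
            move = score[j - 1] + log_next if j > 0 else NEG_INF
            new[j] = max(stay, move) + log_b[t][j]
        score = new
    return score[N - 1]


def viterbi_free(log_b, log_bg, log_self, log_next):
    """Viterbi with free word boundaries (illustrative assumption).

    The word may start at any frame and end at any later frame.
    Frames outside the word are scored with log_bg.
    Returns (best_score, start_frame, end_frame).
    """
    T, N = len(log_b), len(log_b[0])
    # prefix[t] = background score of frames 0 .. t-1
    prefix = [0.0]
    for x in log_bg:
        prefix.append(prefix[-1] + x)
    total_bg = prefix[T]

    score = [NEG_INF] * N
    start = [-1] * N          # start frame of the best path into each state
    best = (NEG_INF, -1, -1)
    for t in range(T):
        new, new_start = [NEG_INF] * N, [-1] * N
        for j in range(N):
            cands = [(score[j] + log_self, start[j])]
            if j > 0:
                cands.append((score[j - 1] + log_next, start[j - 1]))
            else:
                # Enter the word at frame t; earlier frames are background.
                cands.append((prefix[t], t))
            s, st = max(cands)
            if s > NEG_INF:
                new[j], new_start[j] = s + log_b[t][j], st
        score, start = new, new_start
        # Leave the word after frame t; later frames are background.
        final = score[N - 1] + (total_bg - prefix[t + 1])
        if final > best[0]:
            best = (final, start[N - 1], t)
    return best


# Toy utterance: 6 frames, 2-state word occupying frames 2 and 3 only.
LOG_HALF = math.log(0.5)
log_b = [[-5.0, -5.0], [-5.0, -5.0], [-0.1, -5.0],
         [-5.0, -0.1], [-5.0, -5.0], [-5.0, -5.0]]
log_bg = [-0.1, -0.1, -5.0, -5.0, -0.1, -0.1]

score, start, end = viterbi_free(log_b, log_bg, LOG_HALF, LOG_HALF)
print(start, end)
```

In this toy case the free-boundary search recovers the word's true extent even though the "detected" segment spans all six frames, while `viterbi_fixed` on the full segment is forced to absorb the noise frames into the word model. Both routines run in time proportional to the number of frames times the number of states, which is one plausible reading of the abstract's claim that the extra computation is small.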
