http://dx.doi.org/10.4218/etrij.11.0110.0489

An Adaptive Utterance Verification Framework Using Minimum Verification Error Training  

Shin, Sung-Hwan (Center for Signal and Image Processing, Georgia Institute of Technology)
Jung, Ho-Young (Software Research Laboratory, ETRI)
Juang, Biing-Hwang (Center for Signal and Image Processing, Georgia Institute of Technology)
Publication Information
ETRI Journal / v.33, no.3, 2011, pp. 423-433
Abstract
This paper introduces an adaptive, integrated utterance verification (UV) framework that uses minimum verification error (MVE) training and is intended for real applications. UV is traditionally considered an add-on procedure to automatic speech recognition (ASR) and is therefore treated separately from the design of the ASR system model. This two-stage approach often fails to cope with a wide range of variations, such as a new speaker or a new environment that does not match the speaker population or acoustic environment on which the ASR system was trained. We propose an integrated solution to enhance overall UV system performance under such conditions. The integration is accomplished by adapting the target model for UV and merging it with the acoustic model for ASR, based on the common MVE principle, at each iteration in the recognition stage. The proposed iterative procedure for UV model adaptation also revises the data segmentation and the decoded hypotheses. Under this new framework, both recognition performance and verification performance improve markedly.
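The abstract does not state the objective function, so the following is only a minimal sketch of the general MVE idea it refers to, under stated assumptions: each decoded token has a log-likelihood ratio (target model versus an alternative model), a token is accepted when that ratio reaches a threshold, and the hard accept/reject errors are smoothed with a sigmoid so they can be minimized by gradient methods. The names `mve_loss`, `threshold`, and `slope` are hypothetical and are not taken from the paper.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def mve_loss(llr_scores, is_target, threshold=0.0, slope=1.0):
    """Smoothed verification error (assumed form, for illustration only).

    llr_scores: log-likelihood ratios, log p(X | target) - log p(X | alternative).
    is_target:  True if the token is correctly decoded, False otherwise.
    Returns the smoothed false-rejection rate plus the smoothed
    false-acceptance rate.
    """
    false_reject = 0.0
    false_accept = 0.0
    n_target = 0
    n_other = 0
    for llr, target in zip(llr_scores, is_target):
        if target:
            # A correct token is wrongly rejected when its score falls below the threshold.
            false_reject += sigmoid(slope * (threshold - llr))
            n_target += 1
        else:
            # An incorrect token is wrongly accepted when its score exceeds the threshold.
            false_accept += sigmoid(slope * (llr - threshold))
            n_other += 1
    fr = false_reject / n_target if n_target else 0.0
    fa = false_accept / n_other if n_other else 0.0
    return fr + fa


if __name__ == "__main__":
    # Well-separated scores give a small loss; swapped scores give a large one.
    print(mve_loss([5.0, -5.0], [True, False]))
    print(mve_loss([-5.0, 5.0], [True, False]))
```

In an adaptive setting like the one the abstract describes, a loss of this kind would be re-evaluated after each revision of the segmentation and hypotheses; how the paper actually weights the two error types or updates the models is not given here.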
Keywords
Utterance verification; minimum verification error (MVE) training; adaptive framework