
An Automatic Post-processing Method for Speech Recognition using CRFs and TBL  

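Seon, Choong-Nyoung (Department of Computer Science and Engineering, Sogang University)
Jeong, Hyoung-Il (Department of Computer Science and Engineering, Sogang University)
Seo, Jung-Yun (Department of Computer Science and Engineering, Sogang University)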
Abstract
In applications of human speech interfaces, reducing the recognition error rate is one of the main research issues. Many previous studies attempted to correct errors through post-processing that depends on manually constructed corpora and correction patterns. We propose an automatically learnable post-processing method that is independent of the characteristics of both the domain and the speech recognizer. We divide the post-processing task into two steps: error detection and error correction. We treat error detection as a classification problem and apply a conditional random fields (CRFs) classifier to it. We then apply transformation-based learning (TBL) to the error correction step. Our experimental results indicate that the proposed method reduces a speech recognizer's insertion, deletion, and substitution errors by 25.85%, 3.57%, and 7.42%, respectively.
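To make the correction step concrete, the following is a minimal sketch of Brill-style transformation-based learning applied to recognizer output. It is not the authors' implementation. It assumes that the detection step (the CRF classifier in the paper) has already produced a 0/1 error flag per token, that hypothesis and reference are aligned one-to-one (so only substitution errors are covered), and that a rule has the hypothetical template "rewrite flagged word `wrong` to `right` when the previous word is `prev_word`". The function names and the toy corpus are illustrative only.

```python
def apply_rule(tokens, flags, rule):
    """Rewrite flagged tokens equal to `wrong` to `right` when preceded by `prev_word`."""
    prev_word, wrong, right = rule
    out = list(tokens)
    for i, tok in enumerate(tokens):
        prev = tokens[i - 1] if i > 0 else "<s>"
        if flags[i] and tok == wrong and prev == prev_word:
            out[i] = right
    return out


def learn_rules(corpus, min_gain=1):
    """Greedy TBL loop over (hypothesis, error_flags, reference) triples.

    Each round proposes rules from the remaining flagged mismatches, keeps the
    rule with the highest net gain (tokens fixed minus tokens broken), and
    applies it to the working copy before the next round.
    """
    current = [list(hyp) for hyp, _, _ in corpus]
    rules = []
    while True:
        candidates = set()
        for cur, (_, flags, ref) in zip(current, corpus):
            for i, (tok, gold) in enumerate(zip(cur, ref)):
                if flags[i] and tok != gold:
                    prev = cur[i - 1] if i > 0 else "<s>"
                    candidates.add((prev, tok, gold))
        if not candidates:
            break

        def net_gain(rule):
            total = 0
            for cur, (_, flags, ref) in zip(current, corpus):
                new = apply_rule(cur, flags, rule)
                total += sum((n == g) - (o == g) for o, n, g in zip(cur, new, ref))
            return total

        # Sorting first makes tie-breaking deterministic.
        best = max(sorted(candidates), key=net_gain)
        if net_gain(best) < min_gain:
            break
        rules.append(best)
        current = [apply_rule(cur, flags, best)
                   for cur, (_, flags, _) in zip(current, corpus)]
    return rules


def correct(tokens, flags, rules):
    """Apply the learned rules in the order they were learned."""
    for rule in rules:
        tokens = apply_rule(tokens, flags, rule)
    return tokens


# Toy training data (hypothetical): hypothesis, detector flags, reference.
corpus = [
    (["i", "want", "too", "go"], [0, 0, 1, 0], ["i", "want", "to", "go"]),
    (["want", "too", "eat"], [0, 1, 0], ["want", "to", "eat"]),
    (["me", "too"], [0, 0], ["me", "too"]),
]
rules = learn_rules(corpus)
fixed = correct(["they", "want", "too", "sing"], [0, 0, 1, 0], rules)
```

Restricting rules to flagged positions is the design choice that ties the two steps together: the detector limits where transformations may fire, so a rule learned from errors cannot rewrite tokens the detector considers correct. Insertion and deletion errors would need an alignment with empty placeholders, which this sketch omits.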
Keywords
Speech Recognition; Post-processing for Speech Recognition; Conditional Random Fields; Transformation-Based Learning;