Proceedings of the KSPS conference (대한음성학회:학술대회논문집)
- 2005.11a
- Pages 103-106
- 2005
A Corpus Selection Based Approach to Language Modeling for Large Vocabulary Continuous Speech Recognition
대용량 연속 음성 인식 시스템에서의 코퍼스 선별 방법에 의한 언어모델 설계
- Oh, Yoo-Rhee (Dept. of Information and Communications, Gwangju Institute of Science and Technology)
- Yoon, Jae-Sam (Dept. of Information and Communications, Gwangju Institute of Science and Technology)
- Kim, Hong-Kook (Dept. of Information and Communications, Gwangju Institute of Science and Technology)
- Published : 2005.11.17
Abstract
In this paper, we propose a language modeling approach to improve the performance of a large vocabulary continuous speech recognition system. The proposed approach is based on an active learning framework that selects a text corpus from the large amount of text data required for language modeling. Perplexity is used as the measure for corpus selection in the active learning. In recognition experiments on a continuous Korean speech task, a speech recognition system employing the language model built with the proposed approach reduces the word error rate by about 6.6%, with less computational complexity, compared with a system using a language model constructed from randomly selected texts.
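The idea of perplexity-driven corpus selection can be illustrated with a minimal sketch. The abstract does not state the language model order, the smoothing method, or whether high- or low-perplexity texts are preferred, so everything below is an assumption made for illustration: an add-one-smoothed unigram model trained on a small seed corpus, and selection of the pool sentences with the highest perplexity (those the current model explains worst). The function names train_unigram, perplexity, and select_by_perplexity are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Build an add-one-smoothed unigram model; return a log-probability function.

    Assumption: a unigram model stands in for whatever model the paper uses.
    """
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserves mass for unseen words

    def logprob(word):
        return math.log((counts[word] + 1) / (total + vocab))

    return logprob


def perplexity(sentence, logprob):
    """Per-word perplexity of a whitespace-tokenized sentence."""
    words = sentence.split()
    if not words:
        return float("inf")
    return math.exp(-sum(logprob(w) for w in words) / len(words))


def select_by_perplexity(seed, pool, k):
    """Pick the k pool sentences with the highest perplexity under the seed model.

    Assumption: high perplexity is treated as "most informative"; the paper's
    actual selection criterion may differ.
    """
    logprob = train_unigram(seed)
    candidates = [s for s in pool if s.split()]  # skip empty sentences
    ranked = sorted(candidates, key=lambda s: perplexity(s, logprob), reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    seed = ["a b a b"]
    pool = ["a b", "c d", "a c"]
    print(select_by_perplexity(seed, pool, 2))
```

In a full active learning loop, the selected sentences would be added to the training corpus, the model retrained, and the selection repeated; the sketch shows a single round only.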
Keywords