Readability Enhancement of English Speech Recognition Output Using Automatic Capitalisation Classification

자동 대소문자 식별을 이용한 영어 음성인식 결과의 가독성 향상

  • Published : 2007.03.30

Abstract

A modified speech recogniser have been proposed for automatic capitalisation generation to improve the readability of English speech recognition output. In this modified speech recogniser, every word in its vocabulary is duplicated: once in a de-caplitalised form and again in the capitalised forms. In addition its language model is re-trained on mixed case texts. In order to evaluate the performance of the proposed system, experiments of automatic capitalisation generation were performed for 3 hours of Broadcast News(BN) test data using the modified HTK BN transcription system. The proposed system produced an F-measure of 0.7317 for automatic capitalisation generation with an SER of 48.55, a precision of 0.7736 and a recall of 0.6942.

Keywords