Korean broadcast news transcription system with out-of-vocabulary (OOV) update module

OOV update module for a Korean broadcast news recognition system

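  • 정의정 (Speech and Language Team, Electronics and Telecommunications Research Institute)
  • 윤승 (Speech and Language Team, Electronics and Telecommunications Research Institute)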
  • Published : 2002.07.01

Abstract

We implemented a Korean broadcast news transcription system that is robust to out-of-vocabulary (OOV) words and tested its performance. OOV words in the input speech are inevitable in large-vocabulary continuous speech recognition (LVCSR). The known vocabulary can never be complete because of, for instance, neologisms, proper names, and, in some languages, compounds. The fixed vocabulary and language model of an LVCSR system are directly confronted with these OOV words. Our broadcast news recognition system therefore includes an offline OOV update module for the language model and vocabulary to address the OOV problem, and it uses a morpheme-based recognition unit (the so-called pseudo-morpheme) for OOV robustness.
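
As a rough illustration of what an offline vocabulary update involves, the sketch below measures the OOV rate of recent text against a fixed vocabulary and then adds OOV tokens that recur often enough. It is not the authors' module: the function names, the `min_count` threshold, and the toy English tokens are assumptions made for illustration, and the language-model re-estimation that the abstract also mentions is left out.

```python
from collections import Counter


def oov_rate(tokens, vocabulary):
    """Return the fraction of tokens that are not in the vocabulary."""
    if not tokens:
        return 0.0
    unknown = sum(1 for token in tokens if token not in vocabulary)
    return unknown / len(tokens)


def update_vocabulary(vocabulary, new_tokens, min_count=2):
    """Return a new vocabulary extended with frequent OOV tokens.

    min_count is a hypothetical threshold: an OOV token is added only
    if it occurs at least that many times in the new text.
    """
    counts = Counter(t for t in new_tokens if t not in vocabulary)
    additions = {t for t, c in counts.items() if c >= min_count}
    return set(vocabulary) | additions


# Toy data standing in for a fixed vocabulary and newly collected news text.
vocabulary = {"the", "news", "today", "is"}
recent_text = "the euro news today is the euro summit".split()

before = oov_rate(recent_text, vocabulary)
updated = update_vocabulary(vocabulary, recent_text, min_count=2)
after = oov_rate(recent_text, updated)
```

In a morpheme-based setup such as the one the abstract describes, the tokens would be recognition units produced by a morphological analysis step rather than whitespace-separated words; that step is assumed to happen before these functions are called.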

Keywords