OOV update module for a Korean broadcast news recognition system

Korean broadcast news transcription system with an out-of-vocabulary (OOV) update module

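  • 정의정 (Speech and Language Team, Electronics and Telecommunications Research Institute)
  • 윤승 (Speech and Language Team, Electronics and Telecommunications Research Institute)
  • Published: 2002.07.01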

Abstract

We implemented a Korean broadcast news transcription system that is robust to out-of-vocabulary (OOV) words and tested its performance. OOV words in the input speech are inevitable in large vocabulary continuous speech recognition (LVCSR): the known vocabulary can never be complete because of, for instance, neologisms, proper names, and, in some languages, compounds. An LVCSR system with a fixed vocabulary and language model is directly exposed to these OOV words. Our broadcast news recognition system therefore includes an offline OOV update module for the language model and vocabulary, and uses a morpheme-based recognition unit (the so-called pseudo-morpheme) for OOV robustness.
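
The idea of an offline vocabulary update can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `oov_rate` and `update_vocabulary`, the `min_count` threshold, and the toy English tokens are all assumptions made for illustration. The sketch only measures the OOV rate of new text against a fixed vocabulary and then adds frequently seen unknown tokens; it leaves out language model re-estimation and pseudo-morpheme segmentation.

```python
from collections import Counter


def oov_rate(tokens, vocab):
    """Fraction of tokens that are not in the vocabulary."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t not in vocab) / len(tokens)


def update_vocabulary(vocab, new_tokens, min_count=2):
    """Return a new vocabulary extended with unknown tokens seen at least
    min_count times in new_tokens (hypothetical selection rule)."""
    counts = Counter(t for t in new_tokens if t not in vocab)
    added = {t for t, c in counts.items() if c >= min_count}
    return vocab | added


# Toy data (assumed, not from the paper).
vocab = {"news", "today", "the", "president"}
recent_text = "the president met neologism today neologism".split()

before = oov_rate(recent_text, vocab)
updated = update_vocabulary(vocab, recent_text)
after = oov_rate(recent_text, updated)
```

In this toy run the repeated unknown token is added to the vocabulary while the token seen only once stays out, so the OOV rate on the same text drops. A real system would also have to update the language model so that the new entries receive probabilities.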

Keywords