Morpheme-based Korean broadcast news transcription

Park Young-Hee; Ahn Dong-Hoon; Chung Minhwa

Proceedings of the KSPS conference (대한음성학회:학술대회논문집)

2002.11a / Pages 123-126 / 2002

The Korean Society Of Phonetic Sciences And Speech Technology (대한음성학회)

Korean title: 형태소 기반의 한국어 방송뉴스 인식 (Morpheme-based Korean broadcast news recognition)

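Park Young-Hee (Department of Computer Science, Sogang University);
Ahn Dong-Hoon (Department of Computer Science, Sogang University);
Chung Minhwa (Department of Computer Science, Sogang University)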

Published: 2002.11.01

Abstract

In this paper, we describe our LVCSR (large-vocabulary continuous speech recognition) system for Korean broadcast news transcription. The main focus is to find the most suitable morpheme-based lexical model for Korean broadcast news recognition, one that can cope with the inflectional flexibility of Korean. There are trade-offs between lexicon size and lexical coverage, and between the length of the lexical unit and word error rate (WER). We analyzed the training corpus to obtain a small, 24k-entry morpheme-based lexicon with 98.8% coverage. The lexicon is then optimized by combining morphemes, using statistics from the training corpus, under either a monosyllable constraint or a maximum-length constraint. In experiments, our system reduced the share of monosyllable morphemes in the lexicon from 52% to 29% and obtained a WER of 13.24% for anchor speech and 24.97% for reporter speech.
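To make the two quantities in the abstract concrete, the following is a minimal Python sketch of lexical coverage and of selecting adjacent morpheme pairs to combine. It is an illustration under stated assumptions, not the authors' procedure: the abstract does not define the constraints precisely, so here the "monosyllable constraint" is read as "at least one member of the pair is a single syllable" and the "maximum-length constraint" as "the combined unit has at most `max_len` syllables". The function names, the `min_count` threshold, and the toy corpus are all hypothetical. Syllables are counted with `len()`, which assumes precomposed Hangul (one character per syllable block).

```python
from collections import Counter


def coverage(lexicon, tokens):
    """Fraction of running tokens that are present in the lexicon."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in lexicon) / len(tokens)


def top_k_lexicon(tokens, k):
    """Hypothetical lexicon choice: the k most frequent morphemes."""
    return {m for m, _ in Counter(tokens).most_common(k)}


def merge_candidates(sentences, min_count=2, monosyllable_only=True, max_len=None):
    """Return adjacent morpheme pairs that qualify for combination.

    Assumed reading of the constraints (not taken from the paper):
      - monosyllable_only: at least one morpheme in the pair is one syllable
      - max_len: the combined unit has at most max_len syllables
    Each sentence is a list of morphemes; len() counts syllables only for
    precomposed Hangul strings.
    """
    pair_counts = Counter()
    for sent in sentences:
        pair_counts.update(zip(sent, sent[1:]))

    selected = []
    for (left, right), count in pair_counts.items():
        if count < min_count:
            continue
        if monosyllable_only and len(left) != 1 and len(right) != 1:
            continue
        if max_len is not None and len(left) + len(right) > max_len:
            continue
        selected.append((left, right, count))
    # Most frequent pairs first; ties keep first-seen order (stable sort).
    return sorted(selected, key=lambda item: -item[2])


# Toy, made-up morpheme sequences used only to exercise the functions.
sentences = [
    ["학교", "에", "가", "다"],
    ["학교", "에", "서"],
    ["집", "에", "가", "다"],
]
tokens = [m for sent in sentences for m in sent]

cov = coverage({"학교", "에"}, tokens)
pairs = merge_candidates(sentences)
short_pairs = merge_candidates(sentences, max_len=2)
```

On this toy corpus, tightening `max_len` removes the longer candidate ("학교" + "에"), which mirrors the trade-off the abstract names: longer units mean fewer monosyllable entries but a larger lexicon.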
