Acoustic Modeling and Energy-Based Postprocessing for Automatic Speech Segmentation

  • Published: 2002.06.01

Abstract

Phoneme-level speech segmentation is important for corpus-based text-to-speech synthesis. In this paper, we examine acoustic modeling methods for improving the performance of an automatic speech segmentation system based on the Hidden Markov Model (HMM). We compare monophone and triphone models and evaluate several model training approaches. In addition, we employ an energy-based postprocessing scheme to correct frequent boundary location errors between silence and speech sounds. Experimental results show that our system places 71.3% and 84.2% of boundaries correctly within tolerances of 10 ms and 20 ms, respectively.
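
The two ideas in the abstract that lend themselves to a small illustration are the tolerance-based accuracy figure and the energy-based boundary correction. The abstract does not specify either procedure, so the Python sketch below is only one plausible reading. It assumes accuracy is the fraction of automatic boundaries whose absolute deviation from a manually labeled reference is within the tolerance, and that a silence-to-speech boundary is moved to the nearest frame where short-time log energy rises through a threshold. The function names, the threshold, and the search window are hypothetical and are not taken from the paper.

```python
import math


def boundary_accuracy(auto_ms, ref_ms, tolerance_ms):
    """Fraction of automatic boundaries within tolerance_ms of the reference.

    Assumption: boundaries are paired one-to-one and given in milliseconds.
    """
    if len(auto_ms) != len(ref_ms) or not ref_ms:
        raise ValueError("boundary lists must be non-empty and equal in length")
    hits = sum(abs(a - r) <= tolerance_ms for a, r in zip(auto_ms, ref_ms))
    return hits / len(ref_ms)


def frame_log_energy(samples, frame_len, hop):
    """Short-time log energy (dB) for each full frame of the signal."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        # Small floor avoids log10(0) on digital silence.
        energies.append(10.0 * math.log10(power + 1e-10))
    return energies


def refine_silence_to_speech(energies, boundary, window, threshold_db):
    """Move a silence-to-speech boundary to the nearest upward threshold crossing.

    Hypothetical rule: within +/- window frames of the HMM boundary, look for
    frames where energy rises from below threshold_db to at or above it, and
    pick the crossing closest to the original boundary. If there is no
    crossing in the window, the original boundary is kept.
    """
    lo = max(1, boundary - window)
    hi = min(len(energies) - 1, boundary + window)
    crossings = [
        i for i in range(lo, hi + 1)
        if energies[i - 1] < threshold_db <= energies[i]
    ]
    if not crossings:
        return boundary
    return min(crossings, key=lambda i: abs(i - boundary))


# Toy signal: 400 samples of silence followed by 400 samples of constant level.
samples = [0.0] * 400 + [0.5] * 400
energies = frame_log_energy(samples, frame_len=80, hop=80)
refined = refine_silence_to_speech(energies, boundary=3, window=3, threshold_db=-40.0)
accuracy_10ms = boundary_accuracy([100, 205, 330], [100, 200, 300], tolerance_ms=10)
```

In this toy case the HMM boundary at frame 3 sits inside the silence, and the refinement moves it to frame 5, where the energy first exceeds the threshold. A real system would need a threshold tuned to the recording conditions and separate handling of speech-to-silence boundaries.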

Keywords