• Title/Summary/Keyword: Merlin 툴킷

Search Result 1, Processing Time 0.012 seconds

Performance comparison of various deep neural network architectures using Merlin toolkit for a Korean TTS system (Merlin 툴킷을 이용한 한국어 TTS 시스템의 심층 신경망 구조 성능 비교)

  • Hong, Junyoung;Kwon, Chulhong
    • Phonetics and Speech Sciences
    • /
    • v.11 no.2
    • /
    • pp.57-64
    • /
    • 2019
  • In this paper, we construct a Korean text-to-speech system using the Merlin toolkit which is an open source system for speech synthesis. In the text-to-speech system, the HMM-based statistical parametric speech synthesis method is widely used, but it is known that the quality of synthesized speech is degraded due to limitations of the acoustic modeling scheme that includes context factors. In this paper, we propose an acoustic modeling architecture that uses deep neural network technique, which shows excellent performance in various fields. Fully connected deep feedforward neural network (DNN), recurrent neural network (RNN), gated recurrent unit (GRU), long short-term memory (LSTM), bidirectional LSTM (BLSTM) are included in the architecture. Experimental results have shown that the performance is improved by including sequence modeling in the architecture, and the architecture with LSTM or BLSTM shows the best performance. It has been also found that inclusion of delta and delta-delta components in the acoustic feature parameters is advantageous for performance improvement.