[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.7776/ASK.2010.29.1.082

Spectrum Based Excitation Extraction for HMM Based Speech Synthesis System

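Lee, Bong-Jin (Digital Signal Processing Laboratory, Yonsei University)
Kim, Seong-Woo (Digital Signal Processing Laboratory, Yonsei University)
Baek, Soon-Ho (Digital Signal Processing Laboratory, Yonsei University)
Kim, Jong-Jin (Speech Processing Research Team, Electronics and Telecommunications Research Institute)
Kang, Hong-Goo (Digital Signal Processing Laboratory, Yonsei University)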

Publication Information

The Journal of the Acoustical Society of Korea / v.29, no.1, 2010, pp. 82-90

Abstract

This paper proposes an efficient method for enhancing the quality of synthesized speech in an HMM-based speech synthesis system. The proposed method trains a Gaussian mixture model on spectral parameters and excitation signals, and estimates appropriate excitation signals from the spectral parameters during the synthesis stage. Both WB-PESQ and MUSHRA results show that the proposed method provides better speech quality than a conventional HMM-based speech synthesis system.
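
The abstract does not spell out how the excitation is estimated from the spectrum. As a hedged illustration only, the sketch below shows one common way to map one feature vector to another with a Gaussian mixture model: fit a joint GMM over z = [x; y] and take the conditional mean E[y | x]. Here x stands for a spectral parameter vector and y for some parametric description of the excitation; that pairing, the function name `gmm_conditional_mean`, and the toy numbers are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np


def gmm_conditional_mean(x, weights, means, covs, dx):
    """Return E[y | x] under a joint GMM over z = [x; y].

    x       : (dx,) observed vector (assumed: spectral parameters)
    weights : (K,) mixture weights
    means   : (K, dx + dy) joint means
    covs    : (K, dx + dy, dx + dy) joint full covariances
    dx      : dimensionality of x
    """
    x = np.asarray(x, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    covs = np.asarray(covs, dtype=float)

    log_resp = np.empty(len(weights))
    cond_means = []
    for k in range(len(weights)):
        mu_x, mu_y = means[k, :dx], means[k, dx:]
        cov_xx = covs[k, :dx, :dx]
        cov_yx = covs[k, dx:, :dx]
        diff = x - mu_x
        solved = np.linalg.solve(cov_xx, diff)  # cov_xx^{-1} (x - mu_x)
        _, logdet = np.linalg.slogdet(cov_xx)
        # log of w_k * N(x; mu_x, cov_xx), the unnormalised posterior of component k
        log_resp[k] = np.log(weights[k]) - 0.5 * (
            diff @ solved + logdet + dx * np.log(2.0 * np.pi)
        )
        # Per-component linear regression of y on x
        cond_means.append(mu_y + cov_yx @ solved)

    # Normalise posteriors in a numerically stable way
    resp = np.exp(log_resp - log_resp.max())
    resp /= resp.sum()
    return resp @ np.array(cond_means)


if __name__ == "__main__":
    # Toy joint model with made-up parameters: 1-D "spectrum", 1-D "excitation".
    w = [0.5, 0.5]
    mu = [[-1.0, 0.0], [1.0, 2.0]]
    sigma = [[[1.0, 0.3], [0.3, 1.0]], [[1.0, -0.2], [-0.2, 1.0]]]
    print(gmm_conditional_mean([0.4], w, mu, sigma, dx=1))
```

In a real system the joint model would be fitted on paired training frames, and the synthesis stage would call the estimator once per generated spectral frame; both steps are outside what the abstract describes.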

Keywords

Speech Synthesis; Gaussian Mixture Model; Excitation Signal

Reference

1	H. Zen, T. Toda, "An Overview of Nitech HMM-based Speech Synthesis System for Blizzard Challenge 2005," in Proc. INTERSPEECH 2005, pp. 93-96, 2005.
2	H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, T. Kitamura, "Hidden semi-Markov model based speech synthesis," in Proc. ICSLP, pp. 1185-1180, 2004.
3	K. Tokuda, T. Kobayashi, T. Masuko, S. Imai, "Mel-generalized cepstral analysis - a unified approach to speech spectral estimation," in Proc. of ICASSP, pp. 1043-1046, 1994.
4	S. Imai, "Cepstral analysis synthesis on the mel frequency scale," in Proc. of ICASSP '83, pp. 93-96, 1983.
5	J. S. Garofolo, L. F. Lamel, W. M. Fisher, J. G. Fiscus, D. S. Pallett, N. L. Dahlgren, "The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CDROM," Linguistic Data Consortium, 1993.
6	ITU-R Recommendation BS.1534-1, Method for the Subjective Assessment of Intermediate Sound Quality (MUSHRA), International Telecommunication Union, Geneva, Switzerland, 2001.
7	T. Toda, K. Tokuda, "Speech parameter generation algorithm considering global variance for HMM-based speech synthesis," in Proc. of Interspeech, pp. 2801-2804, 2005.
8	T. Kobayashi, S. Imai, "Spectral analysis using generalized cepstrum," IEEE Trans. Acoust., Speech, Signal Processing, vol. ASSP-32, pp. 1087-1089, 1984.
9	S. Lemmetty, Review of Speech Synthesis Technology, M.S. thesis, Helsinki Univ. Technol., Helsinki, Finland, 1999.
10	T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, T. Kitamura, "Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis," in Proc. of Eurospeech, pp. 2350-2374, 1999.
11	H. W. Strube, "Linear prediction on a warped frequency scale," J. Acoust. Soc. America, vol. 68, no. 4, pp. 1071-1076, 1980.
12	ITU-T Q.9/12, Proposed modification to draft P.862 to allow PESQ to be used for quality assessment of wideband speech, 2004.
13	K. Park, H. S. Kim, "Narrowband to wideband conversion of speech using GMM based transformation," in Proc. of ICASSP, pp. 1843-1846, 2000.

Korean title (translated): A Method for Improving the Speech Quality of an HMM-Based Speech Synthesizer through Spectrum-Based Excitation Signal Extraction
