Optimum MVF Estimation-Based Two-Band Excitation for HMM-Based Speech Synthesis

  • Han, Seung-Ho (Department of Information and Communications Engineering, KAIST) ;
  • Jeong, Sang-Bae (Department of Electronics Engineering (ERI), Gyeongsang National University) ;
  • Hahn, Min-Soo (Department of Information and Communications Engineering, KAIST)
  • Received : 2009.03.16
  • Accepted : 2009.05.06
  • Published : 2009.08.30

Abstract

A two-band excitation method based on optimum maximum voiced frequency (MVF) estimation is presented for hidden Markov model (HMM)-based speech synthesis. The MVF is estimated with an analysis-by-synthesis scheme that selects the value yielding the minimum spectral distortion of the synthesized speech. Experimental results show that the proposed method significantly improves synthetic speech quality.
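The analysis-by-synthesis idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the candidate grid, the distortion measure, or the excitation details, so the function names (`two_band_spectrum`, `log_spectral_distortion`, `estimate_mvf`), the toy spectral-domain excitation model, the 500 Hz candidate step, and the RMS log-spectral distortion are all assumptions chosen for illustration. For each candidate MVF, the sketch builds a two-band spectrum (harmonics of F0 below the MVF, noise-like energy above it), compares it with the target spectrum, and keeps the candidate with the smallest distortion.

```python
import math


def two_band_spectrum(envelope, f0, mvf, fs):
    """Toy magnitude spectrum of a two-band excitation shaped by `envelope`.

    Assumption: bins at or below `mvf` carry energy only at harmonics of `f0`
    (pulse-train band); bins above `mvf` carry flat, noise-like energy.
    """
    n_bins = len(envelope)
    hz_per_bin = (fs / 2) / (n_bins - 1)
    n_harmonics = int((fs / 2) // f0)
    harmonic_bins = {round(h * f0 / hz_per_bin) for h in range(1, n_harmonics + 1)}
    spectrum = []
    for k, env in enumerate(envelope):
        if k * hz_per_bin <= mvf:
            spectrum.append(env if k in harmonic_bins else 0.0)
        else:
            spectrum.append(env)
    return spectrum


def log_spectral_distortion(ref, test, floor=1e-3):
    """RMS log-spectral difference in dB; `floor` avoids log(0) (assumed value)."""
    diffs = [20 * math.log10(max(r, floor) / max(t, floor)) for r, t in zip(ref, test)]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


def estimate_mvf(target, envelope, f0, fs, candidates):
    """Analysis-by-synthesis search: synthesize each candidate, keep the best match."""
    return min(
        candidates,
        key=lambda mvf: log_spectral_distortion(
            target, two_band_spectrum(envelope, f0, mvf, fs)
        ),
    )


# Hypothetical frame: 16 kHz sampling, 257 bins, F0 = 125 Hz, true MVF = 3 kHz.
fs, f0 = 16000, 125.0
envelope = [1.0 / (1.0 + k / 64.0) for k in range(257)]
target = two_band_spectrum(envelope, f0, 3000.0, fs)
print(estimate_mvf(target, envelope, f0, fs, list(range(0, 8001, 500))))  # 3000
```

In this toy setting the target is generated by the same model, so the search recovers the true MVF exactly. With real speech, the target would be the analyzed spectrum of a voiced frame and the minimum distortion would generally be nonzero.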

Cited by

  1. The Deterministic Plus Stochastic Model of the Residual Signal and Its Applications vol.20, pp.3, 2012, https://doi.org/10.1109/tasl.2011.2169787
  2. MVF Prediction Method Based on Harmonic Selection for Improving the Performance of an HMM-Based Speech Synthesizer Using the TBE Model vol.4, pp.4, 2009, https://doi.org/10.13064/ksss.2012.4.4.079
  3. Vocal Removal From Multiobject Audio Using Harmonic Information for Karaoke Service vol.21, pp.4, 2013, https://doi.org/10.1109/tasl.2012.2234116
  4. A Study on a Vocal Signal Removal Method for SAOC Using Harmonic Information vol.16, pp.10, 2009, https://doi.org/10.9717/kmms.2013.16.10.1171
  5. Enhanced Maximum Voiced Frequency Estimation Scheme for HTS Using Two-Band Excitation Model vol.37, pp.6, 2015, https://doi.org/10.4218/etrij.15.0115.0124
  6. Hard component detection of transient noise and its removal using empirical mode decomposition and wavelet‐based predictive filter vol.12, pp.7, 2009, https://doi.org/10.1049/iet-spr.2017.0167