Speech Quality of a Sinusoidal Model Depending on the Number of Sinusoids

  • Seo, Jeong-Wook (School of Electronic and Electrical Eng., Kyungpook National Univ.) ;
  • Kim, Ki-Hong (LG Electronics Co.) ;
  • Seok, Jong-Won (ETRI, Broadcasting Technology Department, Radio & Broadcasting Technology Lab.) ;
  • Bae, Keun-Sung (School of Electronic and Electrical Eng., Kyungpook National Univ.)
  • Published : 2000.03.01

Abstract

Sinusoidal Transform Coding (STC) is a vocoding technique that uses a sinusoidal speech model to obtain high-quality speech at a low data rate. It models and synthesizes the speech signal in the frequency domain from the fundamental frequency and its harmonic components. To reduce the data rate, the sinusoidal amplitudes and phases must be represented with as few spectral peaks as possible while maintaining speech quality. As basic research toward a low-rate speech coding algorithm based on the sinusoidal model, this paper investigates how speech quality depends on the number of sinusoids. Speech signals are reconstructed while varying the number of spectral peaks from 5 to 40, and their quality is evaluated using a spectral envelope distortion measure and the Mean Opinion Score (MOS). Two approaches are used to obtain the spectral peaks: a conventional Short-Time Fourier Transform (STFT) and a multiresolution analysis method.
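The idea of reconstructing a frame from a limited number of spectral peaks can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a single frame, a rectangular window, a naive DFT, and test sinusoids that fall exactly on DFT bins, and it leaves out overlap-add synthesis and the multiresolution analysis. The function names `dft`, `pick_peaks`, and `synthesize` are hypothetical and chosen only for illustration.

```python
import cmath
import math


def dft(frame):
    """Naive DFT of a real frame; returns bins 0..n/2 (one-sided spectrum)."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def pick_peaks(spectrum, num_peaks):
    """Return the bin indices of the num_peaks largest local maxima."""
    mags = [abs(c) for c in spectrum]
    # Local maxima of the magnitude spectrum, excluding DC and Nyquist bins.
    candidates = [
        k for k in range(1, len(mags) - 1)
        if mags[k] > mags[k - 1] and mags[k] >= mags[k + 1]
    ]
    candidates.sort(key=lambda k: mags[k], reverse=True)
    return sorted(candidates[:num_peaks])


def synthesize(spectrum, peaks, n):
    """Rebuild the frame as a sum of sinusoids, one per retained peak."""
    out = [0.0] * n
    for k in peaks:
        amp = 2.0 * abs(spectrum[k]) / n   # amplitude of the real sinusoid
        phase = cmath.phase(spectrum[k])   # phase measured at the peak
        for t in range(n):
            out[t] += amp * math.cos(2 * math.pi * k * t / n + phase)
    return out


# Illustrative frame: two sinusoids centred on bins 5 and 12 (assumed values).
n = 64
frame = [
    math.cos(2 * math.pi * 5 * t / n) + 0.5 * math.cos(2 * math.pi * 12 * t / n + 0.3)
    for t in range(n)
]
spectrum = dft(frame)
for num_peaks in (1, 2):
    peaks = pick_peaks(spectrum, num_peaks)
    rebuilt = synthesize(spectrum, peaks, n)
    max_err = max(abs(a - b) for a, b in zip(frame, rebuilt))
    print(num_peaks, peaks, max_err)
```

In this toy case the reconstruction error drops once both components are retained. With real speech the peaks do not fall on bin centres, so the error decreases more gradually as the number of sinusoids grows, which is the trade-off the paper evaluates.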

Keywords