Search | Korea Science

한우진;오영환
- The Journal of the Acoustical Society of Korea / v.21 no.6 / pp.510-521 / 2002
An MBE (multi-band excitation) coder can achieve high-quality synthetic speech below 4.0 kbps. There are, however, significant differences in fine structure between the original spectrum and the synthetic spectrum. They are mainly due to the exclusive partition of voiced and unvoiced regions in the frequency domain and to a decision procedure based on an experimental threshold. This paper proposes the MMBE (mixed multi-band excitation) speech model to overcome these drawbacks of the MBE coder. In addition, two analysis methods, which do not need any decision procedure based on a threshold, are presented. In the MMBE speech model, voiced and unvoiced components can be mixed over the entire frequency axis. To illustrate the potential of the proposed speech model, we develop a 2.6 kbps MMBE coder and compare it with a 2.9 kbps MBE coder using both objective and subjective methods. The results show that the proposed coder performs better than the MBE coder even at a lower bit rate.

Park, Man-Ho;Bae, Geon-Seong
- Journal of the Institute of Electronics Engineers of Korea SP / v.38 no.5 / pp.576-582 / 2001
The Multi-Band Excitation (MBE) speech coder uses a different approach to representing the excitation signal. It replaces the frame-based single voiced/unvoiced classification of a classical speech coder with a set of such decisions over harmonic intervals in the frequency domain. This enables each speech segment to be a mixture of voiced and unvoiced components, and improves the synthetic speech quality by reducing the decision errors that might occur in the frame-based single voiced/unvoiced decision process when the input speech is degraded by noise. The IMBE-LP, an improved version of MBE with linear prediction, represents the spectral information of the MBE model with linear prediction coefficients to obtain a low bit rate of 2.4 kbps. In this paper, we propose a variable-rate IMBE-LP vocoder that has a lower bit rate than IMBE-LP without degrading the synthetic speech quality. To determine the LP order, it uses the spectral band information of the MBE model, which is related to the characteristics of the input speech. Experimental results are given with our findings and discussions.
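
The contrast between a single frame-level decision and a set of per-band decisions can be illustrated with a minimal Python sketch. It is not the coder described in the abstract: the function names, the per-band "fit error" input, and the threshold value of 0.2 are all assumptions made for illustration.

```python
from typing import List, Sequence


def band_voicing_decisions(band_errors: Sequence[float],
                           threshold: float = 0.2) -> List[bool]:
    """Return one voiced (True) / unvoiced (False) flag per harmonic band.

    band_errors[k] is a hypothetical normalized error between the original
    spectrum and a purely voiced (harmonic) fit in band k. Both the error
    measure and the threshold are assumptions, not values from the paper.
    """
    return [err < threshold for err in band_errors]


def frame_voicing_decision(band_errors: Sequence[float],
                           threshold: float = 0.2) -> bool:
    """Classical alternative: one decision for the whole frame.

    Here the frame is called voiced when the mean band error is below the
    same (assumed) threshold.
    """
    return sum(band_errors) / len(band_errors) < threshold


# Low bands fit the harmonic model well, high bands do not.
errors = [0.05, 0.10, 0.60, 0.70]
per_band = band_voicing_decisions(errors)   # mixed voiced/unvoiced frame
whole_frame = frame_voicing_decision(errors)  # single flag for the frame
```

In this example the single frame-level flag marks the whole frame as unvoiced, while the per-band flags keep the two low bands voiced, which is the kind of mixture the MBE model allows.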

Park, Hyung-Woo;Bae, Myung-Jin
- The Journal of the Acoustical Society of Korea / v.29 no.2 / pp.148-153 / 2010
In speech signal processing, a speech signal corrupted by noise should be enhanced to improve its quality. Noise estimation methods usually need to be flexible enough to handle varying environments. The noise profile is updated in silence regions to avoid the effects of speech properties, so the voice region has to be found in a preprocessing step before noise estimation. However, if the received signal has no silence region, that method cannot be applied. In this paper, we propose an SNR estimation method for continuous speech signals. In the MBE excitation model, a speech signal consists of voiced and unvoiced bands. Because the energy of a speech signal is mostly distributed in the voiced region, the SNR can be estimated from the ratio of voiced-region energy to unvoiced-region energy. We use the IMBE vocoder to classify each band of the segmented speech signal as voiced or unvoiced. We then calculate the segmental SNR from that information and the energy of each band, and from it estimate the SNR of the continuous speech signal.
https://doi.org/10.7776/ASK.2010.29.2.148
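
The ratio described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the band energies and voiced/unvoiced flags are taken as given inputs (the abstract obtains them from an IMBE vocoder), the function names are hypothetical, and averaging the per-segment values in dB is an assumed way of combining segments that the abstract does not specify.

```python
import math
from typing import Sequence


def segment_snr_db(band_energies: Sequence[float],
                   voiced: Sequence[bool]) -> float:
    """Estimate the SNR of one segment, in dB, as the ratio of the total
    energy in voiced bands to the total energy in unvoiced bands."""
    if len(band_energies) != len(voiced):
        raise ValueError("need one voiced/unvoiced flag per band")
    voiced_energy = sum(e for e, v in zip(band_energies, voiced) if v)
    unvoiced_energy = sum(e for e, v in zip(band_energies, voiced) if not v)
    if voiced_energy <= 0.0 or unvoiced_energy <= 0.0:
        raise ValueError("segment needs energy in both voiced and unvoiced bands")
    return 10.0 * math.log10(voiced_energy / unvoiced_energy)


def mean_snr_db(segment_values_db: Sequence[float]) -> float:
    """Combine per-segment estimates by a plain average in dB.
    This combination rule is an assumption, not taken from the abstract."""
    if not segment_values_db:
        raise ValueError("need at least one segment")
    return sum(segment_values_db) / len(segment_values_db)


# Two voiced bands with energy 4.0 each, two unvoiced bands with energy 1.0 each.
snr = segment_snr_db([4.0, 4.0, 1.0, 1.0], [True, True, False, False])
```

Here the voiced-to-unvoiced energy ratio is 8.0 / 2.0 = 4, so `snr` is 10·log10(4), about 6.02 dB.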

Ahn Yeong-uk;Kim Jong-hak;Lee Insung;Kwon Oh-ju;Bae Mun-Kwan
- Journal of the Institute of Electronics Engineers of Korea SP / v.42 no.2 s.302 / pp.131-142 / 2005
Low-rate speech coders below 4 kbit/s are based on sinusoidal transform coding (STC) or multiband excitation (MBE). Since harmonic coders are not efficient at reconstructing transient segments of speech signals, such as onsets, offsets, and non-periodic signals, these coders do not provide natural speech quality. This paper proposes an efficient transient model and a multi-mode low-rate coder at 2.4 kbit/s that uses a harmonic model for voiced speech, a stochastic model for unvoiced speech, and a model using aperiodic pulse location tracking (APPT) for transient segments. The APPT utilizes the harmonic model. The proposed method uses different models depending on the characteristics of the LPC residual signal. In addition, it can efficiently combine the excitation synthesized in the time domain by CELP coding with that synthesized in the frequency domain by harmonic coding. The proposed coder shows better speech quality than the 2.4 kbit/s version of the mixed excitation linear prediction (MELP) coder, which is a U.S. Federal Standard speech coder.

Kim, Hyeon-Jin;Jang, Beom-Seon
- Journal of the Society of Naval Architects of Korea / v.55 no.6 / pp.466-473 / 2018
Most frequency-domain approaches assume that the structural response is a Gaussian random process. However, many non-Gaussian processes, caused by multi-excitation and by non-linearity in the structural response or in the load itself, are observed in real engineering problems. In this study, the effect of non-normality on fatigue damage is discussed through a case study, in which the accuracy of four frequency-domain methods for non-Gaussian processes is compared. The power-law and Hermite models, which are derived for non-Gaussian narrow-banded processes, tend to estimate fatigue damage less accurately, compared with time-domain results, when kurtosis is small, and give conservative results when kurtosis is large. The Weibull model appears to give conservative results in all environmental conditions considered. Among the four methods, the Benasciutti-Tovo model for non-Gaussian processes gives the best results in the case study. This study could serve as background material for understanding the effect of non-normality on fatigue damage.
https://doi.org/10.3744/SNAK.2018.55.6.466
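
Kurtosis, which this abstract uses to characterize non-normality, can be computed from a sampled response with a short Python sketch. The function name and the sample values are hypothetical, and the sketch covers only the kurtosis statistic itself, not any of the four fatigue models compared in the paper. A Gaussian process has a kurtosis of 3, so values away from 3 indicate departure from normality.

```python
from typing import Sequence


def kurtosis(samples: Sequence[float]) -> float:
    """Return the (non-excess) sample kurtosis m4 / m2**2, where m2 and m4
    are the second and fourth central moments. A Gaussian process gives 3."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / n
    m2 = sum((x - mean) ** 2 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    if m2 == 0.0:
        raise ValueError("samples have zero variance")
    return m4 / (m2 * m2)


# Hypothetical stress samples; a real check would use a long response time history.
stress_history = [-2.0, -1.0, 0.0, 1.0, 2.0]
k = kurtosis(stress_history)
is_heavier_tailed_than_gaussian = k > 3.0
```

This sketch uses the biased (population) moment estimators, which is adequate for long time histories; a small-sample correction would be a separate design choice.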