http://dx.doi.org/10.7776/ASK.2008.27.3.119

Performance Comparison of GMM and HMM Approaches for Bandwidth Extension of Speech Signals  

Song, Geun-Bae (Samsung Electronics)
Kim, Austin (Samsung Electronics)
Abstract
This paper analyzes the relationship between two representative statistical methods for bandwidth extension (BWE), the Gaussian mixture model (GMM) method and the hidden Markov model (HMM) method, and compares their performance. The HMM method is a memory-based system developed to exploit the inter-frame dependency of speech signals, so it can be expected to estimate the frame-to-frame transitional information of the original spectra more accurately. To verify this, a dynamic measure, which approximates the first-order derivative of the spectral function over time, is introduced in addition to a static measure. The comparison shows that the two methods perform similarly on the static measure, whereas on the dynamic measure the HMM method clearly outperforms the GMM method. Moreover, this difference increases in proportion to the number of HMM states. This indicates that the HMM method is more appropriate, at least for the 'blind BWE' problem. Nevertheless, the GMM method can be a preferable alternative to the HMM method in applications where static performance and algorithmic complexity are critical.
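The distinction between the static and dynamic measures can be illustrated with a minimal sketch. The abstract does not give the paper's exact definitions, so everything below is an assumption made for illustration: the function names are hypothetical, the static measure is taken as an RMS difference between log-spectral frames, and the dynamic measure applies the same RMS difference to first-order frame-to-frame differences.

```python
import math

def static_distortion(orig, est):
    """RMS difference between two sequences of spectral frames (frames x bins).

    Assumed form of a static measure; not the paper's stated definition.
    """
    total, count = 0.0, 0
    for frame_o, frame_e in zip(orig, est):
        for o, e in zip(frame_o, frame_e):
            total += (o - e) ** 2
            count += 1
    return math.sqrt(total / count)

def delta(frames):
    """First-order frame-to-frame difference, a crude approximation
    of the time derivative of the spectrum."""
    return [[b - a for a, b in zip(prev, cur)]
            for prev, cur in zip(frames, frames[1:])]

def dynamic_distortion(orig, est):
    """Assumed dynamic measure: static distortion between delta sequences."""
    return static_distortion(delta(orig), delta(est))

# Toy data: three frames, two spectral bins each (illustrative values only).
orig = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]
est_tracking = [[0.5, 0.5], [1.5, 1.5], [2.5, 2.5]]  # constant offset
est_lagging = [[0.5, 0.5], [0.5, 0.5], [2.5, 2.5]]   # misses one transition

print(static_distortion(orig, est_tracking), dynamic_distortion(orig, est_tracking))
print(static_distortion(orig, est_lagging), dynamic_distortion(orig, est_lagging))
```

In this toy case both estimates have the same static distortion, but only the second one shows a dynamic distortion, because it fails to follow the frame-to-frame change. This is the kind of difference a static measure alone cannot reveal.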
Keywords
Bandwidth extension; Gaussian mixture model; hidden Markov model; Baum-Welch re-estimation algorithm