Matching Pursuit Sinusoidal Modeling with Damping Factor

  • Jeong, Gyu-Hyeok (Dept. of Radio Science & Engineering, Chungbuk National University) ;
  • Kim, Jong-Hark (Dept. of Radio Science & Engineering, Chungbuk National University) ;
  • Lim, Joung-Woo (Dept. of Radio Science & Engineering, Chungbuk National University) ;
  • Joo, Gi-Ho (Dept. of Informations and Communications Engineering, PaiChai University) ;
  • Lee, In-Sung (Dept. of Radio Science & Engineering, Chungbuk National University)
  • Published: 2007.01.25

Abstract

This paper proposes matching pursuit with damping factors, a new sinusoidal modeling method that improves the performance of matching pursuit for codecs based on the sinusoidal model. The proposed method defines damping factors from the correlation between the sinusoidal parameters of the previous and current frames, uses matching pursuit guided by those damping factors to estimate more accurate sinusoidal parameters in the current frame, and then synthesizes the signal. Efficient modeling is therefore possible from the parameters of the current frame alone, without interpolation between adjacent frames. Unlike conventional sinusoidal models that rely on interpolation, the proposed method introduces no additional delay, and it improves speech quality in all segments, not only in voiced segments. The performance of the proposed method is compared with that of matching pursuit using interpolation in terms of SNR (signal-to-noise ratio), MOS (mean opinion score), LR (Itakura-Saito likelihood ratio), and CD (cepstral distance).
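The idea of fitting damped sinusoids to a single frame by matching pursuit can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract derives the damping factors from inter-frame parameter correlation, whereas this sketch simply searches a small, hypothetical grid of damping values. The function names, the grids, the frame length, and the test signal are all assumptions made for illustration.

```python
import numpy as np


def damped_basis(freq, damping, n):
    """Two real basis vectors (cos and sin) sharing one exponential envelope.

    freq is in cycles per sample; damping is the decay rate per sample.
    """
    t = np.arange(n)
    env = np.exp(-damping * t)
    return np.column_stack([env * np.cos(2 * np.pi * freq * t),
                            env * np.sin(2 * np.pi * freq * t)])


def matching_pursuit(frame, freqs, dampings, n_atoms):
    """Greedy matching pursuit over a grid of damped sinusoids.

    At each step the (freq, damping) pair whose two-dimensional subspace
    captures the most residual energy is selected and its projection is
    subtracted. Returns the picked (freq, damping, coefficients) triples
    and the final residual.
    """
    residual = np.asarray(frame, dtype=float).copy()
    n = len(residual)
    picked = []
    for _ in range(n_atoms):
        best = None
        for f in freqs:
            for d in dampings:
                basis = damped_basis(f, d, n)
                # Least-squares fit of amplitude and phase for this candidate.
                coef, *_ = np.linalg.lstsq(basis, residual, rcond=None)
                proj = basis @ coef
                energy = float(proj @ proj)
                if best is None or energy > best[0]:
                    best = (energy, f, d, coef, proj)
        _, f, d, coef, proj = best
        picked.append((f, d, coef))
        residual = residual - proj
    return picked, residual


def snr_db(signal, residual):
    """Signal-to-noise ratio in dB, treating the residual as noise."""
    return 10.0 * np.log10(np.sum(signal ** 2) / np.sum(residual ** 2))


# Hypothetical test frame: one decaying sinusoid plus a little noise.
n = 160
t = np.arange(n)
rng = np.random.default_rng(0)
x = np.exp(-0.02 * t) * np.cos(2 * np.pi * 0.1 * t + 0.3)
x = x + 0.01 * rng.standard_normal(n)

freqs = np.arange(1, 50) / 100.0        # assumed frequency grid
dampings = [0.0, 0.01, 0.02, 0.04]      # assumed damping grid

picked, res_damped = matching_pursuit(x, freqs, dampings, n_atoms=1)
_, res_plain = matching_pursuit(x, freqs, [0.0], n_atoms=1)

snr_damped = snr_db(x, res_damped)
snr_plain = snr_db(x, res_plain)
print("damped dictionary SNR (dB):", snr_damped)
print("undamped dictionary SNR (dB):", snr_plain)
```

With a single atom, the damped dictionary can never do worse than the undamped one, because the undamped atoms are a subset of its candidates. On a decaying frame such as this one the gap is large, which is the intuition behind modeling within-frame amplitude change directly rather than recovering it by interpolating between frames.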
