Artificial Bandwidth Extension Based on Harmonic Structure Extension and NMF

  • Kim, Kijun (Dept. of Electronics Engineering, Kwangwoon University)
  • Park, Hochong (Dept. of Electronics Engineering, Kwangwoon University)

  • Received: 2013.10.16
  • Accepted: 2013.12.02
  • Published: 2013.12.25

Abstract

This paper proposes a new frequency-domain method for artificial bandwidth extension of narrow-band signals. The method decomposes a narrow-band signal into an excitation signal and a spectral envelope and extends each independently in the frequency domain. The excitation signal is extended so that the harmonic structure of the low band is maintained in the high band, and the spectral envelope is extended from sub-band energies using non-negative matrix factorization (NMF). Finally, the spectral phase is determined from the time-domain correlation between frames, yielding the final wide-band signal. A subjective listening test confirmed that the wide-band signal generated by the proposed method has higher quality than the original narrow-band signal.
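The envelope-extension step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the NMF cost function, the rank, the sub-band layout, or the training procedure, so everything below is an assumption. The sketch uses the standard multiplicative updates for the Euclidean cost, learns non-negative bases `W` from wide-band sub-band energies, and then estimates the high-band energies of a narrow-band frame by fitting activations to the low-band rows of `W` only. The names `nmf`, `solve_activations`, `extend_envelope`, and `n_low` are hypothetical.

```python
import numpy as np

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (bands x frames) as V ~ W @ H.

    Uses multiplicative updates for the Euclidean cost (an assumed choice;
    the abstract does not state which NMF variant is used).
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def solve_activations(V, W, n_iter=200, eps=1e-9):
    """Estimate non-negative activations H for V ~ W @ H with W held fixed."""
    H = np.ones((W.shape[1], V.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

def extend_envelope(low_energy, W, n_low):
    """Predict high-band sub-band energies from low-band sub-band energies.

    low_energy: (n_low x frames) sub-band energies of the narrow-band signal.
    W:          (n_bands x rank) bases trained on wide-band sub-band energies.
    n_low:      number of sub-bands covered by the narrow band (hypothetical).
    """
    H = solve_activations(low_energy, W[:n_low])  # fit using the low-band rows only
    return W[n_low:] @ H                          # reconstruct with the high-band rows

# Usage with synthetic data standing in for real sub-band energies.
rng = np.random.default_rng(0)
train = rng.random((8, 200))          # 8 sub-bands x 200 wide-band training frames
W, _ = nmf(train, rank=4)
narrow = rng.random((5, 3))           # 5 low sub-bands x 3 narrow-band frames
high = extend_envelope(narrow, W, n_low=5)   # shape (3, 3): 3 high sub-bands x 3 frames
```

Because the bases are learned on wide-band data, each column of `W` ties a low-band energy pattern to a high-band one, which is what lets activations fitted on the low band alone produce a high-band estimate. The multiplicative updates keep every factor non-negative, so the predicted energies are never negative.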
