Single-Mode-Based Unified Speech and Audio Coding by Extending the Linear Prediction Domain Coding Mode

  • Beack, Seungkwon (Broadcasting & Media Research Laboratory, ETRI) ;
  • Seong, Jongmo (Broadcasting & Media Research Laboratory, ETRI) ;
  • Lee, Misuk (Broadcasting & Media Research Laboratory, ETRI) ;
  • Lee, Taejin (Broadcasting & Media Research Laboratory, ETRI)
  • Received : 2016.06.16
  • Reviewed : 2017.01.04
  • Published : 2017.06.01

Abstract

Unified speech and audio coding (USAC) is one of the latest coding technologies. It is based on a switchable coding structure and has demonstrated the highest levels of performance for both speech and music content. In this paper, we propose an extended version of USAC with a single mode of operation, which does not require a switching system, obtained by extending the linear prediction domain coding mode. The main concept of this extension is to adopt the advantages of frequency-domain coding schemes, such as windowing and transition control. Subjective test results indicate that the proposed scheme covers speech, music, and mixed streams with adequate levels of performance. The obtained quality levels are comparable with those of USAC.

References

  1. A. Gersho, "Advances in Speech and Audio Compression," Proc. IEEE, vol. 82, no. 6, June 1994, pp. 900-918. https://doi.org/10.1109/5.286194
  2. ISO/IEC JTC1/SC29/WG11, Unified Speech and Audio Coding Verification Test Report, Torino, Italy, MPEG 2011/N12232, July 2011.
  3. J.D. Johnston, "Transform Coding of Audio Signals Using Perceptual Noise Criteria," IEEE J. Sel. Areas Commun., vol. 6, no. 2, Feb. 1988, pp. 314-323. https://doi.org/10.1109/49.608
  4. 3GPP, Adaptive Multi-Rate - Wideband (AMR-WB) Speech Codec; General Description, 3GPP TS 26.171, 2002.
  5. B. Bessette et al., "The Adaptive Multirate Wideband Speech Codec (AMR-WB)," IEEE Trans. Speech Audio Process., vol. 10, no. 8, Nov. 2002, pp. 620-636. https://doi.org/10.1109/TSA.2002.804299
  6. J. Makinen et al., "AMR-WB+: a New Audio Coding Standard for 3rd Generation Mobile Audio Services," IEEE Int. Conf. Acoustics Speech Signal Process., Philadelphia, PA, USA, Mar. 23, 2005, pp. 1109-1112.
  7. M. Neuendorf et al., "The ISO/MPEG Unified Speech and Audio Coding Standard: Consistent High Quality for All Content Types and at All Bit Rates," J. Audio Eng. Soc., vol. 61, no. 12, Dec. 2013, pp. 956-977.
  8. S. Quackenbush, "MPEG Unified Speech and Audio Coding," IEEE Multimedia, vol. 20, no. 2, Apr.-June 2013, pp. 72-78. https://doi.org/10.1109/MMUL.2013.24
  9. R.M. Aarts and R.T. Dekkers, "A Real-Time Speech-Music Discriminator," J. Audio Eng. Soc., vol. 47, no. 9, Sept. 1999, pp. 720-725.
  10. J.G.A. Barbedo and A. Lopes, "A Robust and Computationally Efficient Speech/Music Discriminator," J. Audio Eng. Soc., vol. 54, no. 7-8, July 2006, pp. 571-588.
  11. T. Lee et al., "Adaptive TCX Windowing Technology for Unified Structure MPEG-D USAC," ETRI J., vol. 34, no. 3, June 2012, pp. 474-477. https://doi.org/10.4218/etrij.12.0211.0404
  12. ISO/IEC 23003-3:2012, MPEG-D (MPEG audio technologies), Part 3: Unified Speech and Audio Coding, 2012.
  13. ISO/IEC SC29 WG11 N9638, Evaluation Guidelines for Unified Speech and Audio Proposals, MPEG, Jan. 2008.
  14. International Telecommunication Union, Method for the Subjective Assessment of Intermediate Sound Quality (MUSHRA), ITU-R Recommendation BS.1534-1, Geneva, Switzerland, 2001.

Cited by

  1. Autoencoder-Based Audio High-Band Coding Technology Using Side Information vol.24, pp.3, 2017, https://doi.org/10.5909/jbe.2019.24.3.387
  2. Two-Dimensional Audio Compression Method Using Video Coding Schemes vol.10, pp.9, 2021, https://doi.org/10.3390/electronics10091094