http://dx.doi.org/10.4218/etrij.17.0116.0397

Single-Mode-Based Unified Speech and Audio Coding by Extending the Linear Prediction Domain Coding Mode  

Beack, Seungkwon (Broadcasting & Media Research Laboratory, ETRI)
Seong, Jongmo (Broadcasting & Media Research Laboratory, ETRI)
Lee, Misuk (Broadcasting & Media Research Laboratory, ETRI)
Lee, Taejin (Broadcasting & Media Research Laboratory, ETRI)
Publication Information
ETRI Journal / v.39, no.3, 2017, pp. 310-318
Abstract
Unified speech and audio coding (USAC) is one of the latest coding technologies. It is based on a switchable coding structure and has demonstrated the highest levels of performance for both speech and music content. In this paper, we propose an extended version of USAC with a single mode of operation, which does not require a switching system, obtained by extending the linear prediction domain coding mode. The main idea of this extension is to adopt the advantages of frequency-domain coding schemes, such as windowing and transition control. Subjective test results indicate that the proposed scheme covers speech, music, and mixed streams with adequate levels of performance. The obtained quality levels are comparable with those of USAC.
Keywords
USAC; HE-AACv2; AMR-WB+