[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.4218/etrij.17.0116.0397

Single-Mode-Based Unified Speech and Audio Coding by Extending the Linear Prediction Domain Coding Mode

Beack, Seungkwon (Broadcasting & Media Research Laboratory, ETRI)
Seong, Jongmo (Broadcasting & Media Research Laboratory, ETRI)
Lee, Misuk (Broadcasting & Media Research Laboratory, ETRI)
Lee, Taejin (Broadcasting & Media Research Laboratory, ETRI)

Publication Information

ETRI Journal / v.39, no.3, 2017, pp. 310-318

Abstract

Unified speech and audio coding (USAC) is one of the latest coding technologies. It is based on a switchable coding structure and has demonstrated the highest levels of performance for both speech and music content. In this paper, we propose an extended version of USAC with a single mode of operation, which does not require a switching system, obtained by extending the linear prediction domain coding mode. The main concept of this extension is to adopt the advantages of frequency-domain coding schemes, such as windowing and transition control. Subjective test results indicate that the proposed scheme covers speech, music, and mixed streams with adequate levels of performance, and that the obtained quality is comparable with that of USAC.

Keywords

USAC; HE-AACv2; AMR-WB+

Citations & Related Records

Times Cited By KSCI: 1

Reference
Cited By KSCI

1	J. Makinen et al., "AMR-WB+: a New Audio Coding Standard for 3rd Generation Mobile Audio Services," IEEE Int. Conf. Acoustics Speech Signal Process., Philadelphia, PA, USA, Mar. 23, 2005, pp. 1109-1112.
2	M. Neuendorf et al., "The ISO/MPEG Unified Speech and Audio Coding Standard: Consistent High Quality for All Content Types and at All Bit Rates," J. Audio Eng. Soc., vol. 61, no. 12, Dec. 2013, pp. 956-977.
3	S. Quackenbush, "MPEG Unified Speech and Audio Coding," IEEE Multimedia, vol. 20, no. 2, Apr.-June 2013, pp. 72-78.
4	R.M. Aarts and R.T. Dekkers, "A Real-Time Speech-Music Discriminator," J. Audio Eng. Soc., vol. 47, no. 9, Sept. 1999, pp. 720-725.
5	J.G.A. Barbedo and A. Lopes, "A Robust and Computationally Efficient Speech/Music Discriminator," J. Audio Eng. Soc., vol. 54, no. 7-8, July 2006, pp. 571-588.
6	T. Lee et al., "Adaptive TCX Windowing Technology for Unified Structure MPEG-D USAC," ETRI J., vol. 34, no. 3, June 2012, pp. 474-477.
7	ISO/IEC 23003-3:2012, MPEG-D (MPEG audio technologies), Part 3: Unified Speech and Audio Coding, 2012.
8	ISO/IEC SC29 WG11 N9638, Evaluation Guidelines for Unified Speech and Audio Proposals, MPEG, Jan. 2008.
9	International Telecommunication Union, Method for the Subjective Assessment of Intermediate Sound Quality (MUSHRA), ITU-R Recommendation BS.1534-1, Geneva, Switzerland, 2001.
10	A. Gersho, "Advances in Speech and Audio Compression," Proc. IEEE, vol. 82, no. 6, June 1994, pp. 900-918.
11	ISO/IEC JTC1/SC29/WG11, Unified Speech and Audio Coding Verification Test Report, Torino, Italy, MPEG 2011/N12232, July 2011.
12	B. Bessette et al., "The Adaptive Multirate Wideband Speech Codec (AMR-WB)," IEEE Trans. Speech Audio Process., vol. 10, no. 8, Nov. 2002, pp. 620-636.
13	J.D. Johnston, "Transform Coding of Audio Signals Using Perceptual Noise Criteria," IEEE J. Sel. Areas Commun., vol. 6, no. 2, Feb. 1988, pp. 314-323.
14	3GPP, Adaptive Multi-Rate - Wideband (AMR-WB) Speech Codec; General Description, 3GPP TS 26.171, 2002.

3	(2017) Journal of Broadcast Engineering (방송공학회논문지), Autoencoder-Based Audio High-Band Coding Technology Using Side Information / 24 (3), 387
9	(2021) Electronics, Two-Dimensional Audio Compression Method Using Video Coding Schemes / 10 (9), 1094