http://dx.doi.org/10.7776/ASK.2009.28.3.290

Efficient TTS Database Compression Based on AMR-WB Speech Coder  

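Lim, Jong-Wook (Department of Information and Communication Engineering, Sejong University)
Kim, Ki-Chul (Department of Information and Communication Engineering, Sejong University)
Kim, Kyeong-Sun (HCI Lab Co., Ltd.)
Lee, Hang-Seop (Department of Information and Communication Engineering, Sejong University)
Park, Hae-Young (HCI Lab Co., Ltd.)
Kim, Moo-Young (Department of Information and Communication Engineering, Sejong University)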
Abstract
This paper presents an improved adaptive multi-rate wideband (AMR-WB) algorithm for efficient text-to-speech (TTS) database compression. The proposed algorithm removes the unnecessary common bit-stream (CBS) and applies parameter delta coding combined with speaker-dependent Huffman coding, which reduces the required bit-rate without any quality degradation. We also propose lossy coding schemes that maximize the bit-rate reduction with negligible quality degradation. Compared with the 12.65 kbps AMR-WB mode, the proposed lossless algorithm, including CBS removal, reduces the bit-rate by 12.40% without quality degradation. The proposed lossy algorithm reduces the bit-rate by 20.00% with a PESQ degradation of 0.12.
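The combination of delta coding and Huffman coding named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the index sequence is invented toy data, the helper names (`delta_encode`, `huffman_code_lengths`, `total_bits`, `reduced_bitrate`) are hypothetical, no actual AMR-WB parameter is modeled, and the Huffman table is trained on the same sequence it encodes as a stand-in for a speaker-dependent table. The last helper converts a percentage reduction into a bit-rate; applying the 20.00% figure to the 12.65 kbps mode is an assumption, since the abstract states that baseline explicitly only for the lossless result.

```python
import heapq
from collections import Counter


def delta_encode(values):
    """Replace each value by its difference from the previous one (first value kept as-is)."""
    prev = 0
    out = []
    for v in values:
        out.append(v - prev)
        prev = v
    return out


def delta_decode(deltas):
    """Invert delta_encode by accumulating the differences."""
    acc = 0
    out = []
    for d in deltas:
        acc += d
        out.append(acc)
    return out


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code built from symbol counts."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap items are (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, a = heapq.heappop(heap)
        w2, _, b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in a.items()}
        merged.update({s: d + 1 for s, d in b.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def total_bits(symbols):
    """Bits needed to encode the sequence with its own Huffman code (table cost ignored)."""
    lengths = huffman_code_lengths(symbols)
    return sum(lengths[s] for s in symbols)


def reduced_bitrate(base_kbps, reduction_percent):
    """Bit-rate remaining after a percentage reduction."""
    return base_kbps * (1.0 - reduction_percent / 100.0)


# Toy, slowly varying quantizer indices (hypothetical data, not AMR-WB parameters).
indices = [40, 41, 41, 42, 42, 42, 43, 43, 42, 42, 41, 41, 40, 40, 41, 42]
deltas = delta_encode(indices)

print("bits, Huffman on raw indices:", total_bits(indices))
print("bits, Huffman on deltas:     ", total_bits(deltas))
print("lossless rate (kbps):", round(reduced_bitrate(12.65, 12.40), 2))
print("lossy rate (kbps), assuming the same baseline:", round(reduced_bitrate(12.65, 20.00), 2))
```

Because slowly varying parameters produce deltas concentrated around zero, the delta sequence has a more skewed distribution than the raw indices, which is what lets an entropy code such as Huffman coding spend fewer bits on it. Both steps are exactly invertible, so the decoder recovers the original indices bit for bit.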
Keywords
TTS; AMR-WB; Huffman coding; Speech coding; Information theory