A New Vocoder based on AMR 7.4Kbit/s Mode for Speaker Dependent System  

Min, Byung-Jae (SFA Engineering Corp.)
Park, Dong-Chul (Intelligent Computing Research Lab, Department of Information Engineering, Myongji University)
Abstract
This paper proposes a new Code Excited Linear Predictive (CELP) vocoder based on the Adaptive Multi Rate (AMR) 7.4 kbit/s mode. The proposed vocoder achieves a better compression rate in a Speaker Dependent Coding System (SDSC) environment and is well suited to systems that need only one person's speech, such as OGM (Outgoing Message) and TTS (Text To Speech). To enhance the compression rate of the coder, a new Line Spectral Pairs (LSP) codebook is designed using the Centroid Neural Network (CNN) algorithm. Compared with the original AMR 7.4 kbit/s coder, the new coder shows a 27% higher compression rate while preserving synthesized speech quality in terms of Mean Opinion Score (MOS).
Keywords
Linear Predictive Coding; Centroid Neural Network; Adaptive Multi Rate