Transcoding Algorithm for AMR and EVRC Vocoders Via Direct Parameter Transformation

AMR과 EVRC 음성부호화기를 위한 파라미터 직접 변환 방식의 상호부호화 알고리듬

  • Lee, Sun-Il (이선일) (Dept. of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology)
  • Yu, Chang-Dong (유창동) (Dept. of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology)
  • Published : 2002.11.01

Abstract

This paper proposes a novel transcoding algorithm for the Adaptive Multi-Rate (AMR) and Enhanced Variable Rate Codec (EVRC) vocoders based on direct parameter transformation. In contrast to the conventional tandem transcoding algorithm, which fully decodes the speech with one coder and re-encodes it with the other, the proposed algorithm converts the parameters of one coder directly into those of the other. The proposed algorithm consists of parameter decoding, frame classification, mode decision, and a transcoder for each of two frame types. The transcoders convert the LSP, the frame energy, the pitch delay for the adaptive codebook, the fixed codebook vector, and the codebook gains. Evaluation results show that the proposed algorithm produces speech quality equivalent to that of the tandem transcoding algorithm while requiring less computation and introducing less delay.

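This paper proposes a new transcoding algorithm for the AMR and EVRC vocoders based on direct parameter transformation. Unlike the conventional tandem approach, which requires additional decoding and encoding stages for transcoding, the proposed direct parameter transformation approach directly converts the parameters that both vocoders commonly use to encode speech. The proposed algorithm consists of parameter decoding, frame classification, mode decision, and transcoders for two frame types. The transcoders convert the LSP, the frame energy, the pitch delay for the adaptive codebook, the fixed codebook vector, and the gains of both codebooks. Evaluation by a variety of methods confirmed that, compared with the conventional tandem approach, the proposed algorithm achieves equivalent speech quality while reducing computation and delay.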
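
The four stages named in the abstract (parameter decoding, frame classification, mode decision, and one transcoder per frame type) can be pictured as a small dispatch pipeline. The following Python sketch only illustrates that control flow and is not the authors' implementation. Every name in it (`CodecParams`, `FrameType.A`/`FrameType.B`, `classify_frame`, the energy threshold, the mode labels) is a hypothetical placeholder, and the classification and conversion rules are assumptions chosen to keep the example runnable, since the abstract does not specify them.

```python
from dataclasses import dataclass, replace
from enum import Enum
from typing import Dict, List


class FrameType(Enum):
    # The abstract mentions two frame types without naming them;
    # A and B are placeholder labels.
    A = "A"
    B = "B"


@dataclass(frozen=True)
class CodecParams:
    """Parameters the abstract lists as being converted (fields are illustrative)."""
    lsp: List[float]
    frame_energy: float
    pitch_delay: float
    fixed_codebook: List[int]
    gains: List[float]  # [adaptive codebook gain, fixed codebook gain]


def decode_parameters(packet: Dict) -> CodecParams:
    # Placeholder: a real parameter decoder would unpack and dequantize bits.
    return CodecParams(**packet)


def classify_frame(p: CodecParams, energy_threshold: float = 1.0) -> FrameType:
    # Placeholder rule (an assumption, not the paper's classifier).
    return FrameType.A if p.frame_energy >= energy_threshold else FrameType.B


def decide_mode(frame_type: FrameType) -> str:
    # Placeholder mapping from frame type to a target-codec mode label.
    return {FrameType.A: "high_rate", FrameType.B: "low_rate"}[frame_type]


def transcode_type_a(p: CodecParams) -> CodecParams:
    # Placeholder: pass every parameter through unchanged.
    return p


def transcode_type_b(p: CodecParams) -> CodecParams:
    # Placeholder: keep LSP, energy, and pitch delay; clear the excitation fields.
    return replace(p, fixed_codebook=[], gains=[0.0, 0.0])


TRANSCODERS = {FrameType.A: transcode_type_a, FrameType.B: transcode_type_b}


def transcode_frame(packet: Dict) -> Dict:
    """Run one frame through the four stages without synthesizing speech."""
    params = decode_parameters(packet)
    frame_type = classify_frame(params)
    mode = decide_mode(frame_type)
    converted = TRANSCODERS[frame_type](params)
    return {"mode": mode, "frame_type": frame_type, "params": converted}
```

The point of the sketch is structural: no stage reconstructs a speech waveform, which is what separates direct parameter transformation from tandem transcoding. In a real system each placeholder would be replaced by codec-specific dequantization, conversion, and requantization.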
