• Title/Summary/Keyword: Audio/Speech Coding Wideband Speech Coding

Modified Generic Mode Coding Scheme for Enhanced Sound Quality of G.718 SWB (G.718 초광대역 코덱의 음질 향상을 위한 개선된 Generic Mode Coding 방법)

  • Cho, Keun-Seok;Jeong, Sang-Bae
    • Phonetics and Speech Sciences / v.4 no.3 / pp.119-125 / 2012
  • This paper describes a new algorithm for encoding spectral shape and envelope in the generic mode of the G.718 super-wideband (SWB) codec. In the G.718 SWB coder, generic mode coding and sinusoidal enhancement are used to quantize modified discrete cosine transform (MDCT)-based parameters in the high-frequency band. In the generic mode, the high-frequency band is divided into sub-bands, and for every sub-band the most similar match under the selected similarity criterion is searched for in the coded, envelope-normalized wideband content. To improve quantization in the high-frequency region of speech/audio signals, a modified version of the G.718 SWB generic mode is proposed. The proposed generic mode uses perceptual vector quantization of spectral envelopes and a higher resolution for spectral copy. The performance of the proposed algorithm is evaluated in terms of objective quality. Experimental results show that the proposed algorithm significantly improves sound quality.
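The sub-band search described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the G.718 procedure: RMS normalization stands in for envelope normalization, normalized cross-correlation is an assumed similarity criterion, and the names `normalize` and `best_match` are hypothetical.

```python
import math

def normalize(band):
    """Scale a band to unit RMS (a simple stand-in for envelope normalization)."""
    rms = math.sqrt(sum(x * x for x in band) / len(band))
    return [x / rms for x in band] if rms > 0 else list(band)

def best_match(target, source):
    """Return (lag, gain) of the source segment most similar to target.

    Similarity is normalized cross-correlation, which is an assumption here.
    """
    n = len(target)
    best_lag, best_score = 0, -float("inf")
    for lag in range(len(source) - n + 1):
        seg = source[lag:lag + n]
        energy = sum(s * s for s in seg)
        if energy == 0:
            continue  # a silent segment cannot be scaled to match anything
        score = sum(t * s for t, s in zip(target, seg)) / math.sqrt(energy)
        if score > best_score:
            best_lag, best_score = lag, score
    seg = source[best_lag:best_lag + n]
    energy = sum(s * s for s in seg)
    # Least-squares gain that scales the copied segment onto the target.
    gain = sum(t * s for t, s in zip(target, seg)) / energy if energy > 0 else 0.0
    return best_lag, gain

# Toy data: the high sub-band is an exact scaled copy of wideband bins 2..5.
wideband = normalize([0.1, -0.2, 0.9, -0.7, 0.3, 0.5, -0.4, 0.2])
high_subband = [2.0 * x for x in wideband[2:6]]
lag, gain = best_match(high_subband, wideband)
```

In a real coder the lag and a quantized gain would be transmitted per sub-band; here they are simply returned.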

Audio/Speech Codec Using Variable Delay MDCT/IMDCT (가변 지연 MDCT/IMDCT를 이용한 오디오/음성 코덱)

  • Sangkil Lee;In-Sung Lee
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.16 no.2 / pp.69-76 / 2023
  • A high-quality audio/speech codec based on the MDCT/IMDCT can perfectly reconstruct the current frame through an overlap-add process with the previous frame. This overlap-add process introduces an algorithmic delay equal to the frame length. In this paper, we propose an MDCT/IMDCT process that reduces the algorithmic delay by using a variable phase shift. A low-delay audio/speech codec is obtained by applying the low-delay MDCT/IMDCT algorithm to the ITU-T standard codec G.729.1. The algorithmic delay of the MDCT/IMDCT process can be reduced from 20 ms to 1.25 ms. The decoded output of the audio/speech codec with the low-delay MDCT/IMDCT is evaluated with PESQ, an objective quality measure. Despite the reduced transmission delay, no difference in sound quality from the conventional method was found.
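For reference, the conventional MDCT/IMDCT overlap-add that this abstract starts from can be sketched as below. It shows only the standard scheme with a sine window, where each output frame needs the next overlapping block before it is complete; it does not implement the variable phase shift proposed in the paper. The function names and the toy frame size are assumptions.

```python
import math

def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def mdct(block, N):
    """Map 2N windowed samples to N coefficients."""
    return [sum(block[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs, N):
    """Map N coefficients to 2N time-aliased samples (2/N scaling for a windowed pair)."""
    return [2.0 / N * sum(coeffs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                          for k in range(N))
            for n in range(2 * N)]

def roundtrip(x, N):
    """Analyze and resynthesize x (length a multiple of N) with 50% overlap-add."""
    w = sine_window(N)
    padded = [0.0] * N + list(x) + [0.0] * N
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        block = [padded[start + n] * w[n] for n in range(2 * N)]
        y = imdct(mdct(block, N), N)
        for n in range(2 * N):
            out[start + n] += y[n] * w[n]  # aliasing cancels between neighbours
    return out[N:N + len(x)]

signal = [math.sin(0.3 * i) + 0.1 * i for i in range(16)]
restored = roundtrip(signal, N=4)
max_error = max(abs(a - b) for a, b in zip(signal, restored))
```

Because every sample is only complete after two overlapping blocks have been added, the scheme waits one hop of N samples, which is the frame-length delay the abstract refers to.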

A New Wideband Speech/Audio Coder Interoperable with ITU-T G.729/G.729E (ITU-T G.729/G.729E와 호환성을 갖는 광대역 음성/오디오 부호화기)

  • Kim, Kyung-Tae;Lee, Min-Ki;Youn, Dae-Hee
    • Journal of the Institute of Electronics Engineers of Korea SP / v.45 no.2 / pp.81-89 / 2008
  • Wideband speech, characterized by a bandwidth of about 7 kHz (50-7000 Hz), provides a substantial quality improvement in terms of naturalness and intelligibility. Although it requires higher data rates, its applications have extended to audio and video conferencing, high-quality multimedia communications over mobile links or packet-switched networks, and digital AM broadcasting. In this paper, we present a new bandwidth-scalable coder for wideband speech and audio signals. The proposed coder splits the 8 kHz signal bandwidth into two narrow bands and applies a different coding scheme to each band. The lower-band signal is coded using the ITU-T G.729/G.729E coder, and the higher-band signal is compressed using a new algorithm based on the gammatone filter bank with an invertible auditory model. Because of the split-band architecture and the completely independent coding schemes for each band, the decoder output can be selected to be narrowband or wideband according to the channel condition. Subjective tests showed that, for wideband speech and audio signals, the proposed coder at 14.2/18 kbit/s produces quality superior to that of ITU-T G.722.1 at 24 kbit/s, with a shorter algorithmic delay.
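A small worked example may help read the 14.2/18 kbit/s figures. The core rates below are the standard G.729 (8 kbit/s) and G.729 Annex E (11.8 kbit/s) rates; pairing 14.2 kbit/s with G.729 and 18 kbit/s with G.729E is an assumption, as is the resulting high-band budget, which the abstract does not state. The function names are hypothetical.

```python
# Standard core-layer rates in kbit/s.
CORE_KBPS = {"G.729": 8.0, "G.729E": 11.8}
# Total rates quoted in the abstract; the pairing with each core is assumed.
TOTAL_KBPS = {"G.729": 14.2, "G.729E": 18.0}

def high_band_kbps(core):
    """Bit rate left for the high-band layer under the assumed pairing."""
    return round(TOTAL_KBPS[core] - CORE_KBPS[core], 1)

def decodable_output(high_band_received):
    """The low band decodes on its own, so a lost high-band layer means narrowband output."""
    return "wideband" if high_band_received else "narrowband"

budget = {core: high_band_kbps(core) for core in CORE_KBPS}
```

Under this reading both configurations leave the same budget for the high band, which would be consistent with a single high-band coder sitting on top of either core.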

Artificial speech bandwidth extension technique based on opus codec using deep belief network (심층 신뢰 신경망을 이용한 오푸스 코덱 기반 인공 음성 대역 확장 기술)

  • Choi, Yoonsang;Li, Yaxing;Kang, Sangwon
    • The Journal of the Acoustical Society of Korea / v.36 no.1 / pp.70-77 / 2017
  • Bandwidth extension is a technique that improves speech quality, intelligibility, and naturalness by extending 300-3,400 Hz narrowband speech to 50-7,000 Hz wideband speech. In this paper, an Artificial Bandwidth Extension (ABE) module embedded in the Opus audio decoder is designed to use the narrowband speech information already available in the decoder, reducing the computational complexity of LPC (Linear Prediction Coding) and LSF (Line Spectral Frequencies) analysis and the algorithmic delay of the ABE module. We propose a spectral envelope extension method using a DBN (Deep Belief Network), a deep learning technique; the proposed scheme produces a better extended spectrum than the traditional codebook mapping method.
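The traditional codebook mapping baseline mentioned here can be illustrated with a minimal sketch: a narrowband feature vector is matched to its nearest narrowband codeword, and the paired high-band envelope is read out. The codebook values and the names `nearest_index` and `extend_envelope` are hypothetical toy choices; the paper's DBN replaces this lookup with a learned mapping.

```python
def nearest_index(vec, codebook):
    """Index of the codeword with the smallest squared Euclidean distance to vec."""
    return min(range(len(codebook)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, codebook[i])))

def extend_envelope(nb_features, nb_codebook, hb_codebook):
    """Look up the high-band envelope paired with the nearest narrowband codeword."""
    return hb_codebook[nearest_index(nb_features, nb_codebook)]

# Toy paired codebooks (made-up values): entry i of each belongs together.
NB_CODEBOOK = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.5]]
HB_CODEBOOK = [[0.1], [0.5], [0.9]]

envelope = extend_envelope([0.9, 1.2], NB_CODEBOOK, HB_CODEBOOK)
```

The lookup can only return one of a fixed set of envelopes, which is one reason a regression-style model such as a DBN may produce a smoother extended spectrum.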

Real-time Implementation of AMR-WB Speech Codec Using TeakLite DSP (TeakLite DSP를 이용한 적응형 다중 비트율 광대역 (AMR-WB) 음성부호화기의 실시간 구현)

  • 정희범;김경수;한민수;변경진
    • The Journal of the Acoustical Society of Korea / v.23 no.3 / pp.262-267 / 2004
  • The AMR-WB (Adaptive Multi-Rate Wideband) speech codec, the most recent voice codec standardized by 3GPP, has a wider audio bandwidth of 50-7000 Hz and operates at nine speech coding bit rates between 6.60 and 23.85 kbit/s. This paper presents a real-time implementation of the AMR-WB speech codec on a 16-bit fixed-point TeakLite DSP. The implemented AMR-WB codec requires 52.2 MIPS in the 23.85 kbit/s mode, along with 17.9 kwords of program memory, 11.8 kwords of data RAM, and 10.1 kwords of data ROM. It was verified to be bit-exact by passing all test vectors provided by 3GPP. Stable operation on a real-time test board was also confirmed, with no distortion or delay in the audio input/output.
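To put the quoted figures in per-frame terms, the sketch below lists the nine AMR-WB modes and converts rates to bits per frame. The intermediate mode rates and the 20 ms frame length come from the AMR-WB standard rather than from this abstract, and the helper name `bits_per_frame` is an assumption.

```python
# The nine AMR-WB speech modes in kbit/s (from the standard, not the abstract).
AMR_WB_MODES_KBPS = [6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85]
FRAME_MS = 20  # AMR-WB frame length in milliseconds

def bits_per_frame(rate_kbps):
    """kbit/s multiplied by ms gives bits; round away floating-point noise."""
    return round(rate_kbps * FRAME_MS)

frame_bits = {rate: bits_per_frame(rate) for rate in AMR_WB_MODES_KBPS}

# Instruction budget per frame for the reported 52.2 MIPS at 23.85 kbit/s.
instructions_per_frame = 52.2e6 * FRAME_MS / 1000.0
```

At the top mode this works out to 477 bits and roughly one million instructions per 20 ms frame.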

Frequency Band Selection Excited Linear Prediction Wideband Speech/Audio Coding Using SBR (SBR을 이용한 주파수 밴드선택 여기 선형예측 광대역 음성/오디오 부호화)

  • Jang, Sunghoon;Lee, Insung
    • The Journal of the Acoustical Society of Korea / v.32 no.6 / pp.556-562 / 2013
  • This paper aims to improve the performance of the band-selection speech/audio coder, which reconstructs the spectrum of bands that are not transmitted using comfort noise. To improve performance, we use the Spectral Band Replication (SBR) technique instead of comfort noise substitution. To synthesize the SBR signal, the SBR algorithm takes the selected bands as its reference, and the spectrum synthesized by SBR is inserted into the non-selected bands. Each sub-band spectrum is energy-weighted according to the real audio signal. We propose an enhanced band-selection coder that uses an SBR signal synthesized from the selected bands instead of comfort noise.
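The core idea of filling a non-selected band from a selected one can be sketched as below: coefficients are copied and rescaled so that the filled band carries a target energy. This is a simplified illustration of replication with energy weighting, not the SBR algorithm or the authors' coder; `fill_band` and the toy values are hypothetical.

```python
import math

def fill_band(selected, target_energy):
    """Copy selected-band coefficients and rescale them to carry target_energy."""
    energy = sum(c * c for c in selected)
    if energy == 0:
        return [0.0] * len(selected)  # nothing to replicate from a silent band
    gain = math.sqrt(target_energy / energy)
    return [gain * c for c in selected]

selected_band = [3.0, 4.0]  # energy 25
filled = fill_band(selected_band, target_energy=100.0)
filled_energy = sum(c * c for c in filled)
```

Compared with comfort noise, the copied band keeps the fine structure of the transmitted spectrum, and only the per-band energy has to be sent as side information.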

Real-time Implementation of AMR-WB Speech Coder Using TMS320C5509 DSP (TMS320C5509 DSP를 이용한 AMR-WB 음성부호화기의 실시간 구현)

  • Choi Song-In;Jee Deock-Gu
    • The Journal of the Acoustical Society of Korea / v.24 no.1 / pp.52-57 / 2005
  • The adaptive multi-rate wideband (AMR-WB) speech coder has an extended audio bandwidth from 50 Hz to 7 kHz and operates at nine speech coding bit rates from 6.6 to 23.85 kbit/s. In this paper, we present a real-time implementation of the AMR-WB speech coder on the 16-bit fixed-point TMS320C5509, which has dual MAC units. We first implemented the AMR-WB speech coder at the C language level using intrinsics and then optimized it in assembly language. The computational complexity of the implemented AMR-WB coder at 23.85 kbit/s is 42.9 Mclocks. The coder needs 15.1 kwords of program memory, 9.2 kwords of data ROM, and 13.9 kwords of data RAM.
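As a rough illustration of the arithmetic that such 16-bit fixed-point code is built from, the sketch below models a saturating Q15 multiply-accumulate in Python, in the style of the ITU-T/ETSI basic operators used in fixed-point reference codecs. The names `l_mult`, `l_mac`, and `dot_q15` are illustrative assumptions, and this is not the authors' C or assembly code.

```python
MAX32 = 2 ** 31 - 1
MIN32 = -2 ** 31

def sat32(value):
    """Clamp to the signed 32-bit range, as a saturating accumulator would."""
    return max(MIN32, min(MAX32, value))

def l_mult(a, b):
    """Q15 x Q15 -> Q31 product with the usual left shift by one."""
    return sat32(2 * a * b)

def l_mac(acc, a, b):
    """Multiply-accumulate: add the Q31 product of a and b to acc, saturating."""
    return sat32(acc + l_mult(a, b))

def dot_q15(xs, hs):
    """Dot product of two Q15 vectors, one MAC per tap."""
    acc = 0
    for x, h in zip(xs, hs):
        acc = l_mac(acc, x, h)
    return acc

half = 16384  # 0.5 in Q15
result = dot_q15([half, half], [half, half])  # 0.25 + 0.25 = 0.5 in Q31
```

On a DSP with dual MAC units, two such taps can be issued per cycle, which is why inner loops like this are the usual target of intrinsic and assembly optimization.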

Implementation of Internet Terminal using G.729.1 Wideband Speech Codec for Next Generation Network (차세대 통신망을 위한 G.729.1 광대역 음성 코덱을 활용한 인터넷 단말 구현)

  • So, Woon-Seob;Kim, Dae-Young
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.10B / pp.939-945 / 2008
  • In this paper, we describe the process and results of implementing an Internet terminal that uses the G.729.1 wideband speech codec for the next generation network. For this purpose, we first chose a high-performance RISC application processor with DSP features for speech codec processing and an enhanced Multimedia Accelerator (eMMA) function for the video codec. In the implementation of this terminal, we used the G.729.1 codec recently standardized by ITU-T, a new scalable speech and audio codec that extends the G.729 speech coding standard. To adapt the G.729.1 codec to this terminal, we converted most of the computationally demanding fixed-point C code into assembly code to minimize processing time on the processor. As a result, we reduced the execution time of the original C code by about 80%, and the codec ran in real time on the terminal. For video, we used the H.263/MPEG-4 codec, which the eMMA supports in hardware on the processor. In a SIP call processing test over a real network, we obtained an end-to-end delay under 100 ms and a MOS of 3.8 measured with a PESQ instrument. The terminal also interoperated well with commercial terminals.