• Title/Summary/Keyword: Speech and audio coding

MPEG-D USAC: Unified Speech and Audio Coding Technology (MPEG-D USAC: 통합 음성 오디오 부호화 기술)

  • Lee, Tae-Jin;Kang, Kyeong-Ok;Kim, Whan-Woo
    • The Journal of the Acoustical Society of Korea / v.28 no.7 / pp.589-598 / 2009
  • As mobile devices become multi-functional and converge into a single platform, there is a strong need for a codec that provides consistent quality for both speech and music content. MPEG-D USAC standardization started at the 82nd MPEG meeting with a CfP, and WD3 was approved at the 88th MPEG meeting. MPEG-D USAC is a convergence of AMR-WB+ and HE-AAC v2. Specifically, USAC uses three core codecs (AAC, ACELP, and TCX) for the low-frequency region, SBR for the high-frequency region, and the MPEG Surround tool for stereo information. USAC provides consistent sound quality for both speech and music content and can be applied to applications such as multimedia download to mobile devices, digital radio, mobile TV, and audio books.

Real-time Implementation of AMR-WB Speech Coder Using TMS320C5509 DSP (TMS320C5509 DSP를 이용한 AMR-WB 음성부호화기의 실시간 구현)

  • Choi Song-In;Jee Deock-Gu
    • The Journal of the Acoustical Society of Korea / v.24 no.1 / pp.52-57 / 2005
  • The adaptive multi-rate wideband (AMR-WB) speech coder has an extended audio bandwidth of 50 Hz to 7 kHz and operates at nine speech coding bit rates from 6.6 to 23.85 kbit/s. In this paper, we present a real-time implementation of the AMR-WB speech coder on the 16-bit fixed-point TMS320C5509 DSP, which has dual MAC units. We first implemented the AMR-WB speech coder in C using intrinsics and then optimized it in assembly language. The computational complexity of the implemented AMR-WB coder at 23.85 kbit/s is 42.9 Mclocks. The coder requires 15.1 kwords of program memory, 9.2 kwords of data ROM, and 13.9 kwords of data RAM.

Fast Harmonic Synthesis Method for Sinusoidal Speech-Audio Model (정현파 음성-오디오 모델의 빠른 하모닉 합성 방법)

  • Kim, Gyu-Jin;Kim, Jong-Hark;Jung, Gyu-Hyeok;Lee, In-Sung
    • Journal of the Institute of Electronics Engineers of Korea SP / v.44 no.4 s.316 / pp.109-116 / 2007
  • Most harmonic synthesis methods that use phase information employ quadratic or cubic phase interpolation. These methods are computationally expensive because every component sine wave must be synthesized on a per-sample basis. In this paper, we propose a fast harmonic synthesis method for sinusoidal speech/audio coding, based on the quadratic and cubic phase functions, to overcome this complexity problem. To derive the method, we define an over-sampling function and a phase modulation function by constraining the parameters of the phase function to be independent of the harmonic index, and we then derive a fast synthesis method using the IFFT. Experimental results show that the proposed method significantly reduces the complexity of the conventional cosine synthesis method while maintaining performance.
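
To make the per-sample cost concrete, here is a minimal sketch of the conventional direct cosine synthesis that the abstract describes as expensive. It is not the proposed IFFT method. It assumes a quadratic phase track shared across harmonics and constant amplitudes within the frame. The function name and parameters (`w0`, `alpha`) are hypothetical.

```python
import math

def synthesize_frame(amps, phases, w0, alpha, n_samples):
    """Direct (per-sample) harmonic synthesis with a quadratic phase track.

    Assumed model (illustrative, not taken from the paper): harmonic k
    (1-based) has phase
        theta_k(n) = phases[k-1] + k * (w0 * n + 0.5 * alpha * n * n)
    where w0 is the fundamental in rad/sample and alpha is its linear
    drift per sample.
    """
    out = [0.0] * n_samples
    cos_calls = 0  # counts cosine evaluations to expose the O(K * N) cost
    for n in range(n_samples):
        base = w0 * n + 0.5 * alpha * n * n
        acc = 0.0
        for k, (a, phi) in enumerate(zip(amps, phases), start=1):
            acc += a * math.cos(phi + k * base)
            cos_calls += 1
        out[n] = acc
    return out, cos_calls

# Three harmonics, pitch period of 80 samples, no pitch drift, 160-sample frame.
frame, cost = synthesize_frame([1.0, 0.5, 0.25], [0.0, 0.0, 0.0],
                               2 * math.pi / 80, 0.0, 160)
```

The cosine count grows as the number of harmonics times the frame length, which is the cost an IFFT-based synthesis aims to avoid.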

Unified Speech and Audio Coding Technology (통합 음성 오디오 부호화 기술)

  • Lee, Taejin;Beack, Seungkwon;Kang, Kyeongok;Kim, Whan-Woo
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2011.07a / pp.264-267 / 2011
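  • As technology moves toward converging multi-functional mobile devices into a single device, demand is growing for a coding technology that delivers high sound quality for both speech and audio. MPEG began full-scale standardization of MPEG-D USAC with a CfP in October 2008 and approved the Study on DIS at its 96th meeting in March 2011. This paper proposes a method for improving USAC performance by modifying the TCX window of the LPD mode. Unlike the current scheme, which joins TCX frames using only a fixed-size overlap, the proposed method adjusts the TCX window overlap size according to the previous TCX mode, the next TCX mode, and the presence of a transient, which improves the sound quality of the LPD mode for music-like signals.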

Fixed-point Implementation of LPD Decoder in MPEG-D USAC (MPEG-D USAC : LPD 복호화기의 고정 소수점 알고리즘 구현)

  • Song, Eunwoo;Song, Jeongook;Kang, Hong-Goo
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2012.07a / pp.254-256 / 2012
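  • This paper proposes a fixed-point algorithm for the Linear Prediction Domain (LPD) decoder module of the Unified Speech and Audio Coding (USAC) standard, which is under development in the MPEG-D audio subgroup. The USAC codec merges two state-of-the-art speech and audio coders and performs well on both speech and audio signals. Ahead of the completion of the USAC standard and its commercial deployment, we analyze the structural characteristics of the USAC LPD decoder, build a fixed-point algorithm for the LPD decoder targeting Digital Signal Processor (DSP) implementation, and measure the complexity of the module. We also compare the performance of the fixed-point LPD decoder with that of the existing floating-point decoder and address complexity issues arising from the two coding modes of the LPD decoder.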
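
As a generic illustration of what a fixed-point conversion involves, the sketch below shows a rounded, saturating multiply in Q15, a format commonly used on 16-bit DSPs. It is not the paper's algorithm. The choice of Q15 and all helper names are assumptions made for the example.

```python
Q15_ONE = 1 << 15          # scale factor: 1.0 is represented as 32768
INT16_MAX = 32767
INT16_MIN = -32768

def saturate16(x):
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, x))

def to_q15(x):
    """Convert a float in [-1, 1) to Q15, saturating at the range limits."""
    return saturate16(int(round(x * Q15_ONE)))

def q15_mul(a, b):
    """Multiply two Q15 values: wide product, round, shift back, saturate."""
    return saturate16((a * b + (1 << 14)) >> 15)

def q15_to_float(x):
    """Convert a Q15 integer back to a float for comparison."""
    return x / Q15_ONE

# Compare the fixed-point product with the floating-point reference.
fixed = q15_to_float(q15_mul(to_q15(0.3), to_q15(-0.7)))
error = abs(fixed - (0.3 * -0.7))
```

Comparing `fixed` with the floating-point product mirrors, on a very small scale, the fixed-point versus floating-point comparison the abstract mentions.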


Implementation of Internet Terminal using G.729.1 Wideband Speech Codec for Next Generation Network (차세대 통신망을 위한 G.729.1 광대역 음성 코덱을 활용한 인터넷 단말 구현)

  • So, Woon-Seob;Kim, Dae-Young
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.10B / pp.939-945 / 2008
  • In this paper, we describe the process and results of implementing an Internet terminal that uses the G.729.1 wideband speech codec for the next generation network. For this purpose, we first chose a high-performance RISC application processor with DSP features for speech codec processing and an enhanced Multimedia Accelerator (eMMA) function for the video codec. The terminal uses the G.729.1 codec, recently standardized by ITU-T, a new scalable speech and audio codec that extends the G.729 speech coding standard. To port the G.729.1 codec to this terminal, we converted most of the more complex fixed-point C code into assembly code to minimize processing time on the processor. As a result, we reduced the execution time of the original C code by about 80%, and the codec ran in real time on the terminal. For video, we used the H.263/MPEG-4 codec, which the eMMA supports in hardware. In a SIP call processing test over a real network, we obtained an end-to-end delay under 100 ms and a MOS of 3.8 measured with a PESQ instrument. The terminal also interoperated well with commercial terminals.

Audio Stream Delivery Using AMR(Adaptive Multi-Rate) Coder with Forward Error Correction in the Internet (인터넷 환경에서 FEC 기능이 추가된 AMR음성 부호화기를 이용한 오디오 스트림 전송)

  • 김은중;이인성
    • The Journal of Korean Institute of Communications and Information Sciences / v.26 no.12A / pp.2027-2035 / 2001
  • In this paper, we present audio stream delivery over the Internet using the AMR (Adaptive Multi-Rate) coder, which was adopted by ETSI and 3GPP as a standard vocoder for next-generation IMT-2000 services, combined with sender-based repair (FEC) and receiver-based reconstruction techniques. With the media-specific FEC scheme, repair data is added to the main data stream so that the contents of lost packets can be recovered, which greatly increases the chance of recovering lost packets. The AMR codec is based on the code-excited linear predictive (CELP) coding model, so we use frame erasure concealment for CELP-based coders. The proposed scheme is evaluated with the ITU-T G.729 (CS-ACELP) coder and AMR at 12.2 kbit/s through SNR (signal-to-noise ratio) and MOS (Mean Opinion Score) tests. At 10% packet loss, the proposed scheme scores 1.1 higher in MOS and 5.61 dB higher in SNR than AMR at 12.2 kbit/s, and it maintains communicable speech quality at frame erasure rates up to 20%.
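
The general idea of media-specific FEC can be sketched as follows. The sketch assumes that each packet piggybacks a redundant copy of the previous frame. For simplicity that copy is identical to the primary, whereas practical schemes often use a lower-rate encoding. The packet layout and function names are hypothetical and do not reproduce the paper's scheme.

```python
def packetize(frames):
    """Attach to packet n a redundant copy of frame n-1 (None for the first)."""
    packets = []
    prev = None
    for seq, frame in enumerate(frames):
        packets.append({"seq": seq, "primary": frame, "redundant": prev})
        prev = frame
    return packets

def receive(packets, lost, n_frames):
    """Rebuild the frame sequence from the packets that arrived.

    A lost frame n is recovered from the redundant field of packet n+1
    when that packet arrived. Otherwise it stays None and would be left
    to the decoder's frame erasure concealment.
    """
    out = [None] * n_frames
    for p in packets:
        if p["seq"] in lost:
            continue  # simulate a packet dropped by the network
        out[p["seq"]] = p["primary"]
        if p["seq"] > 0 and out[p["seq"] - 1] is None:
            out[p["seq"] - 1] = p["redundant"]
    return out

frames = [b"f0", b"f1", b"f2", b"f3", b"f4"]
recovered = receive(packetize(frames), lost={1, 3, 4}, n_frames=len(frames))
```

In this run the isolated loss of frame 1 is repaired from packet 2, while the burst loss of frames 3 and 4 is not, which is where receiver-side concealment takes over.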
