• Title/Summary/Keyword: audio coding

Optimization of MPEG-4 AAC Codec on PDA (휴대 단말기용 MPEG-4 AAC 코덱의 최적화)

  • 김동현;김도형;정재호
    • The Journal of the Acoustical Society of Korea / v.21 no.3 / pp.237-244 / 2002
  • This paper describes the optimization of the MPEG-4 VM (Moving Picture Experts Group-4 Verification Model) GA (General Audio) AAC (Advanced Audio Coding) encoder and the design of a decoder for a PDA (Personal Digital Assistant) based on the MPEG-4 VM source. We profiled the VMC source and applied several optimization methods to the functions selected by the profiling. Before optimization, on an Intel Pentium III 600 MHz PC running Windows 98, encoding took about 20 times the playing time of the input sample with additional options enabled, and about 10 times without any options. Decoding on the PDA took over 35 seconds for a 17-second input sample. After optimization, the encoding time was reduced to 50% and real-time decoding was achieved on the PDA.
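
The timing figures in this abstract can be read as real-time factors (processing time divided by playing time). The following is a minimal arithmetic sketch, not taken from the paper; the function names are hypothetical, and applying the 50% reduction to both encoder configurations is an assumption.

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by the playing time of the audio."""
    return processing_seconds / audio_seconds


def is_real_time(factor):
    """Real-time operation needs a factor of at most 1.0."""
    return factor <= 1.0


# PDA decoding before optimization: "over 35 seconds" for a 17-second sample,
# so this value is a lower bound on the actual factor.
pda_decode_factor = real_time_factor(35.0, 17.0)

# Encoder factors quoted in the abstract (with / without additional options).
encode_factors_before = {"with_options": 20.0, "without_options": 10.0}

# Assumption: the reported reduction to 50% applies to both configurations.
encode_factors_after = {k: v * 0.5 for k, v in encode_factors_before.items()}

print(f"PDA decode factor before optimization: > {pda_decode_factor:.2f}")
print(f"Real time before optimization? {is_real_time(pda_decode_factor)}")
print(f"Estimated encode factors after optimization: {encode_factors_after}")
```

Under these assumptions the encoder would still run slower than real time on the PC after optimization, while the decoder had to gain at least a factor of about two to reach real time on the PDA.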

Sound Quality Enhancement in MPEG Surround by Using ILD Distortion (ILD DISTORTION을 이용한 MPEG SURROUND의 음질 개선)

  • Chon, Sang-Bae;Choi, In-Yong;Sung, Koeng-Mo
    • Proceedings of the IEEK Conference / 2006.06a / pp.241-242 / 2006
  • MPEG Surround is an audio coding technology that represents a multi-channel audio signal with downmixed audio signal(s) and very low bitrate side information based on Binaural Cue Coding. The side information consists of the Inter-Channel Level Difference (ICLD), the Inter-Channel Correlation (ICC), and payloads. The first two parameters correspond to the well-known spatial parameters in psychoacoustics, the Inter-aural Level Difference (ILD) and the Inter-Aural Cross Correlation (IACC). Although ICLD is intended to provide a perceptually equivalent ILD to the listener, the ILD of the original multi-channel audio signal and that of the MPEG Surround encoded signal were found to differ. The difference between the two ILD values is defined as the ILD Distortion (ILDD). This paper shows how ILDD can be applied to enhance sound quality in MPEG Surround and by how much ILDD is decreased.
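
The ILDD definition in this abstract (the difference between two ILD values) can be illustrated with a short sketch. It assumes a broadband energy-ratio ILD in dB computed from two ear signals; the abstract does not give the exact formula or frequency resolution the authors use, so the function names and the sample values below are hypothetical.

```python
import math


def energy(samples):
    """Sum of squared sample values."""
    return sum(s * s for s in samples)


def ild_db(left_ear, right_ear):
    """Inter-aural level difference in dB (assumed: broadband energy ratio)."""
    return 10.0 * math.log10(energy(left_ear) / energy(right_ear))


def ild_distortion_db(orig_left, orig_right, coded_left, coded_right):
    """ILDD: absolute difference between original and coded ILD values."""
    return abs(ild_db(orig_left, orig_right) - ild_db(coded_left, coded_right))


# Hypothetical ear signals: the coded version changes the right-ear level.
orig_left, orig_right = [1.0] * 8, [0.5] * 8
coded_left, coded_right = [1.0] * 8, [0.7] * 8

print(f"original ILD: {ild_db(orig_left, orig_right):.2f} dB")
print(f"coded ILD:    {ild_db(coded_left, coded_right):.2f} dB")
print(f"ILDD:         "
      f"{ild_distortion_db(orig_left, orig_right, coded_left, coded_right):.2f} dB")
```

A per-band version would apply the same ratio to the energies of each frequency band instead of the whole signal.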

Preprocessing method for enhancing digital audio quality in speech communication system (음성통신망에서 디지털 오디오 신호 음질개선을 위한 전처리방법)

  • Song Geun-Bae;Ahn Chul-Yong;Kim Jae-Bum;Park Ho-Chong;Kim Austin
    • Journal of Broadcast Engineering / v.11 no.2 s.31 / pp.200-206 / 2006
  • This paper presents a preprocessing method that modifies the input audio signal of a speech coder so that the signal obtained at the decoder is enhanced. For this purpose, we introduce a noise suppression (NS) scheme and adaptive gain control (AGC), in which an audio input and its coding error are treated as a noisy signal and a noise, respectively. The coding error is suppressed from the input, and the suppressed input is then level-aligned to the original input by the subsequent AGC operation. Consequently, this preprocessing redistributes the spectral energy of the music input across the spectral domain so that the preprocessed music can be coded more effectively by the subsequent coder. As a drawback, the procedure needs an additional encoding pass to calculate the coding error. However, it provides a generalized formulation applicable to many existing speech coders. Preference listening tests indicated that the proposed approach significantly enhances the perceived music quality.

Enhancement of Super-wideband Coder by Considering Audio Feature in MDCT Domain (MDCT 도메인에서 오디오 신호 특징을 고려한 초광대역 코덱 개선)

  • Hong, Ki-Bong;Jeong, Gyu-Hyeok;Lee, In-Sung
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.5 / pp.129-136 / 2011
  • This paper presents a multi-mode coding method that improves the efficiency of audio codecs by exploiting features of the audio signal. The recently developed super-wideband extension codec based on G.718 wideband operates in two modes, Generic and Sinusoidal, which lets it encode audio signals in the super-wideband efficiently. However, the codec is less efficient at coding the harmonic components of wind and string instruments and the individual-line components of percussion instruments. The proposed method models and encodes multiple-pitch and individual-line features using multi-mode coding. For performance evaluation, we used SNR in the MDCT domain as the objective test and the MUSHRA test as the subjective test. The proposed method outperforms the G.718 super-wideband codec in both the SNR and MUSHRA tests.
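
The objective measure named in this abstract, SNR in the MDCT domain, can be sketched as follows. This is an illustration only: it uses an unwindowed direct-form MDCT and a uniform quantizer as a stand-in for a codec, neither of which is taken from the paper or from G.718, and all function names are hypothetical.

```python
import math


def mdct(frame):
    """Direct-form MDCT: 2N input samples -> N coefficients (no window)."""
    n_half = len(frame) // 2
    return [
        sum(x * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n, x in enumerate(frame))
        for k in range(n_half)
    ]


def snr_db(reference, decoded):
    """SNR in dB between reference and decoded coefficient vectors."""
    signal = sum(r * r for r in reference)
    noise = sum((r - d) ** 2 for r, d in zip(reference, decoded))
    return math.inf if noise == 0.0 else 10.0 * math.log10(signal / noise)


def quantize(coeffs, step):
    """Uniform quantizer used here as a stand-in for a real codec."""
    return [step * round(c / step) for c in coeffs]


# Hypothetical test frame: 16 samples of a sinusoid.
frame = [math.sin(2.0 * math.pi * 3.0 * n / 16.0) for n in range(16)]
coeffs = mdct(frame)

for step in (0.5, 0.05):
    print(f"step {step}: SNR = {snr_db(coeffs, quantize(coeffs, step)):.1f} dB")
```

A finer quantizer step should normally give a higher MDCT-domain SNR, which is the direction of improvement the abstract reports for the proposed method.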

Efficient DSP Architecture For High-Quality Audio Algorithms (고음질 오디오 알고리즘을 위한 효율적인 DSP 설계)

  • Moon, Jong-Ha;SunWoo, Myung-Hoon
    • Journal of the Institute of Electronics Engineers of Korea SP / v.44 no.5 / pp.112-117 / 2007
  • This paper presents specialized DSP instructions and their hardware architecture for audio coding algorithms such as MPEG-2/4 Advanced Audio Coding (AAC), Dolby AC-3, and MPEG-2 Backward Compatible (BC). The proposed architecture is specially designed and optimized for the MDCT/IMDCT (Modified Discrete Cosine Transform and its inverse) and the Huffman decoding of the AAC decoding algorithm. Performance comparisons show a significant improvement over the TMS320C62x and ADSP-21060 for the MDCT/IMDCT computation. In addition, the dedicated Huffman decoding accelerator performs decoding and operand preparation in only one cycle. The proposed DPU (Data Processing Unit) consists of 107,860 gates and achieves 150 MIPS.

Implementation of MPEG-4 BSAC Audio Decoder using ARM926EJ-S Processors (ARM926EJ-S 프로세서를 이용한 MPEG-4 BSAC 오디오 복호화기의 구현)

  • Jeon, Young-Taek;Park, Young-Cheol
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.1 no.2 / pp.91-98 / 2008
  • The Korean domestic T-DMB standard includes MPEG-4 BSAC (Bit-Sliced Arithmetic Coding) audio coding, which was established in 2003. This paper presents an implementation and optimization of an MPEG-4 BSAC audio decoder on the ARM926EJ-S processor. The tools and modules of the BSAC audio decoder were implemented with 32-bit fixed-point operations. Further optimization was accomplished using ARM926EJ-S inline assembly. The optimization was based on the total number of multiplications and MAC (multiply-and-accumulate) operations, which account for most of the ARM926EJ-S core cycles, and on an analysis of ARMv5 instructions. The result of the optimization was evaluated in terms of MIPS (million instructions per second). Implementation results show that a BSAC bitstream at 96 kbps can be decoded in real time at a CPU clock of 65 MHz.
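
The 32-bit fixed-point MAC operations mentioned in this abstract can be emulated in a few lines. The sketch below assumes the Q31 format (1 sign bit, 31 fractional bits) and a wide accumulator; the abstract does not state which Q format or rounding the decoder uses, so these choices and the function names are hypothetical.

```python
Q = 31  # assumed Q31 format: 1 sign bit, 31 fractional bits
INT32_MIN, INT32_MAX = -(1 << 31), (1 << 31) - 1


def saturate(value):
    """Clamp an integer to the signed 32-bit range."""
    return max(INT32_MIN, min(INT32_MAX, value))


def to_q31(x):
    """Convert a float in [-1, 1) to a saturated Q31 integer."""
    return saturate(int(round(x * (1 << Q))))


def from_q31(value):
    """Convert a Q31 integer back to a float."""
    return value / (1 << Q)


def mac_q31(coeffs, samples):
    """Multiply-accumulate of Q31 vectors with a wide accumulator."""
    acc = 0  # Python ints are unbounded; this stands in for a 64-bit register
    for c, s in zip(coeffs, samples):
        acc += c * s  # each product is in Q62
    return saturate(acc >> Q)  # scale back to Q31 (truncating)


coeffs = [to_q31(0.5), to_q31(0.25)]
samples = [to_q31(0.5), to_q31(0.5)]
print(from_q31(mac_q31(coeffs, samples)))  # 0.5*0.5 + 0.25*0.5
```

Accumulating in the wide register and shifting once at the end keeps more precision than shifting each product separately, which is one reason MAC instructions are attractive for filter and transform kernels.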

Speech/Music Signal Classification Based on Spectrum Flux and MFCC For Audio Coder (오디오 부호화기를 위한 스펙트럼 변화 및 MFCC 기반 음성/음악 신호 분류)

  • Sangkil Lee;In-Sung Lee
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.16 no.5 / pp.239-246 / 2023
  • In this paper, we propose an open-loop algorithm for the audio coder that classifies speech and music signals using spectral flux parameters and Mel Frequency Cepstral Coefficient (MFCC) parameters. To increase responsiveness, the MFCCs were used as short-term feature parameters, and to improve accuracy, spectral fluxes were used as long-term feature parameters. The overall speech/music classification decision is made by combining the short-term and long-term classification methods. A Gaussian Mixture Model (GMM) was used for pattern recognition, and the optimal GMM parameters were extracted using the Expectation Maximization (EM) algorithm. The proposed combined long-term and short-term speech/music classification method showed an average classification error rate of 1.5% on various audio sources, and improved the classification error rate by 0.9% compared to the short-term-only classification method and by 0.6% compared to the long-term-only classification method. Compared to the Unified Speech and Audio Coding (USAC) audio classification method, the proposed method improved the classification error rate by 9.1% for percussion music signals with attacks and by 5.8% for voice signals.
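
Spectral flux, the long-term feature named in this abstract, measures how much the magnitude spectrum changes between consecutive frames. The sketch below uses one common definition (sum of squared differences of magnitude spectra) and a plain DFT; the abstract does not give the authors' exact formula, framing, or normalization, so treat the definition and the function names as assumptions.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of the DFT bins 0..N/2 of a real-valued frame."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]


def spectral_flux(prev_frame, cur_frame):
    """Assumed definition: sum of squared magnitude-spectrum differences."""
    prev_mag = magnitude_spectrum(prev_frame)
    cur_mag = magnitude_spectrum(cur_frame)
    return sum((c - p) ** 2 for p, c in zip(prev_mag, cur_mag))


def tone(freq_bin, length=32):
    """Hypothetical test signal: one sinusoid centered on a DFT bin."""
    return [math.sin(2.0 * math.pi * freq_bin * i / length) for i in range(length)]


steady = spectral_flux(tone(4), tone(4))      # same spectrum in both frames
changing = spectral_flux(tone(4), tone(9))    # energy moves to another bin
print(f"steady flux: {steady:.3f}, changing flux: {changing:.3f}")
```

In a classifier such as the one described, per-frame flux values would typically be summarized over a longer window before being passed to the GMM.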

Digital Filter Design for the DSD Encoder with Multi-rate PCM Input (PCM 입력의 DSD 인코더를 위한 디지털 필터 설계)

  • Moon, Dong-Wook;Kim, Lark-Kyo
    • Proceedings of the KIEE Conference / 2005.05a / pp.170-172 / 2005
  • The DSD (Direct Stream Digital) encoder, the standard for SACD (Super Audio Compact Disc) proposed by Sony and Philips, uses a 1-bit representation with a sampling frequency of 2.8224 MHz ($64 \times 44.1$ kHz). For multi-rate PCM (Pulse Code Modulation) input such as 48/96/192 kHz, the DSD encoder needs an external sample-rate converter. This paper proposes a digital filter structure, composed of a sample-rate converter and an interpolation filter, for a DSD encoder that accepts multi-rate (48/96/192 kHz) PCM input without an external sample-rate converter.
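
The need for sample-rate conversion follows from the rate arithmetic: the DSD rate is an integer multiple of 44.1 kHz but not of 48, 96, or 192 kHz. The sketch below only works out those ratios; it says nothing about the filter structure proposed in the paper, and the variable names are hypothetical.

```python
from fractions import Fraction

DSD_RATE_HZ = 64 * 44_100  # 2,822,400 Hz, i.e. 2.8224 MHz

pcm_rates_hz = [44_100, 48_000, 96_000, 192_000]

# Exact ratio between the DSD rate and each PCM input rate.
ratios = {rate: Fraction(DSD_RATE_HZ, rate) for rate in pcm_rates_hz}

for rate, ratio in ratios.items():
    kind = "integer" if ratio.denominator == 1 else "non-integer"
    print(f"{rate:>7} Hz -> x{ratio} ({float(ratio):.1f}, {kind})")
```

For 44.1 kHz the ratio is the integer 64, so plain interpolation is enough, while the 48 kHz family yields rational ratios (294/5, 147/5, 147/10) that require a rational sample-rate conversion stage.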

Audio Watermarking Using Independent Component Analysis

  • Seok, Jong-Won
    • Journal of information and communication convergence engineering / v.10 no.2 / pp.175-180 / 2012
  • This paper presents a blind watermark detection scheme for an additive watermark embedding model. The proposed estimation-correlation-based watermark detector first estimates the embedded watermark by exploiting the non-Gaussianity of real-world audio signals and the mutual independence between the host signal and the embedded watermark; a correlation-based detector then determines the presence or absence of the watermark. For watermark estimation, blind source separation (BSS) based on independent component analysis (ICA) is used. A low watermark-to-signal ratio (WSR) is one of the limitations of blind detection with the additive embedding model. The proposed detector uses two-stage processing to improve the WSR at the blind detector: the first stage removes the audio spectrum from the watermarked audio signal using linear predictive (LP) filtering, and the second stage uses the residue from the LP filtering stage to estimate the embedded watermark using BSS based on ICA. Simulation results show that the proposed detector performs significantly better than existing estimation-correlation-based detection schemes.
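
The final step described in this abstract, a correlation-based decision on an estimated watermark, can be sketched in isolation. The example below does not implement the LP filtering or ICA stages; it simulates their output as the watermark plus Gaussian noise, and the threshold, sequence length, and function names are all hypothetical.

```python
import math
import random


def normalized_correlation(a, b):
    """Correlation coefficient of two equal-length sequences (no mean removal)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def watermark_present(estimate, watermark, threshold=0.2):
    """Decide presence by comparing the correlation with a fixed threshold."""
    return normalized_correlation(estimate, watermark) > threshold


rng = random.Random(0)  # fixed seed so the sketch is repeatable
watermark = [rng.choice((-1.0, 1.0)) for _ in range(2000)]

# Simulated estimator outputs (stand-ins for the LP + ICA stages).
marked_estimate = [w + rng.gauss(0.0, 1.0) for w in watermark]
unmarked_estimate = [rng.gauss(0.0, 1.0) for _ in watermark]

print("marked:  ", watermark_present(marked_estimate, watermark))
print("unmarked:", watermark_present(unmarked_estimate, watermark))
```

The better the estimation stage suppresses the host audio, the closer the marked-case correlation gets to 1, which is why the abstract's two-stage processing targets the WSR before correlation.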