• Title/Summary/Keyword: audio coding

Search Result 214, Processing Time 0.02 seconds

Extended Pilot-Based Coding for Lossless Bit Rate Reduction of MPEG Surround

  • Pang, Hee-Suk;Lim, Jae-Hyun;Oh, Hyen-O
    • ETRI Journal
    • /
    • v.29 no.1
    • /
    • pp.103-106
    • /
    • 2007
  • Pilot-based coding (PBC), which is used for lossless bit rate reduction of audio coding, has been recently proposed for MPEG Surround. We propose extended PBC for further lossless bit rate reduction of MPEG Surround. Extended PBC selects the number of pilots depending on the parameter band number and the type of spatial parameter. It then encodes the pilots and the relevant difference data. Experiments show that extended PBC is more effective than the original PBC, especially for high bit rate modes, with a negligible complexity increase on the decoder side.

  • PDF

다중밴드 양자화를 적용한 USAC 부호화 기술

  • Beack, Seungkwon;Lim, Wootaek;Lee, Taejin
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2020.07a
    • /
    • pp.329-332
    • /
    • 2020
  • 본 논문은 USAC(Unified Speech and Audio Coding) 오디오 부호화 기술의 성능 개선에 관련한 것이다. USAC 은 FD(Frequency domain) 양자화 모듈과 LPD(Linear prediction domain) 양자화 모듈을 탑재하고 있다. 본 논문에서는 LPD 모드로부터 생성되는 잔차신호에 대하여 주파수 영역에서 다중밴드로 분할하고 각 밴드 별 양자화를 독립적으로 수행함으로써 USAC 의 LPD 모드의 양자화 효율을 개선하였다. 그 결과 동일 조건에서 제안방법이 기존의 LPD 모드의 성능을 음질 측면에서 향상시킴을 확인할 수 있었다.

  • PDF

Performance analysis of audio super-resolution based on neural networks (신경망 기반 오디오 초 해상도 기술 성능 분석)

  • Lim, Wootaek;Beack, Seungkwon;Sung, Jongmo;Lee, Taejin
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2020.07a
    • /
    • pp.337-339
    • /
    • 2020
  • 오디오 초 해상도 기술은 저 해상도의 오디오 신호를 이용하여 고 해상도의 오디오를 복원 또는 생성해 내는 기술이다. 본 기술 분야는 기존에 주파수 대역 확장, 인공 대역 확장 기술 등으로 연구되었으나, 최근 딥러닝 기술의 발전, 이미지 초 해상도 기술 연구 등에 힘입어 오디오 초 해상도 기술 이라는 이름으로 주로 연구되고 있다. 본 논문에서는 이러한 오디오 초 해상도 기술에 연구 동향에 대하여 설명하고, 기존의 논문 들에서 주로 다루고 있는 음성 데이터 베이스가 아닌 MedleyDB 음악 데이터 베이스를 활용하여 실험을 수행하였다. 실험은 4-폴드 교차 검증을 통해 수행되었으며, 실험 결과 제안하는 컨벌루션 신경망 구조 기반 오디오 초 해상도 기술은 입력 저해상도 오디오 대비 SNR 이 3.41 dB 향상됨을 확인하였다.

  • PDF

A Study on the MDCT Design for MPEG-2 Audio (MPEG-2 오디오를 위한 MDCT 설계에 관한 연구)

  • 김정태;구대성;이강현
    • Proceedings of the IEEK Conference
    • /
    • 2000.11c
    • /
    • pp.97-100
    • /
    • 2000
  • The most important technology is the compression methods in the multimedia society. Audio files are rapidly propagated through internet. MP-3(MPEG-1 Layer3) is offered to CD tone quality in 128kbps, but 64kbps below tone-quality is abruptly down. On the other hand, MPEG-II AAC (Advanced Audio Coding) is not compatible with MPEG-I, but AAC has a high compression ratio 1.4 times better than MP-3 and it has max. 7.1 channel and 96KHz sampling rate. In this paper, we designed the optimized MDCT (Modified Discrete Cosine Transform) that could decrease the capacity of enormous computation and could increase the processing speed in the MPEG-2 AAC encoder.

  • PDF

A 3D Audio Codec Employing a Revised Noise Filling Method (수정된 잡음 채움 기법을 적용한 3D 오디오 부호기)

  • Kim, Rin Chul
    • Journal of Broadcast Engineering
    • /
    • v.26 no.3
    • /
    • pp.327-330
    • /
    • 2021
  • In this paper, a new noise filling method is proposed for improving the performance of the 3D audio codec. In the new method, the core band is limited up to MAX_SFB, not up to the IGF start frequency. And the noise filling is applied to all frequency range of the IGF source patches. We conduct the MUSHRA test and find that the proposed noise filling method demonstrates better performance than the conventional method.

Analysis of MPEG Audio Coding Technology (MPEG 오디오 부호화 기술 분석)

  • 홍진우
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1998.06c
    • /
    • pp.249-254
    • /
    • 1998
  • MPEG 오디오 그룹에서는 오디오 부호화 기술의 국제 표준으로 MPEG-1 오디오, MPEG-2 오디오 BC, MPEG-2 AAC의 규격 제정을 완료하였고, 현재 MPEG-4 오디오 및 MPEG-7 오디오의 국제 표준을 제정하고 있다. 본 논문에서는 이들 표준에 대한 요구 기능 및 기술 특징을 분석하고, 각각의 표준에 대한 응용분야와 향후의 계획에 대하여 기술한다.

  • PDF

A Common Synthesis Filter for MPEG-2 BC/AAC Audio Using Recursive Structure (Recursive 구조를 이용한 MPEG-2 BC/AAC 오디오 공용 합성 필터)

  • 강명수;박세기;오신범;이채욱
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.6C
    • /
    • pp.874-882
    • /
    • 2004
  • MPEG Audio compression algorithm is the international standard for the digital compression of high quality audio using mechanism of the perceptual coding based on psychoacoustic masking. It is necessary to discuss the constraints on designing of common filter banks for MPEG-2 BC and MPEG-2 AAC decoder system, which is not Down yet, mapping audio signals from the time domain into the frequency domain. In this paper, we present an architecture of common synthesis filter whcih can be used for MPEG-2 BC and MPEG-2 AAC decoder using recursive structure. The proposed algorithm is based on recursive architecture that effectively performs common compulsion.

Improvement of the TCX Module in AMR-WB+ Codec Using Pyramid VQ (Pyramid VQ를 이용한 AMR-WB+ 코덱 내 TCX 모듈의 성능 개선)

  • Park, Sang-Kuk;Park, Jung-Eun;Baik, Seung-Kweon;Seo, Jung-Il;Kang, Sang-Won
    • The Journal of the Acoustical Society of Korea
    • /
    • v.26 no.3
    • /
    • pp.109-114
    • /
    • 2007
  • In this paper, we Propose a pyramid VQ to quantize the transform coefficients of TCX module for the audio improvement of AMR-WB+ codec. The Proposed pyramid VQ is compared to the $RE_8$ Lattice VQ used in the AMR-WB+ standard codec. demonstrating improvement 4% and 5.7%. respectively, in Mean Squared Error (MSE) and 3.3% and 4.7%. respectively, in Perceptual Evaluation of Audio Quality (PEAQ) by 8-dimensional and 16-dimensional Pyramid VQ.

Salience of Envelope Interaural Time Difference of High Frequency as Spatial Feature (공간감 인자로서의 고주파 대역 포락선 양이 시간차의 유효성)

  • Seo, Jeong-Hun;Chon, Sang-Bae;Sung, Koeng-Mo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.6
    • /
    • pp.381-387
    • /
    • 2010
  • Both timbral features and spatial features are important in the assessment of multichannel audio coding systems. The prediction model, extending the ITU-R Rec. BS. 1387-1 to multichannel audio coding systems, with the use of spatial features such as ITDDist (Interaural Time Difference Distortion), ILDDist (Interaural Level Difference Distortion), and IACCDist (InterAural Cross-correlation Coefficient Distortion) was proposed by Choi et al. In that model, ITDDistswere only computed for low frequency bands (below 1500Hz), and ILDDists were computed only for high frequency bands (over 2500Hz) according to classical duplex theory. However, in the high frequency range, information in temporal envelope is also important in spatial perception, especially in sound localization. A new model to compute the ITD distortions of temporal envelopes in high frequency components is introduced in this paper to investigate the role of such ITD on spatial perception quantitatively. The computed ITD distortions of temporal envelopes in high frequency components were highly correlated with perceived sound quality of multichannel audio sounds.

Modified Generic Mode Coding Scheme for Enhanced Sound Quality of G.718 SWB (G.718 초광대역 코덱의 음질 향상을 위한 개선된 Generic Mode Coding 방법)

  • Cho, Keun-Seok;Jeong, Sang-Bae
    • Phonetics and Speech Sciences
    • /
    • v.4 no.3
    • /
    • pp.119-125
    • /
    • 2012
  • This paper describes a new algorithm for encoding spectral shape and envelope in the generic mode of G.718 super-wide band (SWB). In the G.718 SWB coder, generic mode coding and sinusoidal enhancement are used for the quantization of modified discrete cosine transform (MDCT)-based parameters in the high frequency band. In the generic mode, the high frequency band is divided into sub-bands and for every sub-band the most similar match with the selected similarity criteria is searched from the coded and envelope normalized wideband content. In order to improve the quantization scheme in high frequency region of speech/audio signals, the modified generic mode by the improvement of the generic mode in G.718 SWB is proposed. In the proposed generic mode, perceptual vector quantization of spectral envelopes and the resolution increase for spectral copy are used. The performance of the proposed algorithm is evaluated in terms of objective quality. Experimental results show that the proposed algorithm increases the quality of sounds significantly.