Search | Korea Science

Variable Bitrate MPEG Audio (가변 전송율 MPEG 오디오)

Nam, Seung-Hyon
- The Journal of Engineering Research / v.2 no.1 / pp.57-62 / 1997
The two psychoacoustic models used in MPEG-1 employ different masking patterns, different masking indices, and different computational procedures. As a result, Model 1 is inferior to Model 2 because of its worst-case approach to computing the signal-to-mask ratio (SMR), even though it determines tonality and masking levels accurately. In this study, we investigate the performance of the psychoacoustic models when the MPEG-1 audio coder is modified for variable bitrates. Simulation results show that Model 2 has a gain of 30 kbps in the dual channel mode and 20 kbps in the joint stereo mode. It is generally known that the joint stereo mode has a bitrate gain compared to the dual channel mode. For signals with frequent attacks, this gain becomes larger in Model 1 than in Model 2. This is because Model 1 uses the worst-case approach in computing the SMR to reduce pre-echo.

Fixed-point Processing Optimization of MPEG Psychoacoustic Model-II Algorithm for ASIC Implementation (MPEG 심리음향 모델-ll 알고리듬의 ASIC 구현을 위한 고정 소수점 연산 최적화)

Lee Keun-Sup;Park Young-Cheol;Youn Dae Hee
- The Journal of Korean Institute of Communications and Information Sciences / v.29 no.11C / pp.1491-1497 / 2004
The psychoacoustic model in the MPEG audio layer-III (MP3) encoder is optimized for fixed-point processing. The optimization process consists of determining the data word length of the arithmetic unit and the algorithm for the transcendental functions that are often used in the psychoacoustic model. To determine the data word length, we defined a statistical model expressing the relation between the fixed-point operation errors of the psychoacoustic model and the probability that the allocated bits are altered due to these errors. Based on simulations using this model, we chose a 24-bit data path and constructed a 24-bit fixed-point MP3 encoder. Sound quality tests using the constructed fixed-point encoder showed a mean degradation of -0.2 on the ITU-R 5-point audio impairment scale.

A Study of Optimum Time-Spread Echo Audio Watermarking via Listening Test (청취실험에 의한 에코확산 오디오 워터마킹방법의 최적화에 관한 검토)

Ko Byeong-Seob
- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.545-546 / 2004
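The time-spread echo audio watermarking method based on subband decomposition splits the host signal into specific frequency bands and, using the MPEG psychoacoustic model, sets the power of the watermark embedded in each band by means of a parameter-setting function. The robustness and imperceptibility of the method are therefore governed by this parameter-setting function. In this study, we conducted listening tests to examine how best to choose the parameter-setting function so as to achieve maximum robustness with minimum degradation of sound quality.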

Selecting Sound-Field Control Factors in the Image Model Method Using Head-Related Transfer Function (머리전달함수를 이용한 영상 음원법에서 음장 제어 요소 결정)

임정빈
- Proceedings of the Acoustical Society of Korea Conference / 1998.06d / pp.56-59 / 1998
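We propose a method for determining the factors used to control a three-dimensional sound field by applying the Image Model Method (IMM) with the Head-Related Transfer Function (HRTF). The control factors were determined on the basis of the theory of sound energy inside a rectangular enclosure. When each control factor was applied to a three-dimensional sound-field model and psychoacoustic experiments were carried out with listeners using headphones, the results showed that out-of-head localization of the sound image, the sense of distance, and the sense of spaciousness were formed in the controlled sound field as naturally as in a real room.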

Prediction of Efficient Adaptive Perceptual Filter Iterate Coefficient through Analysis of Noisy Signal (잡음에 열화된 오디오 신호의 분석을 통한 효율적인 적응지각필터 반복 수행 계수의 예측)

Ryu, Il-Hyun;Cha, Hyung-Tai;Koo, Kyo-Sik;Seo, Bo-Kook
- Proceedings of the Korea Institute of Convergence Signal Processing / 2005.11a / pp.238-241 / 2005
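Digital media technology is advancing in many directions, including coding. In audio signal processing in particular, the stages of generating, compressing, and restoring digital audio signals are being developed in various forms. Psychoacoustic techniques, which model the human auditory system, are used effectively in audio signal processing not only for compression but also for enhancing noisy signals. The adaptive perceptual filter, which is built on such a psychoacoustic model, uses a perceptual filter to adaptively enhance signals degraded by noise. Here, an effective choice of the iteration coefficient of the adaptive perceptual filter reduces the perceptual loss of the audio signal while achieving accurate noise removal. To verify the performance, we compared SNR and NMR.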

Speech Enhancement Based on Psychoacoustic Model (심리음향모델에 근거한 음성개선)

Lee Jingeol
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.337-338 / 2000
The perceptual filter for speech enhancement was derived analytically so that the frequency content of the noisy input signal matches that of the estimated clean signal in the auditory domain. However, the analytical derivation relies on a deconvolution associated with the spreading function in the psychoacoustic model, which results in an ill-conditioned problem. To cope with this problem, we propose a novel psychoacoustic-model-based speech enhancement filter whose principle is the same as that of the perceptual filter, but which is derived by a constrained optimization that provides solutions to the ill-conditioned problem.

Audio Transcoding Algorithm for Terrestrial DTV and Terrestrial DMB Systems (지상파 DTV와 지상파 DMB 방송을 위한 오디오 트랜스코딩 알고리듬)

Bang Kyoung Ho;Lee Jae Seong;Lee Chang Joon;Park Young Cheol;Seo Jeong Il
- Proceedings of the Acoustical Society of Korea Conference / autumn / pp.161-164 / 2004
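In this paper, we propose an audio transcoding technique that allows terrestrial DTV content to be reused for terrestrial DMB broadcasting. Terrestrial DTV compresses audio signals with AC-3, whereas terrestrial DMB uses MPEG-4 BSAC. Exploiting the similarities between the two algorithms, namely their frequency transforms and their bit allocation based on a psychoacoustic model, can improve the efficiency of transcoding between the two formats. This offers the advantage of reduced delay and power consumption in cases that require real-time conversion and in applications for portable devices.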

Audio Watermarking Using An Effective PN Sequence Embedding Method (효율적인 PN 시퀀스 삽입을 통한 오디오 워터마킹)

Byun Youngbae;Park Changmok;Kim Jongweon;Choi Jonguk
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.331-334 / 2002
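Spread-spectrum audio watermarking based on pseudo-noise (PN) sequences shapes the PN sequence with a psychoacoustic model or a fixed filter in order to produce a watermark that is inaudible yet robust. However, a PN sequence spectrally shaped in this way carries most of its energy in the high-frequency region, so it is vulnerable to attacks that deliberately cut off the high-frequency region of the audio signal. To remedy this drawback, and to improve robustness while minimizing noise, this paper proposes a watermarking system that shapes the PN sequence using the properties of the median value before embedding it. A PN sequence shaped using the median property is highly correlated with the original audio signal and is distributed evenly over the entire frequency band, so it is robust against attacks on the high-frequency region. The proposed method makes maximal use of the inherent properties of the PN sequence, and the watermark signal can still be detected under a variety of attacks such as audio coding, additive noise, down/up-sampling, channel conversion, and amplitude attacks.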

Voice Activity Detection Method Using Psycho-Acoustic Model Based on Speech Energy Maximization in Noisy Environments (잡음 환경에서 심리음향모델 기반 음성 에너지 최대화를 이용한 음성 검출 방법)

Choi, Gab-Keun;Kim, Soon-Hyob
- The Journal of the Acoustical Society of Korea / v.28 no.5 / pp.447-453 / 2009
This paper introduces a method for detecting voice and its exact end points at low SNR by maximizing voice energy. Conventional VAD (Voice Activity Detection) algorithms estimate the noise level and therefore tend to detect end points inaccurately. Moreover, because they use a relatively long analysis range to reflect temporal changes in the noise, their computational load is too high for practical applications. In this paper, we introduce the SEM-VAD (Speech Energy Maximization-Voice Activity Detection) method, which uses psychoacoustic Bark-scale filter banks to maximize voice energy within frames. Stable threshold values are obtained in various noise environments (SNR 15 dB, 10 dB, 5 dB, 0 dB). In a voice detection test in a noisy car environment, the PHR (Pause Hit Rate) was 100% in every noise environment, and the FAR (False Alarm Rate) was 0% at SNR 15 dB and 10 dB, 5.6% at SNR 5 dB, and 9.5% at SNR 0 dB.
https://doi.org/10.7776/ASK.2009.28.5.447

High Quality Audio Watermarking using Spread Spectrum and Psychoacoustic Model (대역확산과 심리음향 모델을 이용한 고음질 오디오 워터마킹)

Noh Jin-Soo;Rhee Kang-Hyeon
- Journal of the Institute of Electronics Engineers of Korea CI / v.43 no.5 s.311 / pp.48-56 / 2006
In this paper, we propose a high-quality audio watermarking algorithm using MDCT/IMDCT (Modified DCT/Inverse Modified DCT) with a psychoacoustic model. Generally, a digital audio watermark is embedded in the frequency domain after a frequency transform of the digital audio data, but the audio quality is affected by the watermarking. In our scheme, the digital audio data is spread with a PN (Pseudo Noise) code, and the audio watermark is then embedded during MDCT processing that refers to the psychoacoustic model. In the MDCT processing, block switching selects a window sequence with an interval of 256, 1,024 or 2,048 points according to the shape of the filter bank output, for high-quality audio. We confirm that when the watermark weight ${\alpha}$ is 2.5 or below, the watermark detection ratio satisfies the SDMI (Secure Digital Music Initiative) recommendation of 50% or above, and the SM is in the range of $50{\sim}68$ dB under four main kinds of attack (compression, cropping, FFT, and echo).

