• Title/Summary/Keyword: auditory filter band


An Objective Estimation for Simulating of Asymmetrical Auditory Filter of the Hearing Impaired According to Hearing Loss Degree (난청인의 난청 정도에 따른 비대칭 청각 필터 구현의 객관적 평가)

  • Joo, S.I.;Jeon, Y.Y.;Song, Y.R.;Lee, S.M.
    • Journal of rehabilitation welfare engineering & assistive technology / v.3 no.1 / pp.27-34 / 2009
  • Hearing loss takes a different shape in each hearing-impaired person, so the existing frequency-band method with symmetrical auditory filters does not properly simulate the variety of hearing-loss shapes. Auditory filter shapes are asymmetrical and differ with center frequency and input level. The auditory filters of a hearing-impaired person differ from those of normal-hearing people, and speech passed through them has a different quality. In this study, an asymmetrical auditory filter was simulated and tests were performed to evaluate its performance objectively. The evaluation used perceptual evaluation of speech quality (PESQ) and the log likelihood ratio (LLR), computed on speech passed through the simulated auditory filter, to measure objective speech quality and distortion. When hearing loss was simulated, the PESQ and LLR values differed greatly between the symmetrical and asymmetrical auditory filters. This means that the shape of the auditory filter may affect speech quality. In particular, when hearing loss is present, an auditory filter whose asymmetrical shape changes with each center frequency affects the speech quality perceived by the hearing impaired.

  • PDF
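
For reference, a commonly used form of the LLR distortion measure compares linear-prediction (LPC) coefficient vectors of the reference and processed speech, frame by frame. The abstract does not say which variant the authors computed, so the definition below is an assumption, and the symbols $\mathbf{a}_r$, $\mathbf{a}_p$ and $\mathbf{R}_r$ are introduced here only for illustration.

```latex
% a_r : LPC coefficient vector of the reference (unprocessed) frame
% a_p : LPC coefficient vector of the processed frame
% R_r : autocorrelation matrix of the reference frame
d_{\mathrm{LLR}}(\mathbf{a}_p, \mathbf{a}_r)
  = \log \frac{\mathbf{a}_p \mathbf{R}_r \mathbf{a}_p^{\mathsf{T}}}
              {\mathbf{a}_r \mathbf{R}_r \mathbf{a}_r^{\mathsf{T}}}
```

Under this definition a value of 0 means the two frames have identical LPC envelopes, and larger values indicate more spectral distortion.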

Automatic Vowel Onset Point Detection Based on Auditory Frequency Response (청각 주파수 응답에 기반한 자동 모음 개시 지점 탐지)

  • Zang, Xian;Kim, Hag-Tae;Chong, Kil-To
    • Journal of the Korea Academia-Industrial cooperation Society / v.13 no.1 / pp.333-342 / 2012
  • This paper presents a vowel onset point (VOP) detection method based on the human auditory system. The method maps the "perceptual" frequency scale, i.e. the Mel scale, onto the linear acoustic frequency scale, and then builds a bank of triangular Mel-weighted filters to simulate the band-pass filtering of the human ear. This nonlinear critical-band filter bank greatly reduces the data dimensionality and suppresses the effect of harmonics, making the formants more prominent in the nonlinearly spaced Mel spectrum. The sum of the Mel-spectrum peak energies is extracted as the feature for each frame, and the instant at which the energy starts rising sharply is detected as the VOP by convolving with a Gabor window. For a single-word database containing 12 vowels articulated with different kinds of consonants, the experiments showed a good average detection rate of 72.73%, higher than other vowel detection methods based on short-time energy and zero-crossing rate.
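
The Mel mapping and triangular filter bank described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the conversion formula 2595·log10(1 + f/700) is one common Mel-scale variant, and the function names, filter count, FFT size and sample rate are all assumptions.

```python
import math

def hz_to_mel(f_hz):
    # One common Mel formula (assumed; the paper may use another variant).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filter_bank(n_filters, n_fft, sample_rate):
    """Return (bank, edges_hz): triangular weights over n_fft//2 + 1 FFT bins."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, equally spaced on the Mel scale.
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bin_hz = sample_rate / n_fft
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        weights = []
        for k in range(n_bins):
            f = k * bin_hz
            if lo < f <= mid:
                weights.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                weights.append((hi - f) / (hi - mid))   # falling slope
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank, edges_hz

# Hypothetical settings: 20 filters, 512-point FFT, 16 kHz sampling.
bank, edges_hz = mel_filter_bank(20, 512, 16000)
```

Multiplying each row of `bank` with a frame's power spectrum and summing gives one Mel-band energy per filter, which is how the filter bank reduces dimensionality from 257 bins to 20 values in this example.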

Audio Watermarking Technique Based on Digital Filter (디지털 필터를 이용한 오디오 워터마킹 기술)

  • 신승원;김종원;최종욱
    • Proceedings of the Korea Institutes of Information Security and Cryptology Conference / 2001.11a / pp.464-468 / 2001
  • In this paper, we propose a robust watermarking technique that withstands time scaling, pitch shifting, added noise, and lossy compression such as MP3, AAC, and WMA. The technique is based on digital filtering. Because they are designed according to the critical bands of the HAS (human auditory system), the digital filters barely affect audio quality. Before digital filtering, a wavelet transform decomposes the audio signal into several signals, each composed of specific frequencies. The designed digital filters then scan the decomposed signals. Each designed filter, a band-stop filter, distorts and eliminates specific frequencies of the audio signal. Watermark detection is accomplished with the FFT (Fast Fourier Transform). First, segments of the audio signal are transformed by the FFT. Then the resulting amplitude spectra are summed repeatedly. Finally, the watermark detector identifies the filters used in watermark encoding from the eliminated frequencies. The suggested technique can embed 4 bits/s in a robust manner.

  • PDF
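
The detection steps in the abstract (segment, FFT, sum the amplitude spectra, look for eliminated frequencies) can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the host signal is a synthetic sum of tones, the "band-stop filter" is simulated by leaving one tone out, and the function names and candidate bins are hypothetical. It is not the authors' detector.

```python
import cmath
import math
import random

def amplitude_spectrum(segment):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short segments)."""
    n = len(segment)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(segment)))
            for k in range(n // 2 + 1)]

def summed_spectrum(signal, seg_len):
    """Sum the amplitude spectra of consecutive non-overlapping segments."""
    total = [0.0] * (seg_len // 2 + 1)
    for start in range(0, len(signal) - seg_len + 1, seg_len):
        spec = amplitude_spectrum(signal[start:start + seg_len])
        total = [a + b for a, b in zip(total, spec)]
    return total

def detect_notch(signal, seg_len, candidate_bins):
    """Return the candidate bin with the least accumulated energy."""
    total = summed_spectrum(signal, seg_len)
    return min(candidate_bins, key=lambda k: total[k])

def toy_signal(present_bins, seg_len, n_segments, seed=0):
    """Sum of tones at the given DFT bins plus weak Gaussian noise."""
    rng = random.Random(seed)
    phases = {k: rng.uniform(0.0, 2.0 * math.pi) for k in present_bins}
    return [sum(math.cos(2.0 * math.pi * k * t / seg_len + phases[k])
                for k in present_bins) + rng.gauss(0.0, 0.05)
            for t in range(seg_len * n_segments)]

candidates = [4, 8, 12, 16]
# Simulate a band-stop filter that removed the tone at bin 12.
marked = toy_signal([4, 8, 16], seg_len=64, n_segments=8)
detected = detect_notch(marked, 64, candidates)
```

Summing over several segments averages out the noise, so the eliminated bin stands out clearly against the bins that still carry signal energy; in this toy setup `detected` is the removed bin.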

Performance Improvement of Speech Recognizer in Noisy Environments Based on Auditory Modeling (청각 구조를 이용한 잡음 음성의 인식 성능 향상)

  • Jung, Ho-Young;Kim, Do-Yeong;Un, Chong-Kwan;Lee, Soo-Young
    • The Journal of the Acoustical Society of Korea / v.14 no.5 / pp.51-57 / 1995
  • In this paper, we study a noise-robust feature extraction method for speech signals based on auditory modeling. The auditory model consists of a basilar membrane model, a hair cell model, and a spectrum output stage. The basilar membrane model describes the membrane's response to vibrations in the speech wave and is represented as a band-pass filter bank. The hair cell model describes neural transduction driven by displacements of the basilar membrane. It responds adaptively to the relative values of the input and plays an important role in noise robustness. The spectrum output stage constructs a mean rate spectrum from the average firing rate of each channel, and feature vectors are extracted from this mean rate spectrum. Simulation results show that auditory-based feature extraction improves speech recognition performance in noisy environments compared with other feature extraction methods.

  • PDF

Performance Enhancement of SBC for Voice Signal Using Adaptive Postfiltering at the Medium Bit Rate (중간 전송율에서 적응 포스트 필터링을 이용한 음성용 SBC의 성능 향상)

  • 김원구;이남걸;윤대희;차일환
    • The Journal of Korean Institute of Communications and Information Sciences / v.17 no.2 / pp.121-131 / 1992
  • In this paper, three methods are studied to enhance the performance of SBC (Sub-Band Coding) schemes for voice signals at medium bit rates between 12 kbps and 16 kbps, and adaptive postfiltering using human auditory characteristics is done at the decoder output. First, a GQMF (Generalized Quadrature Mirror Filter) is used instead of a QMF (Quadrature Mirror Filter) for better performance. Second, adaptive bit allocation to each sub-band enhances speech quality and makes variable rate coding possible. Third, a comparison of coder performance using APCM (Adaptive Pulse Code Modulation) and ADPCM (Adaptive Differential Pulse Code Modulation) indicates that SB-APCM performs better than the other. Adaptive postfiltering at the decoder output enhances the quality of the coded speech. The two proposed postfiltering methods decrease the noise sufficiently at the cost of only a low computational load.

  • PDF
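
Adaptive bit allocation across sub-bands, mentioned as the second method above, is often done greedily. The sketch below shows that generic textbook approach under the usual assumption that each extra bit lowers quantization noise power by about a factor of 4 (6 dB). The abstract does not describe the authors' actual rule, so the function name, the per-band cap and the example variances are hypothetical.

```python
def allocate_bits(band_variances, total_bits, max_bits_per_band=8):
    """Greedy allocation: give each bit to the band with the most noise left."""
    bits = [0] * len(band_variances)
    for _ in range(total_bits):
        # Remaining noise power per band; capped bands are excluded with -1.
        noise = [v / (4 ** b) if b < max_bits_per_band else -1.0
                 for v, b in zip(band_variances, bits)]
        k = max(range(len(noise)), key=lambda i: noise[i])
        if noise[k] < 0:
            break  # every band has reached its cap
        bits[k] += 1
    return bits

# Hypothetical signal power in four sub-bands, strongest first.
example = allocate_bits([16.0, 4.0, 1.0, 0.25], total_bits=8)
```

Bands with more signal power receive more bits, and a weak band can receive none, which is what lets the bit budget follow the short-term spectrum of speech.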

Performance analysis of subjective Loudness meter with ITU-R BS. 1387-1 algorithm for digital audio (디지털 오디오 주관적 음향레벨 계측기 구현을 위한 ITU-R BS. 1387-1의 알고리즘 특성 분석)

  • Ngan, Nguyen Vo Bao;Park, Seonggyoon;Ro, Soonghwan;Han, Chankyu
    • Journal of IKEEE / v.16 no.4 / pp.395-404 / 2012
  • In this paper, a perceived-loudness metering algorithm based on ITU-R BS.1387-1 was investigated and implemented, and its performance was evaluated on 23 pure tones and 9 digital audio samples. The error of the tone test results relative to ISO 226:2003 was below 5%, and the sample test results, compared with Moore's algorithm, showed a deviation of less than 4.7% and a correlation of 0.96. The dependence of the implemented algorithm's performance on the auditory pitch scale was also investigated. The results showed that, after correcting a bias effect, the algorithm with 37 auditory filters performs well, deviating by less than 2% from the one with 109 auditory filters.

Improving a Sound Localization Using 1/3-octave Band Pass Filter (1/3-옥타브 대역통과필터를 이용한 음상정위기법 성능 향상)

  • Hwang, Shin;Yang, Jin-Woo;Cheung, Wan-Sup;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / v.20 no.3 / pp.98-103 / 2001
  • The human binaural auditory system can distinguish the direction and distance of sound sources. This ability is well characterised by the inter-aural intensity difference (IID), the inter-aural time difference (ITD) and/or the spectral shape difference (SSD) that arise in the acoustic transfer from a sound source to the outer ears. This paper proposes an effective way of extracting these three sound perception factors (IID, ITD, SSD) from the head-related transfer functions (HRTFs), which depend on the direction and distance of the acoustic source from the listener. It includes a method for estimating the equivalent ITD and the 1/3-octave band-based IID factors, and describes their use in locating a sound source in space. Subjective and objective tests were carried out to examine the effectiveness of the proposed methodology and its applicability to real sound systems. The experimental results are presented in this paper.

  • PDF
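
As a rough illustration of the quantities named above, the sketch below estimates an ITD by cross-correlating two ear signals, computes a broadband IID as an energy ratio in dB, and gives the edges of a 1/3-octave band. It does not reproduce the paper's HRTF-based estimation method; the function names, the test signal and the 16 kHz sample rate are assumptions. A per-band IID would apply the same energy ratio to signals filtered into each 1/3-octave band.

```python
import math
import random

def estimate_itd(left, right, sample_rate, max_lag):
    """Delay in seconds of `right` relative to `left` (positive: left leads)."""
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation at this lag over the overlapping samples.
        score = sum(left[i] * right[i + lag]
                    for i in range(n) if 0 <= i + lag < n)
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate

def iid_db(left, right):
    """Broadband inter-aural intensity difference in dB (left over right)."""
    e_left = sum(x * x for x in left)
    e_right = sum(x * x for x in right)
    return 10.0 * math.log10(e_left / e_right)

def third_octave_edges(center_hz):
    """Lower and upper edge of the 1/3-octave band around center_hz."""
    return center_hz * 2.0 ** (-1.0 / 6.0), center_hz * 2.0 ** (1.0 / 6.0)

# Toy source: noise that reaches the right ear 5 samples later at half amplitude.
rng = random.Random(1)
source = [rng.uniform(-1.0, 1.0) for _ in range(395)]
left = source + [0.0] * 5
right = [0.0] * 5 + [0.5 * x for x in source]

itd = estimate_itd(left, right, sample_rate=16000, max_lag=20)
iid = iid_db(left, right)
```

In this toy case the right-ear signal is a delayed, attenuated copy, so the correlation peak sits at the 5-sample delay and the IID is about 6 dB.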

A New Wideband Speech/Audio Coder Interoperable with ITU-T G.729/G.729E (ITU-T G.729/G.729E와 호환성을 갖는 광대역 음성/오디오 부호화기)

  • Kim, Kyung-Tae;Lee, Min-Ki;Youn, Dae-Hee
    • Journal of the Institute of Electronics Engineers of Korea SP / v.45 no.2 / pp.81-89 / 2008
  • Wideband speech, characterized by a bandwidth of about 7 kHz (50-7000 Hz), provides a substantial quality improvement in terms of naturalness and intelligibility. Although it requires higher data rates, its applications have extended to audio and video conferencing, high-quality multimedia communications over mobile links or packet-switched networks, and digital AM broadcasting. In this paper, we present a new bandwidth-scalable coder for wideband speech and audio signals. The proposed coder splits the 8 kHz signal bandwidth into two narrow bands and applies a different coding scheme to each band. The lower-band signal is coded using the ITU-T G.729/G.729E coder, and the higher-band signal is compressed using a new algorithm based on a gammatone filter bank with an invertible auditory model. Because of the split-band architecture and the completely independent coding schemes for each band, the decoder output can be selected to be narrowband or wideband according to the channel condition. Subjective tests showed that, for wideband speech and audio signals, the proposed coder at 14.2/18 kbit/s produces quality superior to ITU-T G.722.1 at 24 kbit/s, with a shorter algorithmic delay.
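
For readers unfamiliar with gammatone filters, the sketch below generates the impulse response of a single gammatone channel using the widely used textbook form t^(n-1)·exp(-2πbt)·cos(2πf_c t), with the bandwidth tied to the Glasberg and Moore ERB formula. The abstract gives no filter parameters, so the order, the 1.019 bandwidth factor, the 5 kHz center frequency and the function names are assumptions, and this is not the coder's invertible auditory model.

```python
import math

def erb_hz(center_hz):
    """Equivalent rectangular bandwidth (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * center_hz / 1000.0 + 1.0)

def gammatone_ir(center_hz, sample_rate, n_samples=400, order=4):
    """Peak-normalized impulse response of one gammatone channel."""
    b = 1.019 * erb_hz(center_hz)  # common bandwidth scaling (assumed)
    ir = []
    for i in range(n_samples):
        t = i / sample_rate
        envelope = t ** (order - 1) * math.exp(-2.0 * math.pi * b * t)
        ir.append(envelope * math.cos(2.0 * math.pi * center_hz * t))
    peak = max(abs(v) for v in ir)
    return [v / peak for v in ir]

# Hypothetical channel in the upper band (4-8 kHz) at 16 kHz sampling.
ir = gammatone_ir(5000.0, 16000)
```

A filter bank is obtained by repeating this for a set of center frequencies spaced across the band; convolving the input with each impulse response yields one channel signal per filter.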