• Title/Summary/Keyword: interaural level difference


A study on sound source segregation of frequency domain binaural model with reflection (반사음이 존재하는 양귀 모델의 음원분리에 관한 연구)

  • Lee, Chai-Bong
    • Journal of the Institute of Convergence Signal Processing / v.15 no.3 / pp.91-96 / 2014
  • Among sound source localization and separation methods, the Frequency Domain Binaural Model (FDBM) offers low computational cost and high separation performance. The method localizes and separates sound sources by obtaining the Interaural Phase Difference (IPD) and Interaural Level Difference (ILD) in the frequency domain, but reflections degrade it in practical environments. To reduce the effect of reflections, a method is presented that simulates the localization of the direct sound, detects the first-arriving sound, determines its direction, and then separates the source. Simulation results show that the estimated direction lies within 10% of the true source direction and that, in the presence of reflections, source separation improves over the existing FDBM, with higher coherence and PESQ (Perceptual Evaluation of Speech Quality) scores and lower directional damping. In the case of no reflection, the degree of separation was low.
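
A minimal sketch of the kind of frequency-domain cue extraction an FDBM-style method relies on, assuming a simple Hann-windowed STFT (the parameter values and function names below are illustrative, not taken from the paper):

```python
# Illustrative sketch only: per-bin IPD/ILD cues from a binaural signal
# via a short-time Fourier transform (parameters are assumptions).
import numpy as np

def interaural_cues(left, right, n_fft=1024, hop=512):
    """Return per-frame, per-bin IPD (radians) and ILD (dB).
    Assumes left/right are equal-length NumPy arrays, at least n_fft samples."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(left) - n_fft) // hop
    ipd = np.zeros((n_frames, n_fft // 2 + 1))
    ild = np.zeros_like(ipd)
    for t in range(n_frames):
        seg = slice(t * hop, t * hop + n_fft)
        L = np.fft.rfft(window * left[seg])
        R = np.fft.rfft(window * right[seg])
        ipd[t] = np.angle(L * np.conj(R))                      # phase difference per bin
        ild[t] = 20.0 * np.log10((np.abs(L) + 1e-12) / (np.abs(R) + 1e-12))
    return ipd, ild
```

Bins whose (IPD, ILD) pair is consistent with a candidate direction can then be grouped into a separation mask for that source, which is the general idea behind FDBM-style separation.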

Salience of Envelope Interaural Time Difference of High Frequency as Spatial Feature (공간감 인자로서의 고주파 대역 포락선 양이 시간차의 유효성)

  • Seo, Jeong-Hun;Chon, Sang-Bae;Sung, Koeng-Mo
    • The Journal of the Acoustical Society of Korea / v.29 no.6 / pp.381-387 / 2010
  • Both timbral and spatial features are important in the assessment of multichannel audio coding systems. Choi et al. proposed a prediction model extending ITU-R Rec. BS.1387-1 to multichannel audio coding systems using spatial features such as ITDDist (Interaural Time Difference Distortion), ILDDist (Interaural Level Difference Distortion), and IACCDist (Interaural Cross-correlation Coefficient Distortion). In that model, following classical duplex theory, ITDDists were computed only for low frequency bands (below 1500 Hz) and ILDDists only for high frequency bands (above 2500 Hz). However, in the high frequency range, information in the temporal envelope is also important for spatial perception, especially for sound localization. A new model that computes the ITD distortions of the temporal envelopes of high frequency components is introduced in this paper to investigate the role of such ITDs in spatial perception quantitatively. The computed envelope ITD distortions of the high frequency components were highly correlated with the perceived sound quality of multichannel audio.
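
As a rough illustration of the envelope-ITD idea (not the paper's distortion measure), one might band-pass a high-frequency region, take the Hilbert envelope of each ear signal, and read the envelope ITD from the peak of the envelopes' cross-correlation; the band edges and lag range below are assumptions:

```python
# Hedged sketch: envelope ITD of a high-frequency band from a binaural pair.
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def envelope_itd(left, right, fs, band=(2500.0, 4000.0), max_lag_ms=1.0):
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    env_l = np.abs(hilbert(sosfiltfilt(sos, left)))    # temporal envelope, left ear
    env_r = np.abs(hilbert(sosfiltfilt(sos, right)))   # temporal envelope, right ear
    env_l = env_l - env_l.mean()
    env_r = env_r - env_r.mean()
    max_lag = int(max_lag_ms * 1e-3 * fs)
    lags = np.arange(-max_lag, max_lag + 1)
    xcorr = [np.dot(env_l[max(0, -k):len(env_l) - max(0, k)],
                    env_r[max(0, k):len(env_r) - max(0, -k)]) for k in lags]
    # Positive result means the right-ear envelope lags the left-ear envelope.
    return lags[int(np.argmax(xcorr))] / fs            # envelope ITD in seconds
```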

Sound Source Localization Method Based on Deep Neural Network (깊은 신경망 기반 음원 추적 기법)

  • Park, Hee-Mun;Jung, Jong-Dae
    • Journal of IKEEE / v.23 no.4 / pp.1360-1365 / 2019
  • In this paper, we describe a sound source localization (SSL) system that can be applied to mobile robots and automatic control systems. SSL methods usually estimate the Interaural Time Difference and the Interaural Level Difference and apply the geometry of a microphone array. Here we propose a different approach, based on a deep neural network, to obtain the horizontal direction angle (azimuth) of the sound source. We pick up the source signals with two microphones attached symmetrically on both sides of the robot to imitate human ears, and we use the difference between the spectral distributions of the two microphone signals to train the network. The network is trained with data obtained at multiples of 10 degrees and tested with data obtained at random angles. The results show the promising validity of our approach.
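
A hypothetical sketch of the setup described above, with assumed feature and network sizes (not the authors' architecture): the log-magnitude spectral difference between the two microphones serves as the input vector, and a small feed-forward classifier predicts the azimuth class at multiples of 10 degrees.

```python
# Assumed illustration: spectral-difference features plus a small MLP
# classifier over azimuth classes (all sizes are placeholders).
import numpy as np
from sklearn.neural_network import MLPClassifier

def spectral_difference_feature(left, right, n_fft=512):
    L = np.abs(np.fft.rfft(left, n_fft)) + 1e-12
    R = np.abs(np.fft.rfft(right, n_fft)) + 1e-12
    return np.log(L) - np.log(R)   # per-bin inter-microphone level difference

def train_azimuth_net(features, azimuth_labels):
    # features: (n_examples, n_fft // 2 + 1); labels in {0, 10, ..., 350} degrees
    net = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=500)
    net.fit(features, azimuth_labels)
    return net
```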

A Study on the Loudness Model in Dichotic Conditions (다이코틱 조건에서의 라우드니스 모델에 관한 연구)

  • 차정호;이정권;신성환
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2003.05a / pp.617-621 / 2003
  • Existing loudness models are specified only for diotic sounds, even though normal listeners hear dichotic sounds. The arithmetic mean of the loudness values of the two ear signals has been suggested as an approximation of the resulting perceived loudness. In this study, the dependence of overall loudness perception on the interaural level difference was investigated through subjective tests. It was found that the larger the interaural level difference, the louder the sound was perceived relative to the mean of the loudness values calculated at both ears, and that the lower the critical band rate or the reference level, the louder the perception relative to that mean. A modified loudness model applicable to dichotic sounds was proposed, based on equivalent diotic levels.
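
For reference, the baseline rule mentioned above is simply the arithmetic mean of the two monaural loudness values; the sketch below shows that baseline together with a purely illustrative ILD-dependent correction in the direction of the paper's finding (the correction term and its coefficient are assumptions, not the proposed model):

```python
# Illustrative only: arithmetic-mean baseline for dichotic loudness, plus a
# placeholder ILD-dependent correction (NOT the paper's modified model).
def mean_loudness(loudness_left_sone, loudness_right_sone):
    """Baseline rule: overall loudness approximated by the mean of both ears."""
    return 0.5 * (loudness_left_sone + loudness_right_sone)

def corrected_loudness(loudness_left_sone, loudness_right_sone, ild_db, alpha=0.01):
    """Toy correction: a larger |ILD| pushes loudness above the mean (alpha is made up)."""
    return mean_loudness(loudness_left_sone, loudness_right_sone) * (1.0 + alpha * abs(ild_db))
```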


CASA-based Front-end Using Two-channel Speech for the Performance Improvement of Speech Recognition in Noisy Environments (잡음환경에서의 음성인식 성능 향상을 위한 이중채널 음성의 CASA 기반 전처리 방법)

  • Park, Ji-Hun;Yoon, Jae-Sam;Kim, Hong-Kook
    • Proceedings of the IEEK Conference / 2007.07a / pp.289-290 / 2007
  • To improve the performance of a speech recognition system in the presence of noise, we propose a noise-robust front-end that uses two-channel speech signals, separating speech from noise based on computational auditory scene analysis (CASA). The main cues for the separation are the interaural time difference (ITD) and interaural level difference (ILD) between the two channels. From the separated speech components, 39 cepstral coefficients are extracted. Speech recognition experiments show that the proposed front-end outperforms the ETSI front-end operating on single-channel speech.
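
A rough sketch of ITD/ILD-based time-frequency masking of the kind a two-channel CASA front-end uses, with assumed target cues and tolerances (an illustration, not the authors' front-end):

```python
# Hedged illustration: keep a time-frequency bin when its ILD and
# phase-derived ITD are close to the target source's expected cues.
import numpy as np

def cue_mask(L, R, freqs, target_itd_s=0.0, target_ild_db=0.0,
             itd_tol_s=2e-4, ild_tol_db=3.0):
    """L, R: complex STFT matrices (n_frames, n_bins); freqs: bin centers in Hz."""
    ild = 20.0 * np.log10((np.abs(L) + 1e-12) / (np.abs(R) + 1e-12))
    ipd = np.angle(L * np.conj(R))
    # Convert the per-bin phase difference to an ITD estimate
    # (only unambiguous at low frequencies, roughly below 1.5 kHz).
    itd = np.where(freqs > 0, ipd / (2.0 * np.pi * np.maximum(freqs, 1e-6)), 0.0)
    mask = (np.abs(ild - target_ild_db) < ild_tol_db) & \
           (np.abs(itd - target_itd_s) < itd_tol_s)
    return mask.astype(float)   # binary mask applied before resynthesis and MFCC extraction
```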


A study on the effect of leading sound and following sound on sound localization (선행음 및 후속음이 음원의 방향지각에 미치는 영향에 관한 연구)

  • Lee, Chai-Bong
    • Journal of the Institute of Convergence Signal Processing / v.16 no.2 / pp.40-43 / 2015
  • In this paper, the effects of leading and following sounds of a single frequency on sound localization are investigated. Sounds with different levels and ISIs (Inter-Stimulus Intervals) were used. The test sound is a 2 ms burst at 1 kHz, and the leading and following sounds are 10 ms long. The interaural arrival time difference at the subject's ears is set to 0.5 ms. For each ISI, four level differences are used: 0, -10, -15, and -20 dB. The leading sound is found to have a stronger effect on sound localization than the following sound, and its effect depends on the value of the ISI. When the ISI is small, different effects on sound localization are observed.
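
As an illustration of the binaural test stimulus described above, the sketch below generates a windowed 1 kHz tone burst presented with a 0.5 ms interaural time difference and an optional level offset (the sample rate and windowing are assumptions, not the exact experimental stimuli):

```python
# Illustrative stimulus generator: a tone burst with an interaural time
# difference and a level offset applied to one ear (parameters assumed).
import numpy as np

def binaural_burst(duration_ms, itd_ms=0.5, level_db=0.0, freq_hz=1000.0, fs=48000):
    n = int(duration_ms * 1e-3 * fs)
    t = np.arange(n) / fs
    burst = np.sin(2.0 * np.pi * freq_hz * t) * np.hanning(n)    # smoothed tone burst
    delay = int(round(itd_ms * 1e-3 * fs))
    left = np.concatenate([burst, np.zeros(delay)])              # left ear leads
    right = np.concatenate([np.zeros(delay), burst]) * 10 ** (level_db / 20.0)
    return left, right
```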

Automatic Directional-gain Control for Binaural Hearing Aids using Geomagnetic Sensors (지자기 센서를 이용한 양이 보청기의 방향성 이득 조절 연구)

  • Yang, Hyejin;An, Seonyoung;Jeong, Jaehyeon;Choi, Inyong;Woo, Jihwan
    • Journal of Biomedical Engineering Research / v.37 no.6 / pp.209-214 / 2016
  • Binaural hearing aids with a voice transmitter have been widely used to enhance sound quality in noisy environments. However, such systems limit sound-source localization. In this study, we investigated an automatic directional-gain control method that uses geomagnetic sensors to provide directional information to binaural hearing aid users. The loudness gains of the two hearing aids were controlled differently based on the direction of the speaker relative to the viewing direction of the hearing aid user. This relative direction was measured with two geomagnetic sensors, one on the hearing aid user and one on the speaker. The results showed that the loudness gains were controlled accurately and could provide directional information through the cue of interaural level differences.
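
A simplified sketch of the control idea, assuming the relative speaker direction has already been derived from the two geomagnetic sensor readings and assuming a sinusoidal gain law (the paper's actual mapping is not specified in the abstract):

```python
# Hedged illustration: map the speaker direction relative to the user's
# viewing direction to left/right gains, creating an ILD cue.
import numpy as np

def directional_gains(relative_angle_deg, max_ild_db=6.0):
    """relative_angle_deg: speaker direction w.r.t. the user's viewing direction,
    assumed available from the two geomagnetic sensors; positive means to the right.
    The 6 dB maximum ILD is an assumption made for this sketch."""
    ild_db = max_ild_db * np.sin(np.radians(relative_angle_deg))   # desired ILD cue
    return -0.5 * ild_db, +0.5 * ild_db                            # (left gain dB, right gain dB)
```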

Implementation of Transaural filter method for sound localization (공간 음상정위를 위한 Transaural 필터 구현기법)

  • Cheung Wan-Sup;Lee Jeung-Hoon;Bhang Seungbeum;Kim Soonhyob
    • Proceedings of the Acoustical Society of Korea Conference / autumn / pp.207-212 / 1999
  • This paper first introduces the problems involved in reproducing, through a left-right symmetric pair of loudspeakers, the sound that would reach both ears from a source located in space: namely, the cancellation of the cross-talk inherent in the sound transfer paths from the two loudspeakers to the two ears, and the selection of an acoustic model. A transaural filter model that can resolve these problems is presented and its acoustic characteristics are examined. We propose a new cross-talk cancellation method that uses the spatial-perception cues of the human auditory system, ILD (Interaural Level Difference) and ITD (Interaural Time Difference), together with an amplitude compensation method for the transaural filter based on the masking properties of hearing. The proposed technique not only minimizes timbre distortion and sound-quality degradation, but can also be applied directly by sound engineers in audio production.
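
For context, a minimal sketch of the textbook frequency-domain crosstalk canceller that transaural reproduction starts from, using a regularized inversion of the 2x2 loudspeaker-to-ear transfer matrix (this is the classical formulation, not the ILD/ITD-based method proposed in the paper):

```python
# Textbook crosstalk-canceller sketch: invert the 2x2 speaker-to-ear
# transfer matrix at each frequency bin, with regularization.
import numpy as np

def crosstalk_canceller(H_LL, H_LR, H_RL, H_RR, beta=1e-3):
    """H_xy[k]: transfer function from speaker y to ear x at frequency bin k.
    Returns C with shape (n_bins, 2, 2) such that H[k] @ C[k] is close to identity."""
    n_bins = len(H_LL)
    C = np.zeros((n_bins, 2, 2), dtype=complex)
    for k in range(n_bins):
        H = np.array([[H_LL[k], H_LR[k]],
                      [H_RL[k], H_RR[k]]])
        # Regularized inverse to avoid excessive gain at ill-conditioned bins.
        C[k] = np.linalg.inv(H.conj().T @ H + beta * np.eye(2)) @ H.conj().T
    return C
```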


The Implementation of Real-Time Speaker Localization Using Multi-Modality (멀티모달러티를 이용한 실시간 음원추적 시스템 구현)

  • Park, Jeong-Ok;Na, Seung-You;Kim, Jin-Young
    • Proceedings of the KIEE Conference / 2004.11c / pp.459-461 / 2004
  • This paper presents an implementation of real-time speaker localization using audio-visual information. Four channels of microphone signals are processed to detect vertical as well as horizontal speaker positions. First, short-time average magnitude difference function (AMDF) signals are used to determine whether the microphone signals are human voices. The orientation and distance of the sound source are then obtained from interaural time differences and interaural level differences. Finally, visual information from a camera provides finer tuning of the speaker orientation. Experimental results of the real-time localization system show that performance improves to 99.6%, compared with 88.8% when only audio information is used.
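
A brief sketch of a short-time AMDF and a simple dip-depth voicing test of the kind used for the voice/non-voice decision (the thresholds and search range are assumptions): voiced speech produces a deep dip in the AMDF at the pitch period.

```python
# Hedged sketch: short-time Average Magnitude Difference Function (AMDF)
# and a simple dip-depth test for detecting voiced (human-voice) frames.
import numpy as np

def amdf(frame, max_lag):
    # frame length should exceed max_lag samples
    n = len(frame)
    return np.array([np.mean(np.abs(frame[:n - k] - frame[k:]))
                     for k in range(1, max_lag + 1)])

def is_voiced(frame, fs, fmin=80.0, fmax=400.0, dip_ratio=0.4):
    max_lag = int(fs / fmin)
    d = amdf(frame, max_lag)
    lo = int(fs / fmax)                      # search only plausible pitch periods
    dip = d[lo:max_lag].min()
    return dip < dip_ratio * d.mean()        # deep dip -> periodic -> likely voice
```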


Deep Learning-Based Sound Localization Using Stereo Signals Based on Synchronized ILD

  • Hwang, Hyeon Tae;Yun, Deokgyu;Choi, Seung Ho
    • International Journal of Internet, Broadcasting and Communication / v.11 no.3 / pp.106-110 / 2019
  • The interaural level difference (ILD) used for sound localization with stereo signals is the difference in the energy with which the sound from a source reaches the two ears. The conventional ILD does not consider the time difference between the stereo signals, which lowers its accuracy. In this paper, we propose a synchronized ILD that is obtained after compensating for this time difference. The method uses the cross-correlation function (CCF) to estimate the time difference of arrival at the two ears and uses it to compute the synchronized ILD. To demonstrate the performance of the proposed method, we conducted two sound localization experiments in which either the synchronized ILD together with the CCF, or only the synchronized ILD, was given as input to a deep neural network (DNN). Performance is evaluated in terms of the mean error and accuracy of sound localization. Experimental results show that the proposed method outperforms the conventional methods.
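
A minimal sketch of the synchronized-ILD computation as described, with framing and parameter choices assumed: the inter-channel delay is taken from the peak of the cross-correlation function, the channels are aligned by that delay, and the ILD is computed on the aligned segments.

```python
# Sketch of synchronized ILD: estimate the inter-channel delay via the CCF,
# align the channels, then compute the level difference (details assumed).
import numpy as np

def synchronized_ild(left, right, max_lag):
    """left, right: equal-length NumPy arrays; max_lag: search range in samples."""
    lags = np.arange(-max_lag, max_lag + 1)
    ccf = [np.dot(left[max(0, -k):len(left) - max(0, k)],
                  right[max(0, k):len(right) - max(0, -k)]) for k in lags]
    k = int(lags[int(np.argmax(ccf))])        # estimated delay of right w.r.t. left
    # Align the two channels before measuring energy.
    if k >= 0:
        l, r = left[:len(left) - k], right[k:]
    else:
        l, r = left[-k:], right[:len(right) + k]
    return 10.0 * np.log10((np.sum(l ** 2) + 1e-12) / (np.sum(r ** 2) + 1e-12))
```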