• Title/Summary/Keyword: Sound Localization


Sound Source Localization using HRTF database

  • Hwang, Sung-Mok; Park, Young-Jin; Park, Youn-Sik
    • Institute of Control, Robotics and Systems: Conference Proceedings / 2005.06a / pp.751-755 / 2005
  • We propose a sound source localization method using the Head-Related Transfer Function (HRTF), to be implemented on a robot platform. In conventional localization methods, the location of a sound source is estimated from the time delays of wave fronts arriving at each microphone standing in an array formation in the free field. In the case of a human head this corresponds to the Interaural Time Delay (ITD), which is simply the time delay of incoming sound waves between the two ears. Although the ITD is an excellent cue for stimulating lateral perception on the horizontal plane, confusion often arises when tracking the sound location from the ITD alone, because each sound source and its mirror image about the interaural axis share the same ITD. On the other hand, HRTFs associated with a dummy-head microphone system, or with a robot platform carrying several microphones, contain not only the proper time delays but also the phase and magnitude distortions caused by diffraction and scattering from shading objects such as the head and body of the platform. As a result, a set of HRTFs for any given platform provides a substantial amount of information about the whereabouts of the source, once proper analysis is performed. In this study, we introduce new phase and magnitude criteria to be satisfied by the set of microphone output signals in order to find the sound source location according to an HRTF database obtained empirically in an anechoic chamber with the given platform. The suggested method is verified through an experiment in a household environment, and its performance is compared against the conventional method.

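
In rough outline, matching microphone outputs against an HRTF database might look like the sketch below; the database layout, the pure-delay entries in the test, and the equal weighting of magnitude and phase error are illustrative assumptions, not the paper's actual criteria.

```python
import numpy as np

def localize_with_hrtf_db(mic_l, mic_r, hrtf_db):
    """Pick the database direction whose left/right HRTF ratio best
    matches the observed inter-microphone spectral ratio.

    hrtf_db: dict mapping angle (degrees) -> (H_left, H_right) frequency
    responses on the same rfft grid.  (Hypothetical layout; the paper's
    phase/magnitude criteria are not public.)
    """
    L = np.fft.rfft(mic_l)
    R = np.fft.rfft(mic_r)
    ratio = L / (R + 1e-12)            # observed interaural transfer function
    best_angle, best_err = None, np.inf
    for angle, (Hl, Hr) in hrtf_db.items():
        model = Hl / (Hr + 1e-12)
        # combined log-magnitude and phase mismatch, equally weighted
        err = (np.mean(np.abs(np.log(np.abs(ratio) + 1e-12)
                              - np.log(np.abs(model) + 1e-12)))
               + np.mean(np.abs(np.angle(ratio * np.conj(model)))))
        if err < best_err:
            best_angle, best_err = angle, err
    return best_angle
```

A pure inter-channel delay matches exactly one database entry here, which resolves the front-back ambiguity only insofar as the database entries themselves differ in magnitude or phase.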

Implementation of Sound Source Localization Based on Audio-visual Information for Humanoid Robots (휴모노이드 로봇을 위한 시청각 정보 기반 음원 정위 시스템 구현)

  • Park, Jeong-Ok; Na, Seung-You; Kim, Jin-Young
    • Speech Sciences / v.11 no.4 / pp.29-42 / 2004
  • This paper presents an implementation of real-time speaker localization using audio-visual information. Four channels of microphone signals are processed to detect vertical as well as horizontal speaker positions. First, short-time average magnitude difference function (AMDF) signals are used to determine whether the microphone signals are human voices. Then the orientation and distance of the sound sources are obtained through the interaural time difference. Finally, visual information from a camera provides finer tuning of the angles to the speaker. Experimental results of the real-time localization system show that performance improves to 99.6%, compared with 88.8% when only audio information is used.

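
The AMDF-based voice check in the first stage can be sketched as follows; the dip-ratio threshold is an illustrative guess rather than the paper's value.

```python
import numpy as np

def amdf(frame, max_lag):
    """Short-time Average Magnitude Difference Function:
    d(k) = mean |x(n) - x(n+k)| for lags k = 1..max_lag."""
    n = len(frame)
    return np.array([np.mean(np.abs(frame[:n - k] - frame[k:]))
                     for k in range(1, max_lag + 1)])

def looks_voiced(frame, max_lag=200, dip_ratio=0.35):
    """Crude voiced/unvoiced test: voiced frames show a deep AMDF
    valley at the pitch period, while noise stays roughly flat.
    The dip_ratio threshold is an illustrative choice."""
    d = amdf(frame, max_lag)
    return d.min() < dip_ratio * d.mean()
```

A periodic frame drives the AMDF near zero at the pitch lag, so the minimum falls well below the mean; white noise keeps all lags near the same level.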

An efficient space dividing method for the two-dimensional sound source localization (2차원 상의 음원위치 추정을 위한 효율적인 영역분할방법)

  • Kim, Hwan-Yong; Choi, Hong-Sub
    • The Journal of the Acoustical Society of Korea / v.35 no.5 / pp.358-367 / 2016
  • SSL (Sound Source Localization) has been applied to applications such as man-machine interfaces, video conference systems, smart cars and so on. In the process of sound source localization, however, angle estimation error occurs mainly due to the non-linear characteristics of the inverse sine function. An approach was therefore proposed to decrease the effect of this non-linearity by dividing the space covered by the microphones into narrow regions. In this paper, we propose an optimal space dividing method according to the pattern of the microphone array. In addition, the sound source's 2-dimensional position is estimated in order to evaluate the performance of this dividing method. In the experiment, the GCC-PHAT (Generalized Cross Correlation PHAse Transform) method, known to be robust in noisy environments, is adopted, and a triangular pattern of 3 microphones and a rectangular pattern of 4 microphones are each tested with 100 speech data. The experimental results show that the triangular pattern cannot estimate the correct position due to its lower spatial resolution, while the performance of the rectangular pattern is dramatically improved, with a correct estimation rate of 67%.
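
The GCC-PHAT delay estimator adopted above can be sketched in a few lines (a generic textbook version, not this paper's implementation):

```python
import numpy as np

def gcc_phat(x, y, fs=1.0):
    """Estimate the delay of y relative to x (in samples / fs) using
    Generalized Cross Correlation with PHAse Transform weighting,
    which whitens the spectrum and is robust in noisy or
    reverberant conditions."""
    n = len(x) + len(y)                      # zero-pad to avoid wrap-around
    X = np.fft.rfft(x, n)
    Y = np.fft.rfft(y, n)
    R = np.conj(X) * Y
    R /= np.abs(R) + 1e-12                   # PHAT: keep phase only
    cc = np.fft.irfft(R, n)
    cc = np.concatenate((cc[-(n // 2):], cc[:n // 2 + 1]))  # lags -n/2..n/2
    return (np.argmax(np.abs(cc)) - n // 2) / fs
```

The estimated delay would then be mapped to an angle, where the inverse-sine non-linearity that motivates the space-dividing scheme appears.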

3-D Near Field Localization Using Linear Sensor Array in Multipath Environment with Inhomogeneous Sound Speed (비균일 음속 다중경로환경에서 선배열 센서를 이용한 근거리 표적의 3차원 위치추정 기법)

  • Lee, Su-Hyoung; Choi, Byung-Woong
    • The Journal of the Acoustical Society of Korea / v.25 no.4 / pp.184-190 / 2006
  • Recently, Lee et al. proposed an algorithm that utilizes signals from different paths, using a bottom-mounted simple linear array to estimate the 3-D location of an oceanic target. This algorithm assumes, however, that the sound velocity is constant over the depth of the sea. Consequently, serious performance loss appears in real ocean environments, where the sound speed varies. In this paper, we present a 3-D near-field localization algorithm for inhomogeneous sound speed. The proposed algorithm adopts a localization function that utilizes a ray propagation model for a multipath environment with a linear sound speed profile (SSP); it then searches for the instantaneous azimuth angle, range and depth from the localization cost function. Several simulations using a linear SSP, and a non-linear SSP similar to that of real oceans, demonstrate the performance of the proposed algorithm. The estimation errors in range and depth are decreased by 100 m and 50 m, respectively.
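
The ray propagation model for a linear sound speed profile can be illustrated with a simple Snell's-law depth-stepping integrator; this is a sketch of the underlying physics only, not the paper's localization cost function.

```python
import math

def trace_ray(c0, g, theta0, z0, dz, n_steps):
    """Step a downgoing ray through a linear sound-speed profile
    c(z) = c0 + g*z using Snell's law: cos(theta)/c(z) is constant
    along the ray.  theta0 is the initial grazing angle (radians,
    from horizontal).  Returns (range, depth) after n_steps of dz."""
    snell = math.cos(theta0) / (c0 + g * z0)
    x, z = 0.0, z0
    for _ in range(n_steps):
        c = c0 + g * z
        cos_t = snell * c
        if cos_t >= 1.0:            # ray turns around (vertexing)
            break
        sin_t = math.sqrt(1.0 - cos_t * cos_t)
        x += dz * cos_t / sin_t     # dx = dz / tan(theta)
        z += dz
    return x, z
```

With g = 0 the profile is homogeneous and the traced ray reduces to a straight line, which is the constant-velocity assumption the paper replaces.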

Improvement of virtual speaker localization characteristics using grouped HRTF (머리전달함수의 그룹화를 이용한 가상 스피커의 정위감 개선)

  • Seo, Bo-Kug; Cha, Hyung-Tai
    • Journal of the Korean Institute of Intelligent Systems / v.16 no.6 / pp.671-676 / 2006
  • Convolution of the original sound with an HRTF database is generally used for sound image localization when realizing a virtual speaker. However, localization can degrade through confusion between the up/down or front/back directions, because the non-individualized HRTF does not match each listener. In this paper, we study a virtual speaker using a new HRTF obtained by grouping the HRTFs around the virtual speaker, to improve localization in the up/down and front/back directions. For effective HRTF grouping, we determine the locations and number of HRTFs using an informal listening test. A performance test of the virtual speaker using the grouped HRTFs shows that the proposed method improves the front-back and up-down sound localization characteristics much more than the conventional methods.
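
Rendering a virtual speaker from grouped HRTFs amounts to convolving the source with a combined head-related impulse response (HRIR) pair; the plain mean below stands in for the paper's grouping rule and is only an assumption.

```python
import numpy as np

def render_virtual_speaker(mono, hrir_group):
    """Render a mono signal at a virtual speaker position by averaging
    the HRIR pairs grouped around that position and convolving.
    hrir_group: list of (hrir_left, hrir_right) array pairs.
    (The averaging rule is an illustrative stand-in for the paper's
    grouping; only the convolution step is standard practice.)"""
    hl = np.mean([pair[0] for pair in hrir_group], axis=0)
    hr = np.mean([pair[1] for pair in hrir_group], axis=0)
    return np.convolve(mono, hl), np.convolve(mono, hr)
```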

Considering Microphone Positions in Sound Source Localization Methods: in Robot Application (로봇 플랫폼에서 마이크로폰 위치를 고려한 음원의 방향 검지 방법)

  • Kwon, Byoung-Ho; Kim, Gyeong-Ho; Park, Young-Jin
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2007.05a / pp.1080-1084 / 2007
  • Many different methods for sound source localization have been developed. Most of them depend mainly on the time difference of arrival (TDOA) or on empirical or analytic head-related transfer functions (HRTFs). In a real implementation, since the direct path between a source and a sensor is interrupted by obstacles such as the head or body of the robot, the number of sensors as well as their positions has to be considered. Therefore, in this paper, we present methods that take the sensor position problem into account in order to localize a sound source with 4 microphones covering 3-D space. These are modified two-step TDOA methods. Our conclusion is that a different method has to be applied for each different microphone position on a real robot platform.

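
For reference, the second step of a basic two-step TDOA method maps an estimated delay to an angle via the inverse sine, under a free-field, far-field assumption that ignores the head shadowing the paper addresses:

```python
import math

def doa_from_tdoa(tau, mic_distance, c=343.0):
    """Far-field direction of arrival from a measured time delay:
    theta = asin(c * tau / d), returned in degrees from broadside.
    Near endfire (|c*tau/d| -> 1) the asin non-linearity makes a
    small delay error produce a large angle error."""
    s = max(-1.0, min(1.0, c * tau / mic_distance))
    return math.degrees(math.asin(s))
```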

Spatially Mapped GCC Function Analysis for Multiple Source and Source Localization Method (공간좌표로 사상된 GCC 함수의 다 음원에 대한 해석과 음원 위치 추정 방법)

  • Kwon, Byoung-Ho; Park, Young-Jin; Park, Youn-Sik
    • Journal of Institute of Control, Robotics and Systems / v.16 no.5 / pp.415-419 / 2010
  • A variety of methods for sound source localization have been developed and applied to applications such as noise detection systems, surveillance systems, teleconference systems, robot auditory systems and so on. In previous work, we proposed sound source localization using spatially mapped GCC functions based on TDOA for a robot auditory system. Its performance with respect to noise and estimation resolution was verified in real environmental experiments under the single-source assumption. However, since the multi-talker case is common in human-robot interaction, multiple source localization approaches are necessary. In this paper, the localization method proposed under the single-source assumption is modified to suit multiple source localization. When there are two correlated sources, the spatially mapped GCC function has three peaks: at the two real source locations and at an imaginary source location. If the two sources are uncorrelated, however, it has only two peaks, at the real source positions. Using these characteristics, we modify the proposed localization method for the multiple-source case. Experiments with human speech in a real environment are carried out to evaluate the performance of the proposed method for multiple source localization. In the experiments, the mean estimation error is about 1.4° and the rate of correct multiple source localization is about 62% on average.
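
Spatially mapping GCC functions can be sketched as follows: each candidate direction is scored by summing, over microphone pairs, the GCC value at the TDOA that direction would produce. The candidate grid and scoring details here are assumptions, not the paper's exact formulation.

```python
import numpy as np

def spatially_mapped_gcc(cc, fs, mic_pairs, candidate_dirs, c=343.0):
    """Score candidate source directions from per-pair GCC functions.
    cc: dict pair_index -> cross-correlation array (zero lag at center)
    mic_pairs: list of (pos_i, pos_j) microphone position 3-vectors
    candidate_dirs: list of unit direction vectors.
    Peaks of the returned spatial function indicate source directions;
    with two correlated sources an extra 'imaginary' peak can appear."""
    scores = []
    for u in candidate_dirs:
        s = 0.0
        for k, (pi, pj) in enumerate(mic_pairs):
            # far-field TDOA this direction would produce for pair k
            tau = np.dot(np.asarray(pj) - np.asarray(pi), u) / c
            lag = int(round(tau * fs)) + len(cc[k]) // 2
            if 0 <= lag < len(cc[k]):
                s += cc[k][lag]
        scores.append(s)
    return np.array(scores)
```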

Artificial Intelligence Computing Platform Design for Underwater Localization (수중 위치측정을 위한 인공지능 컴퓨팅 플랫폼 설계)

  • Moon, Ji-Youn; Lee, Young-Pil
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.17 no.1 / pp.119-124 / 2022
  • Successful underwater localization requires a large-scale, parallel computing environment that can be mounted on various underwater robots. Accordingly, we propose a design method for an artificial intelligence computing platform for underwater localization. The proposed platform consists of four hardware modules: the transponder and hydrophone modules transmit and receive sound waves, the FPGA module rapidly pre-processes the transmitted and received sound wave signals in parallel, and the Jetson module runs the artificial-intelligence-based algorithms. We performed a sound wave transmission/reception experiment for underwater localization at varying distances in an actual underwater environment, and the designed platform was verified as a result.

HRTF Enhancement Algorithm for Stereo Sound Systems (스테레오 시스템을 위한 머리전달함수의 개선)

  • Koo, Kyo-Sik; Cha, Hyung-Tai
    • The Journal of the Acoustical Society of Korea / v.27 no.4 / pp.207-214 / 2008
  • To create 3D sound, two approaches are generally used: two-channel or multichannel sound systems. Because of cost and space constraints, the two-channel system is preferred over multichannel. Using headphones or two speakers, the most typical way to create 3D sound effects is the head-related transfer function (HRTF) technique, where the HRTF contains the information on how sound travels from a source to the ears of the listener. But it raises a problem in localizing a sound source at certain places, called the cone of confusion. In this paper, we propose a new algorithm to reduce this confusion in sound image localization. HRTF grouping and psychoacoustic theory are used to boost the spectral cue using the spectral differences among directions. Informal listening tests show that the proposed method improves the front-back sound localization characteristics much more than conventional methods.
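
Boosting the spectral cue where directions differ most might be sketched as below; the fixed gain and top-fraction selection are illustrative stand-ins for the paper's psychoacoustic weighting.

```python
import numpy as np

def boost_spectral_cues(h_mag, other_mags, gain_db=6.0, top_frac=0.2):
    """Boost the frequency bins where this direction's HRTF magnitude
    differs most (in log magnitude) from the other directions',
    sharpening the spectral cue that disambiguates front/back.
    gain_db and top_frac are illustrative choices."""
    diff = np.mean([np.abs(np.log10(h_mag + 1e-12) - np.log10(m + 1e-12))
                    for m in other_mags], axis=0)
    k = max(1, int(top_frac * len(h_mag)))
    idx = np.argsort(diff)[-k:]          # most discriminative bins
    out = h_mag.copy()
    out[idx] *= 10.0 ** (gain_db / 20.0)
    return out
```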

Two Simultaneous Speakers Localization using harmonic structure (하모닉 구조를 이용한 두 명의 동시 발화 화자의 위치 추정)

  • Kim, Hyun-Kyung; Lim, Sung-Kil; Lee, Hyon-Soo
    • Proceedings of the KSPS conference / 2005.11a / pp.121-124 / 2005
  • In this paper, we propose a sound localization algorithm for two simultaneous speakers. Because speech is a wide-band signal, there are many frequency sub-bands in which the two speech sounds are mixed. However, in some sub-bands one speech sound is more dominant than the other; in such sub-bands the dominant speech is little interfered with by the other speech or by noise. In speech, the overtones of the fundamental frequency have large amplitudes; this is called the 'harmonic structure' of speech. Sub-bands in the harmonic structure are more likely to be dominant. Therefore, the proposed localization algorithm is based on the harmonic structure of each speaker. First, the sub-bands that belong to the harmonic structure of each speech signal are selected. Then the two speakers are localized using the selected sub-bands. Simulation results show that localization using the selected sub-bands is more efficient and precise than localization using all sub-bands.

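
Selecting sub-bands that belong to a speaker's harmonic structure can be sketched as a mask over FFT bins around multiples of that speaker's fundamental frequency; the band width is an illustrative choice, not the paper's value.

```python
import numpy as np

def harmonic_subbands(f0, n_fft, fs, width_hz=20.0):
    """Boolean mask over rfft bins keeping only bins within width_hz of
    a harmonic of f0.  In those sub-bands the speaker with fundamental
    f0 tends to dominate, so they are the ones used to localize that
    speaker.  (Sketch; width_hz is an illustrative choice.)"""
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    mask = np.zeros_like(freqs, dtype=bool)
    for h in range(1, int(freqs[-1] // f0) + 1):
        mask |= np.abs(freqs - h * f0) <= width_hz
    return mask
```

Localization would then run a per-pair delay estimate restricted to each speaker's masked bins, producing one direction estimate per speaker.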