1. Introduction
Acoustic source localization is a passive positioning method that can be applied to target detection. Time difference of arrival (TDOA) is the method most commonly used for acoustic source localization; it estimates the source position from information about the array geometry [1, 2]. Because the TDOA at the sensors is a function of the source direction, the TDOA at the peaks of the cross-correlation function can be used to identify the direction of arrival of the source. Although the algorithm is easy to implement, it can only deal with a single sound source at a given time [3].
The high-resolution subspace techniques Multiple Signal Classification (MUSIC) [4-6] and Estimation of Signal Parameters via Rotational Invariance Techniques (ESPRIT) [7-9] have been widely used for direction of arrival (DOA) estimation over the past few decades, and they can resolve the multiple-source localization problem [10]. Conventionally, various linear [11], circular [12] and spherical sensor arrays [13-15] are used for DOA estimation under free-field propagation conditions between the source and sensors. In particular, benefiting from their three-dimensional symmetry, spherical microphone arrays offer an ideal tool for capturing and analyzing three-dimensional sound fields. To obtain better performance, spherical harmonic analysis, which accounts for the physical characteristics of wave propagation in air, is applied to the MUSIC and ESPRIT algorithms based on spherical microphone arrays [16]. The sound field is decomposed into spherical harmonic components by sampling the field using the spherical harmonic analysis technique [17, 18]. One of the main advantages of working in the spherical harmonic domain is that the frequency-dependent components are decoupled from the angular-dependent components, which provides a new perspective on related array processing problems such as source localization and detection. Moreover, owing to efficient algorithms, spherical harmonic analysis is practical to implement [19]. Teutsch and Kellermann [3, 20] proposed the "eigenbeam" (EB)-ESPRIT localization method for multiple wideband acoustic sources, where the eigenbeams are in fact the spherical harmonics. In [21], the spherical harmonic based MUSIC algorithm (SH-MUSIC), in which the spherical harmonic transformation is applied before the MUSIC algorithm, is used for the estimation.
The aim of our study is to improve the performance of the SH-MUSIC algorithm by using a new microphone array. Note that when spherical harmonic analysis techniques are used, the microphones of a traditional array are arranged regularly, as in a circular or spherical array [18, 22-24]. More importantly, only a single array was used in all former related studies [25-28]. In our design, the microphones on each sphere are distributed regularly, but a set of spheres is placed on a plane, aiming at higher accuracy and efficiency of estimation. In this multiple-sphere structure, each sphere has its own coordinate system, and a global coordinate system is built up to take into account the phase differences among microphones on different spheres. These multiple spherical arrays (MSA) not only accommodate arbitrarily placed spheres but also reduce the number of microphones needed on each sphere, which is restricted by the finite spherical region when a single spherical array is used. In addition, an improved SH-MUSIC algorithm for multiple spherical arrays is discussed. Simulation results based on the novel structure verify our claims.
The remainder of this paper is organized as follows. In Section 2, we introduce the signal model and the spherical harmonic decomposition of the sound field. Section 3 briefly discusses the simplified localization algorithm using spherical harmonic analysis. The localization method using multiple spherical arrays is discussed in Section 4, and simulation examples are presented in Section 5. Finally, conclusions are drawn in Section 6.
2. Signal model and Problem Formulation
2.1 Sound field
Assuming that all waves considered in this paper are plane waves, the model of two plane waves impinging on a spherical aperture from two far-field sources S1 and S2 is shown in Fig. 1.
Fig. 1. Geometry of two plane waves impinging on a spherical aperture.
Suppose two plane waves with unit magnitude impinge on a sphere from directions (θ1 , ϕ1) and (θ2 , ϕ2); the incident field P1 (r, ϑ, φ) at an observation point on the sphere surface can then be expressed as [16, 29]:
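Written out, this two-wave expansion takes the following form (a reconstruction consistent with the spherical harmonic conventions of [16, 29]; normalization conventions may differ slightly from the original):

$$ P_1(r,\vartheta,\varphi) = \sum_{s=1}^{2} w_s \sum_{n=0}^{\infty}\sum_{m=-n}^{n} b_n(k_s r)\,\big[Y_n^m(\theta_s,\phi_s)\big]^{*}\, Y_n^m(\vartheta,\varphi) \tag{1} $$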
where k1 and k2 are vectors pointing to the directions (θ1 , ϕ1) and (θ2 , ϕ2) of the 1st and 2nd sources, respectively, with |k1| = k1 and |k2| = k2 , where k1 and k2 are the wavenumbers of the 1st and 2nd sources; r is a vector describing the location of the observation point, and w1 and w2 are the weights of the sources. The mode strength bn (kr) is [30]:
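A common expression for the mode strength, reconstructed from [30] (the open-sphere case is the one used in this paper), is:

$$ b_n(kr) = \begin{cases} 4\pi i^n\, j_n(kr), & \text{open sphere},\\[4pt] 4\pi i^n \left( j_n(kr) - \dfrac{j_n'(ka)}{h_n'(ka)}\, h_n(kr) \right), & \text{rigid sphere of radius } a, \end{cases} $$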
where jn (kr) is the spherical Bessel function and hn (kr) is the spherical Hankel function. In this paper, we only consider open spheres. '*' denotes complex conjugation, and Ynm is the spherical harmonic of order n and degree m :
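In terms of the associated Legendre functions, the spherical harmonic reads (standard orthonormal form, consistent with the |m| convention used below):

$$ Y_n^m(\vartheta,\varphi) = \sqrt{\frac{2n+1}{4\pi}\,\frac{(n-|m|)!}{(n+|m|)!}}\; P_n^{|m|}(\cos\vartheta)\, e^{im\varphi} $$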
where Pn|m| is the associated Legendre function. In practical applications, the number of terms in (1) cannot be infinite, so the sum is truncated at a finite order N, which is mainly determined by the size of the sphere [31].
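As a concrete check of this definition, the following sketch (Python with NumPy/SciPy; our own illustration rather than code from the paper) evaluates Y_n^m directly from the formula above. Note that SciPy's `lpmv` includes the Condon-Shortley phase, a convention detail the paper does not specify.

```python
import numpy as np
from math import factorial
from scipy.special import lpmv  # associated Legendre P_n^m (Condon-Shortley phase)

def sph_harm_nm(n, m, theta, phi):
    """Spherical harmonic Y_n^m of order n and degree m, following the
    paper's |m| convention: theta = elevation in [0, pi], phi = azimuth."""
    am = abs(m)
    norm = np.sqrt((2 * n + 1) / (4 * np.pi)
                   * factorial(n - am) / factorial(n + am))
    return norm * lpmv(am, n, np.cos(theta)) * np.exp(1j * m * phi)
```

The orthonormality of these functions over the unit sphere, which the decomposition in Section 2.2 relies on, can be verified numerically by quadrature.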
For S sources, (1) can be rewritten as:
where ws and ks are the weight and direction vector of the sth source.
2.2 Sound field decomposition
Spherical harmonics provide an orthonormal decomposition of the sound-field pressure, and the core of the spherical harmonic decomposition is the forward and inverse spherical Fourier transform. In spherical coordinates, let f (ϑ, φ) be a function that is square integrable on the unit sphere; the spherical Fourier transform of f , denoted by the Fourier coefficient fnm , is given by [32]:
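The transform pair referred to as (5) and (6) takes the standard form (reconstructed from [32]):

$$ f_{nm} = \int_{0}^{2\pi}\!\!\int_{0}^{\pi} f(\vartheta,\varphi)\,\big[Y_n^m(\vartheta,\varphi)\big]^{*}\,\sin\vartheta\,\mathrm{d}\vartheta\,\mathrm{d}\varphi \tag{5} $$

$$ f(\vartheta,\varphi) = \sum_{n=0}^{\infty}\sum_{m=-n}^{n} f_{nm}\, Y_n^m(\vartheta,\varphi) \tag{6} $$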
where (5) and (6) represent the forward and inverse spherical Fourier transforms, respectively, and Ynm is the spherical harmonic of order n and degree m. We define a decomposition coefficient anm (k), which is expressed as:
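For the plane-wave field of (4), this coefficient takes the form (a reconstruction consistent with (1) and the definitions above):

$$ a_{nm}(k) = \sum_{s=1}^{S} w_s\, \big[Y_n^m(\theta_s,\phi_s)\big]^{*} \tag{7} $$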
which shows that the coefficient is independent of the radial and angular coordinates of the observation points. Combining (1) and (7), the sound field at the point Pl (kr, ϑ, φ) can also be expressed in terms of a spherical harmonic expansion as:
According to the spherical Fourier transform introduced above, (8) can be transformed to:
In practical applications, the spherical integration cannot be evaluated exactly, so Pl (kr, ϑ, φ) is measured by a spherical microphone array. Considering a sphere with T × H uniformly distributed microphones, (9) can be rewritten as:
where αt is a quadrature coefficient that ensures the accuracy of the approximation of the integral by the summation; αt can be expressed in the form:
where Δϑ = π / T , Δφ = 2 π / H .
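A numerical sketch of this sampled decomposition (our own Python illustration; the grid size, the open-sphere mode strength bn(kr) = 4π iⁿ jn(kr), and the weight αt ≈ sin ϑt are assumptions, since the paper's exact expression for αt is not reproduced here):

```python
import numpy as np
from math import factorial
from scipy.special import lpmv, spherical_jn

def Ynm(n, m, theta, phi):
    """Spherical harmonic (|m| convention); theta = elevation, phi = azimuth."""
    am = abs(m)
    norm = np.sqrt((2 * n + 1) / (4 * np.pi)
                   * factorial(n - am) / factorial(n + am))
    return norm * lpmv(am, n, np.cos(theta)) * np.exp(1j * m * phi)

def bn(n, kr):
    # assumed open-sphere mode strength: b_n(kr) = 4*pi*i^n*j_n(kr)
    return 4 * np.pi * (1j ** n) * spherical_jn(n, kr)

N, kr = 3, 1.75                       # truncation order; kr as in Section 5
T, H = 180, 72                        # dense equal-angle grid (illustration only)
dth, dph = np.pi / T, 2 * np.pi / H
th = (np.arange(T) + 0.5) * dth
ph = np.arange(H) * dph
TH, PH = np.meshgrid(th, ph, indexing="ij")

# synthesize the order-N field of one unit plane wave from (theta_s, phi_s)
ths, phs = np.deg2rad(120.0), np.deg2rad(60.0)
p = np.zeros_like(TH, dtype=complex)
for n in range(N + 1):
    for m in range(-n, n + 1):
        p += bn(n, kr) * np.conj(Ynm(n, m, ths, phs)) * Ynm(n, m, TH, PH)

# sampled spherical Fourier transform, cf. eq. (10), with alpha_t = sin(theta_t)
def a_nm(n, m):
    p_nm = np.sum(np.sin(TH) * p * np.conj(Ynm(n, m, TH, PH))) * dth * dph
    return p_nm / bn(n, kr)           # remove the mode strength, cf. eq. (7)
```

For this bandlimited field, the recovered coefficient matches the source-direction term of (7), i.e. a_nm ≈ [Y_n^m(θs, φs)]*.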
3. Multiple Sources Localization Algorithm
Considering S sources with directions (θs, ϕs) (s = 1, 2,⋯, S) , the decomposition coefficients of multiple sources can be obtained based on (4) and (10):
Supposing M observation points (ϑt, φt) (t = 1, 2,⋯, M) on the sphere, we can express the decomposition coefficients in matrix form: pnm = B × C × W , where
pnm is a ( N + 1 )2 × 1 vector,
and C is an M × S matrix:
and B = (ΔϑΔφ) × Y , where Y is a (N + 1)2 × M matrix:
Assuming A = B × C , (13) can be simplified to
where A is the (N + 1)2 × S steering matrix and W is the received signal vector. Now we can use pnm for DOA estimation with the MUSIC algorithm through the following steps. First, we obtain the correlation matrix:
In practical applications, suppose there are K snapshots; then Rnm can be estimated as:
Taking the eigenvalue decomposition of the estimated correlation matrix:
where Ea and En are the signal subspace and noise subspace, respectively. We then obtain the spectrum equation with respect to the direction of the source:
where a is the steering vector, i.e., the column of A for a given direction. By scanning over (θ , ϕ) , the peaks of the spectrum correspond to the directions of the signals.
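The scan described here is ordinary MUSIC applied to the coefficient vector pnm. The following sketch (our own generic illustration; the stand-in steering vector is hypothetical, whereas in the paper a would be the corresponding column of A) shows the sample correlation, eigendecomposition, noise subspace, and spectrum search for a one-dimensional scan:

```python
import numpy as np

rng = np.random.default_rng(0)
D, S, K = 16, 2, 200               # (N+1)^2 with N = 3, sources, snapshots

def steer(ang):
    # hypothetical stand-in for a column of the steering matrix A
    return np.exp(1j * np.arange(D) * ang)

angles_true = np.array([0.7, 1.9])
A = np.stack([steer(a) for a in angles_true], axis=1)           # D x S
W = rng.standard_normal((S, K)) + 1j * rng.standard_normal((S, K))
noise = 0.1 * (rng.standard_normal((D, K)) + 1j * rng.standard_normal((D, K)))
P = A @ W + noise                  # snapshots of the coefficient vector p_nm

R = P @ P.conj().T / K             # sample correlation matrix, cf. eq. (16)
vals, vecs = np.linalg.eigh(R)     # eigendecomposition (ascending eigenvalues)
En = vecs[:, : D - S]              # noise subspace: the D - S smallest eigenvalues

grid = np.linspace(0.0, np.pi, 720)
spec = np.array([1.0 / np.linalg.norm(En.conj().T @ steer(g)) ** 2
                 for g in grid])   # MUSIC pseudo-spectrum, cf. eq. (18)

# pick the S highest local maxima as the direction estimates
loc = np.where((spec[1:-1] > spec[:-2]) & (spec[1:-1] > spec[2:]))[0] + 1
est = np.sort(grid[loc[np.argsort(spec[loc])[-S:]]])
```

In the paper's setting the same steps run on a two-dimensional (θ, ϕ) grid with spherical-harmonic steering vectors.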
4. Multiple Spherical Arrays
Given spherical harmonics up to order N , the maximum number of sources that a single spherical array can deal with is N − 1 . The more sources there are, the more microphones the array needs. However, there are geometric limitations for a single spherical array. In this paper, a multiple spherical array structure is proposed to estimate the directions of multiple sources.
Assume L irregularly placed spheres of radius r with the same uniform distribution of microphones on each of them, whose centres lie on a circle of radius R in a plane, as shown in Fig. 2. Suppose there are M = T × H microphones on each sphere.
Fig. 2. Geometry of three spherical arrays
We can rewrite (10) and (4) in the following forms respectively:
where P(l) (ϑt, φt) is the measured pressure on the lth sphere at the point (ϑt, φt), ws is the received signal of the sth source, k = 2πf / c with frequency f and speed of sound c , and ks is the direction vector of the sth source. rol is the vector from the origin of the global coordinate system to the centre of the lth sphere, while rt is the vector from the centre of the lth sphere to the tth microphone on that sphere (|rol| = R, |rt| = r) . Substituting (21) into (20), we get:
It can also be expressed as:
where the left-hand side collects the decomposition coefficients of the multiple spherical array, B and W have the same structure as for the single spherical array, and C(l) is an M × S matrix given by
Denoting the stacked L ( N + 1 )2 × S matrix in (23) by AL :
where AL is the steering matrix of the multiple spherical array, (23) can be simplified to:
Then, following (15)-(19), we can find the directions of the sources using the multiple spherical arrays.
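The key difference from the single-array case is the inter-sphere phase term in C(l): each sphere centre rol contributes a factor exp(j ks · rol) for the sth source (the sign convention here is an assumption). A small sketch of that geometry:

```python
import numpy as np

c, f = 343.0, 1000.0               # speed of sound (m/s) and frequency (Hz)
k = 2 * np.pi * f / c              # wavenumber, as in eq. (21)
R, L = 0.5, 3                      # circle radius and sphere count (Fig. 2)

ang = 2 * np.pi * np.arange(L) / L
# sphere centres r_ol on a circle of radius R in the z = 0 plane
r_ol = R * np.stack([np.cos(ang), np.sin(ang), np.zeros(L)], axis=1)

def unit_dir(theta, phi):
    """Unit propagation vector for a source at elevation theta, azimuth phi."""
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

def sphere_phases(theta, phi):
    """Per-sphere phase factors exp(j k_s . r_ol) entering C^(l)."""
    return np.exp(1j * (r_ol @ (k * unit_dir(theta, phi))))
```

A source at ϑ = 0 (broadside to the array plane) reaches every sphere centre in phase, whereas in-plane sources produce distinct phase offsets across spheres; these offsets are what make the MSA steering matrix direction-dependent from sphere to sphere.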
5. Simulations
In this section, simulations of a single spherical array (SSA) and multiple spherical arrays (MSA) are provided to illustrate the efficiency of our methods. For all of the following simulations, performance is compared in terms of the root mean square error (RMSE), averaged over the sources.
5.1 Single and multiple arrays
In this subsection, two incoherent sound signals (S1, S2) impinge from the approximate directions (θ, ϕ) = (120° , 120°) and (270° , 60°) , respectively, with frequencies ranging from 0.8 kHz to 1 kHz. We assume kr = 1.75 , and the signal-to-noise ratio (SNR) is 0 dB for both signals.
Twenty microphones are placed on a sphere of radius 0.1 m: six microphones are distributed on each latitude, and the remaining two microphones are placed at the two poles, as shown in Fig. 3.
Fig. 3. Example of the single spherical array structure
For the multiple spherical array model, three spheres are placed on a circle of radius 0.5 m in a bounded plane as in Fig. 2, and twenty microphones are distributed on each sphere with the same arrangement as for the single spherical array. With the pitch angles ϕ of the two sources fixed at 120° and 60° , scanning over their azimuth angle θ yields the spectra in those directions. Fig. 4 shows the DOA estimation results for an open sphere using single and multiple arrays (in Fig. 4, MUSIC denotes the single spherical array with the MUSIC algorithm, SH-MUSIC the single spherical array with the SH-MUSIC algorithm, and MSH-MUSIC the multiple spherical arrays with the SH-MUSIC algorithm). The sidelobes in the third simulation are lower than those of the first two, so the source directions can be distinguished more accurately. The performances of the single spherical array and the multiple spherical arrays using SH-MUSIC are shown in Fig. 5 and Fig. 6, respectively. The spectrum of the multiple arrays has sharper peaks, which improves the estimation accuracy.
Fig. 4. DOA estimation using different algorithms and structures
Fig. 5. 3-D spectrum of DOA estimation using a single spherical array
Fig. 6. 3-D spectrum of DOA estimation using multiple spherical arrays
The SNR of the two sources is then varied from -10 dB to 10 dB simultaneously. For each SNR, 200 simulation runs are performed to obtain the RMSE and the estimation probability. As Fig. 7 shows, the RMSE decreases as the SNR increases, and the multiple arrays have a lower RMSE than the single array at the same SNR, which leads to better estimation performance in noisy environments.
Fig. 7. RMSE of the three algorithms under different SNR values
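The Monte-Carlo protocol used for these RMSE curves can be sketched as follows (our own scaffolding; `estimate_doa` is a hypothetical placeholder for the SH-MUSIC estimator, with an error level that merely mimics the SNR trend):

```python
import numpy as np

rng = np.random.default_rng(1)
true_deg = np.array([120.0, 270.0])          # true azimuths of S1 and S2

def estimate_doa(snr_db):
    """Hypothetical placeholder for the SH-MUSIC estimator: returns the
    truth plus an error whose spread shrinks as the SNR grows."""
    sigma = 5.0 * 10 ** (-snr_db / 20.0)
    return true_deg + sigma * rng.standard_normal(true_deg.shape)

runs = 200                                   # Monte-Carlo runs per SNR point
rmse = {}
for snr in range(-10, 11, 5):
    err2 = [(estimate_doa(snr) - true_deg) ** 2 for _ in range(runs)]
    rmse[snr] = float(np.sqrt(np.mean(err2)))  # RMSE averaged over the sources
```

Plotting `rmse` against SNR reproduces the qualitative trend of Fig. 7: the error falls monotonically as the SNR rises.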
5.2 Adjacent Sources
The lower sidelobes of the multiple spherical array model make it much more effective at locating two adjacent sources than the single spherical array model, as Fig. 8 shows. In this subsection, the single spherical array (SSA) model and the multiple spherical array (MSA) model are the same as in subsection 5.1. The true azimuth angles of the two sources are 270° and 279° , respectively. The single spherical array model cannot distinguish the two sources, while the multiple spherical array model locates them accurately. In another multiple-source case, the DOA estimation of five sound sources with directions (126° , 60°) , (132° , 66°) , (114° , 54°) , (138° , 48°) and (108° , 72°) is shown in Fig. 9.
Fig. 8. Spectra of two adjacent sources using the two structures
Fig. 9. DOA estimation of five adjacent sources
5.3 Various numbers of arrays
All of the previous comparisons between the single and multiple arrays were based on identical microphone distributions on each sphere, so the total numbers of microphones in the two structures were not equal. In this part, four structures are designed with an equal total number of microphones, 180 in every structure. Each structure consists of a different number of spheres of radius 0.1 m. The distribution of microphones on each sphere varies from structure to structure, but is identical within the same structure. Table 1 shows the microphone distributions of the four structures used in this subsection. The distribution of microphones on each latitude is referred to as an equal-angle distribution. We assume six sound sources with a frequency range of 0.5 kHz to 1 kHz and directions (90° , 30°) , (270° , 60°) , (60° , 150°) , (180° , 90°) , (240° , 45°) and (300° , 120°) , respectively, with kr = 1.67 and an SNR of 0 dB for all six signals. Fig. 10 shows the estimation results for all four structures, and their performance under different SNR values is shown in Fig. 11. The single spherical array can only locate the sources vaguely, while the multiple arrays obtain their locations accurately even though the total numbers of microphones are equal. The estimation error of the single array is so large that the SNR hardly affects its RMSE, whereas the performance of the multiple arrays improves with the SNR: as the SNR increases, the RMSE decreases. Although structure 4 has the same number of microphones as the other structures, its four spheres are separated farther from each other, which provides higher resolution than fewer spherical arrays due to spatial diversity. As shown in (1), the value of kr affects the response of the Bessel function.
When the bandwidth of the signal is fixed, increasing R moves the modal response of the Bessel function farther from its zeros, which gives higher resolution. Teutsch's book [33] gives a detailed explanation of this problem.
Table 1. Microphone distributions of the four structures. a Distribution indicates the distribution of microphones on each sphere. b For a × b, a depicts the five latitudes on the sphere while b indicates the number of microphones on each latitude.
Fig. 10. DOA estimation of the four structures
Fig. 11. RMSE of the four structures under different SNR values
5.4 Various numbers of sources
For the simulations in this subsection, structure 4 described in subsection 5.3 is used. The frequencies of the sources range from 0.5 kHz to 1 kHz, with kr = 1.67 and SNR = 0 dB. Table 2 gives the directions of the sources.
Table 2. Directions of the sources. a 1 degree = π / 180 radians
Fig. 12 depicts the estimation performance with 5, 10 and 15 sources. The structure distinguishes the five sources accurately; when the number of sources increases to 10, the performance degrades, but the structure can still determine their locations with some error. However, when 15 sources are applied, several spurious peaks appear, and the estimation error is so large that the RMSE barely varies with the SNR, as Fig. 13 indicates.
Fig. 12. DOA estimation of multiple sources; the source positions are given in Table 2
Fig. 13. RMSE for multiple sources under different SNR values
5.5 Dependency of error on direction of source
For the multiple spherical arrays, all of the spheres are placed in a plane, and this distribution may affect the estimation of sources that lie within that plane. To determine the dependency of the error on the pitch direction of the sources, we designed two further multiple-array simulations. Structure 4 of subsection 5.3 is used, and the directions of the sources are (30° , 90°) , (90° , 90°) , (150° , 90°) , (180° , 90°) , (240° , 90°) and (300° , 90°) . The estimation results are shown in Fig. 14. In the other case, the azimuth angle of the source is fixed at 60° and its pitch angle is varied from 0° to 180° in steps of 10° . Fig. 15 shows the RMSE of the estimation under different SNR values. When ϕ = 90° , the RMSE reaches its peak, indicating that the performance degrades when sources are placed near the plane in which the multiple arrays are distributed.
Fig. 14. DOA estimation of multiple sources
Fig. 15. Dependency of the error under different SNR values
6. Conclusion
A multiple spherical array structure is developed for estimating the directions of multiple sources based on a spherical harmonic decomposition of the sound field. The structure has the following characteristics: (1) multiple spheres are placed on a circle in a plane; (2) on each sphere, the microphones are distributed uniformly as in the usual single spherical array. Simulation results show that the multiple spherical arrays can handle multiple sources more accurately while using fewer microphones on each sphere, whereas a single spherical array with the same microphone arrangement cannot provide accurate results unless the number of microphones is increased. In addition, the multiple spherical arrays can distinguish adjacent sources with high accuracy. The simulations verify the new structure's efficiency.
References
- C. Knapp and G. Carter, "The generalized correlation method for estimation of time delay," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 24, pp. 320-327, Aug. 1976. https://doi.org/10.1109/TASSP.1976.1162830
- J. Benesty, “Adaptive eigenvalue decomposition algorithm for passive acoustic source localization,” Journal of the Acoustical Society of America, vol. 107, pp. 384-391, Jan. 2000. https://doi.org/10.1121/1.428310
- H. Teutsch and W. Kellermann, "EB-ESPRIT: 2D localization of multiple wideband acoustic sources using eigen-beams," in ICASSP IEEE International Conference on Acoustics, Speech and Signal Processing, Philadelphia, PA, United States, Mar. 2005, pp. 89-92.
- R. O. Schmidt, "Multiple emitter location and signal parameter estimation," IEEE Transactions on Antennas and Propagation, vol. 34, no. 3, pp. 276-280, Mar. 1986.
- A. R. Leyman and T. S. Durrani, "Signal subspace technique for DOA estimation using higher order statistics," in ICASSP IEEE International Conference on Acoustics, Speech and Signal Processing, Detroit, MI, USA, May 1995, pp. 1956-1959.
- M. L. McCloud and L. L. Scharf, "A new subspace identification algorithm for high-resolution DOA estimation," IEEE Transactions on Antennas and Propagation, vol. 50, pp. 1382-1390, Oct. 2002. https://doi.org/10.1109/TAP.2002.805244
- R. Roy and T. Kailath, "ESPRIT - estimation of signal parameters via rotational invariance techniques," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 37, pp. 984-995, Jul. 1989. https://doi.org/10.1109/29.32276
- J. S. McGarrity, J. J. Soraghan, and T. S. Durrani, "A fast implementation of the ESPRIT algorithm," in ICASSP IEEE International Conference on Acoustics, Speech and Signal Processing, Albuquerque, New Mexico, USA, Apr. 1990, pp. 1001-1004.
- R. Hamza and K. Buckley, "Resolution enhanced ESPRIT," IEEE Transactions on Signal Processing, vol. 42, pp. 688-691, Mar. 1994. https://doi.org/10.1109/78.277867
- H. Wang and M. Kaveh, "Coherent signal-subspace processing for the detection and estimation of angles of arrival of multiple wideband sources," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 33, pp. 823-831, Aug. 1985. https://doi.org/10.1109/TASSP.1985.1164667
- J. E. F. del Rio and M. F. Catedra-Perez, "A comparison between matrix pencil and root-MUSIC for direction-of-arrival estimation making use of uniform linear arrays," Digital Signal Processing, vol. 7, pp. 153-162, Jul. 1997. https://doi.org/10.1006/dspr.1997.0290
- H. Teutsch and W. Kellermann, “Acoustic source detection and localization based on wavefield decomposition using circular microphone arrays,” Journal of the Acoustical Society of America, vol. 120, pp. 2724-2736, Nov. 2006. https://doi.org/10.1121/1.2346089
- B. Rafaely, "Analysis and design of spherical microphone arrays," IEEE Transactions on Speech and Audio Processing, vol. 13, pp. 135-143, Jan. 2005. https://doi.org/10.1109/TSA.2004.839244
- I. Balmages and B. Rafaely, "Open-sphere designs for spherical microphone arrays," IEEE Transactions on Audio, Speech, and Language Processing, vol. 15, pp. 727-732, Feb. 2007. https://doi.org/10.1109/TASL.2006.881671
- T. Noohi, N. Epain, and C. T. Jin, “Direction of arrival estimation for spherical microphone arrays by combination of independent component analysis and sparse recovery,” in ICASSP IEEE International Conference on Acoustic Speech and Signal Processing, Vancouver, BC, Canada, May 2013, pp. 346-349.
- H. Teutsch and W. Kellermann, “Detection and localization of multiple wideband acoustic sources based on wavefield decomposition using spherical apertures,” in ICASSP IEEE International Conference on Acoustic Speech and Signal Processing, Las Vegas, NV, United states, Mar. 2008, pp. 5276-5279.
- S. Argentieri, P. Danes, and P. Soueres, “Modal analysis based beamforming for nearfield or farfield speaker localization in robotics,” in IEEE International Conference on Intelligent and Robot System, Beijing, China, Oct. 2006, pp. 866-871.
- T. D. Abhayapala and A. Gupta, "Spherical harmonic analysis of wavefields using multiple circular sensor arrays," IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, pp. 1655-1666, Aug. 2010. https://doi.org/10.1109/TASL.2009.2038821
- S. Kunis and D. Potts, "Fast spherical Fourier algorithms," Journal of Computational and Applied Mathematics, vol. 161, pp. 75-98, Dec. 2003. https://doi.org/10.1016/S0377-0427(03)00546-6
- H. Teutsch and W. Kellermann, “Acoustic source detection and localization based on wavefield decomposition using circular microphone arrays,” Journal of the Acoustical Society of America, vol. 120, pp. 2724-2736, Nov. 2006. https://doi.org/10.1121/1.2346089
- X. Li, S. F. Yan, X. C. Ma, and C. H. Hou, "Spherical harmonics MUSIC versus conventional MUSIC," Applied Acoustics, vol. 72, pp. 646-652, 2011. https://doi.org/10.1016/j.apacoust.2011.02.010
- X. Mestre and M. A. Lagunas, "Modified subspace algorithms for DOA estimation with large arrays," IEEE Transactions on Signal Processing, vol. 56, pp. 598-614, Feb. 2008. https://doi.org/10.1109/TSP.2007.907884
- A. Gupta and T. D. Abhayapala, "Three-dimensional sound field reproduction using multiple circular loudspeaker arrays," IEEE Transactions on Audio, Speech, and Language Processing, vol. 19, pp. 1149-1159, Jul. 2011. https://doi.org/10.1109/TASL.2010.2082533
- P.K.T. Wu, N. Epain, and C.T. Jin, “A dereverberation algorithm for spherical microphone arrays using compressed sensing techniques,” in ICASSP IEEE International Conference on Acoustic Speech and Signal Processing, Kyoto, Japan, Mar. 2012, pp. 4053-4056.
- H. H. Chen and S. C. Chan, "Adaptive beamforming and DOA estimation using uniform concentric spherical arrays with frequency invariant characteristics," Journal of VLSI Signal Processing Systems for Signal, Image, and Video Technology, vol. 46, pp. 15-34, Jan. 2007. https://doi.org/10.1007/s11265-006-0005-x
- Q. H. Huang and T. Song, "DOA estimation of mixed near-field and far-field sources using a spherical array," in IEEE 11th International Conference on Signal Processing, Beijing, China, Oct. 2012, pp. 382-385.
- C.-I. C. Nilsen, I. Hafizovic, and S. Holm, "Robust 3-D sound source localization using spherical microphone arrays," in 134th Audio Engineering Society Convention, Rome, Italy, May 2013, pp. 570-576.
- Y. Peled and B. Rafaely, "Linearly-constrained minimum-variance method for spherical microphone arrays based on plane-wave decomposition of the sound field," IEEE Transactions on Audio, Speech, and Language Processing, vol. 21, pp. 2532-2540, Dec. 2013. https://doi.org/10.1109/TASL.2013.2277939
- D. B. Ward and T. D. Abhayapala, “Theory and design of higher order sound field microphones using spherical microphone array,” in ICASSP IEEE International Conference on Acoustic Speech and Signal Processing, Orlando, FL, United states, May 2002, pp. 1949-1952.
- J. Meyer and G. Elko, “A highly scalable spherical microphone array based on a orthonormal decomposition of the soundfield,” in ICASSP IEEE International Conference on Acoustic Speech and Signal Processing, Orlando, FL, United states, May 2002, pp. 1781-1784.
- T. S. Pollock, T. D. Abhayapala, and R. A. Kennedy, "Characterization of 3D spatial wireless channels," in IEEE 58th Vehicular Technology Conference, Orlando, FL, United States, Oct. 2003, pp. 123-127.
- E. G. Williams, Fourier Acoustics: Sound Radiation and Nearfield Acoustical Holography. Academic Press, 1999.
- H. Teutsch, Modal Array Signal Processing: Principles and Applications of Acoustic Wavefield Decomposition. Springer, 2007.