Direction Estimation of Multiple Sound Sources Using Circular Probability Distributions

  • Received : 2011.05.25
  • Accepted : 2011.07.21
  • Published : 2011.08.31

Abstract

This paper presents techniques for estimating the directions of multiple sound sources over the full $0^{\circ}$ to $360^{\circ}$ range using circular probability distributions, which are periodic. The phase differences between microphones carry the direction information of the sources and can be modeled as a mixture of probability distributions, so the source directions can be estimated by maximizing the log-likelihood function of that mixture. Although the von Mises distribution is widely used for analyzing periodic data of this kind, we instead define a new class of circular probability distributions by applying a modulo-$2{\pi}$ operation to the conventional Gaussian and Laplacian distributions, which makes them $2{\pi}$-periodic. The source directions that maximize the log-likelihood of each mixture are estimated with the corresponding EM (Expectation-Maximization) algorithm. Simulation results in various reverberant environments confirm that the Laplacian-based distribution provides better performance than the von Mises and Gaussian-based distributions.
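
The idea summarized in the abstract can be illustrated with a minimal sketch. The paper's exact density definitions and update equations are not given in this excerpt, so everything below is an assumption: the function names `wrap`, `circ_gauss_pdf`, and `em_circular_gaussian` are hypothetical, the observations are treated directly as angles (the mapping from inter-microphone phase differences to directions is omitted), and the M-step is only approximate. The sketch covers the Gaussian case; a Laplacian variant would presumably use the absolute wrapped difference and a weighted-median location update.

```python
import math
import numpy as np

TWO_PI = 2.0 * np.pi

def wrap(d):
    """Map angular differences into [-pi, pi) with a modulo operation."""
    return (d + np.pi) % TWO_PI - np.pi

def circ_gauss_pdf(x, mu, sigma):
    """Gaussian kernel on the wrapped difference, normalized over one period.

    Assumed form, not taken from the paper: the normalizer is the integral
    of exp(-d^2 / (2 sigma^2)) over [-pi, pi].
    """
    d = wrap(x - mu)
    norm = sigma * math.sqrt(TWO_PI) * math.erf(np.pi / (sigma * math.sqrt(2.0)))
    return np.exp(-0.5 * (d / sigma) ** 2) / norm

def em_circular_gaussian(x, mu0, sigma0=0.5, n_iter=50):
    """Fit a mixture of circular Gaussians to angles x (radians) with EM."""
    mu = np.array(mu0, dtype=float)
    n_comp = mu.size
    sigma = np.full(n_comp, sigma0)
    weights = np.full(n_comp, 1.0 / n_comp)
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each sample.
        lik = np.stack([weights[k] * circ_gauss_pdf(x, mu[k], sigma[k])
                        for k in range(n_comp)])
        resp = lik / (lik.sum(axis=0) + 1e-300)
        # M-step (approximate): move each mean by the weighted average of the
        # wrapped residuals, then update the spread from the new residuals.
        # The dependence of the normalizer on sigma is ignored here.
        nk = resp.sum(axis=1)
        for k in range(n_comp):
            d = wrap(x - mu[k])
            mu[k] = (mu[k] + (resp[k] * d).sum() / nk[k]) % TWO_PI
            d = wrap(x - mu[k])
            sigma[k] = max(math.sqrt((resp[k] * d ** 2).sum() / nk[k]), 1e-3)
        weights = nk / nk.sum()
    return mu, sigma, weights

# Synthetic demo: two hypothetical sources, one close to the 0/360 boundary.
rng = np.random.default_rng(0)
true_dirs = np.deg2rad([10.0, 200.0])
angles = np.concatenate(
    [(t + 0.2 * rng.standard_normal(500)) % TWO_PI for t in true_dirs])
est_mu, est_sigma, est_w = em_circular_gaussian(angles, mu0=[1.0, 4.0])
print("estimated directions (deg):", np.rad2deg(est_mu))
```

Because the residuals are wrapped before they are averaged, samples on either side of the $0^{\circ}$/$360^{\circ}$ boundary are pulled toward the same component instead of being split, which is the motivation for using a periodic distribution in the first place.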
