Comparison of Sound Source Localization Methods Based on Zero Crossings

  • Park, Yong-Jin (Department of Electronic Engineering, Sogang University)
  • Lee, Soo-Yeon (Department of Electronic Engineering, Sogang University)
  • Park, Hyung-Min (Department of Electronic Engineering, Sogang University)
  • Published : 2009.09.30

Abstract

This paper reviews several multi-source localization methods that estimate interaural time differences (ITDs) based on zero crossings (ZCs). By estimating the signal-to-noise ratio (SNR) from ITD variances, these ZC-based methods are more robust to diffuse noise than the cross-correlation (CC)-based method and require less computation. To account for reverberant environments, two approaches detect intervals dominated by direct-path components from sources to sensors, because such intervals can provide reliable ITDs corresponding to the source directions. One performs the detection by comparing the original envelope with the envelope obtained after cepstral prefiltering, and the other searches for sudden increases in acoustic energy, exploiting typical characteristics of acoustic reverberation. Comparative experiments demonstrate that the approach with energy-based detection efficiently achieves multi-source localization in reverberant environments.
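
As a rough illustration of the basic idea of estimating an ITD from zero crossings, the following minimal Python sketch may help. It is not the implementation of any of the reviewed methods: the single-band sinusoidal test signal, the nearest-crossing pairing rule, the 1 ms search limit, and all function and variable names are assumptions made for this example. The variance of the ITD samples is computed only as a crude reliability cue; the mapping from ITD variance to SNR used in the reviewed methods is not reproduced here.

```python
import math
from statistics import mean, pvariance


def upward_zero_crossings(x, fs):
    """Return times (in seconds) of upward zero crossings of x.

    Each crossing time is refined by linear interpolation between
    the two samples that straddle zero.
    """
    times = []
    for n in range(len(x) - 1):
        if x[n] < 0.0 <= x[n + 1]:
            frac = -x[n] / (x[n + 1] - x[n])
            times.append((n + frac) / fs)
    return times


def zc_itd_samples(left, right, fs, max_itd=1e-3):
    """Pair each left-channel crossing with the nearest right-channel one.

    Returns a list of ITD samples (right time minus left time).
    Pairs farther apart than max_itd (a hypothetical limit) are dropped.
    """
    zl = upward_zero_crossings(left, fs)
    zr = upward_zero_crossings(right, fs)
    itds = []
    if not zr:
        return itds
    for t in zl:
        nearest = min(zr, key=lambda u: abs(u - t))
        d = nearest - t
        if abs(d) <= max_itd:
            itds.append(d)
    return itds


# Synthetic test: a 500 Hz tone that reaches the right sensor 0.3 ms later.
fs = 16000          # sampling rate in Hz (assumed)
f0 = 500.0          # tone frequency in Hz (assumed)
true_itd = 0.3e-3   # delay of the right channel in seconds (assumed)
n_samples = 1600    # 100 ms of signal

left = [math.sin(2 * math.pi * f0 * k / fs) for k in range(n_samples)]
right = [math.sin(2 * math.pi * f0 * (k / fs - true_itd))
         for k in range(n_samples)]

itds = zc_itd_samples(left, right, fs)
itd_estimate = mean(itds)        # average ITD over all paired crossings
itd_variance = pvariance(itds)   # small variance suggests a clean signal

print(f"estimated ITD: {itd_estimate * 1e3:.4f} ms")
print(f"ITD variance:  {itd_variance:.3e} s^2")
```

In this noise-free example the ITD samples are nearly identical, so their variance is close to zero. Adding noise to either channel would scatter the crossing times and increase the variance, which is the intuition behind using ITD variance as an indicator of SNR.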
