Sound Source Separation Using Interaural Intensity Difference in Closely Spaced Stereo Omnidirectional Microphones

  • Chun, Chan Jun (Gwangju Institute of Science and Technology, School of Information and Communications) ;
  • Jeong, Seok Hee (Gwangju Institute of Science and Technology, School of Information and Communications) ;
  • Kim, Hong Kook (Gwangju Institute of Science and Technology, School of Information and Communications)
  • Received : 2013.10.16
  • Accepted : 2013.11.24
  • Published : 2013.12.25

Abstract


In this paper, an interaural intensity difference (IID)-based sound source separation method for closely spaced stereo omnidirectional microphones is proposed. First, in order to improve the channel separability, a minimum variance distortionless response (MVDR) beamformer is employed to increase the intensity difference between the stereo channels. After that, the IID-based sound source separation method is applied to extract the source located at the desired azimuth. In order to evaluate the performance of the proposed method, the source-to-distortion ratio (SDR), source-to-interference ratio (SIR), and sources-to-artifacts ratio (SAR), which are defined as objective evaluation criteria in the stereo audio source separation evaluation campaign (SASSEC), are measured. The objective evaluation shows that the proposed method outperforms a sound source separation method that does not apply a beamformer.
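
The abstract describes a two-stage pipeline: an MVDR beamformer first widens the level difference between the two closely spaced channels, and an IID-based time-frequency mask then extracts the source at the desired azimuth. The Python sketch below illustrates one plausible realization of that pipeline; it is not the authors' implementation, and the microphone spacing, sampling rate, steering angles, and the hard binary masking rule are all illustrative assumptions.

    # A minimal sketch (not the authors' code) of the pipeline in the abstract:
    # steer an MVDR beam toward each side of the two-microphone array, then
    # apply an IID-based time-frequency mask to the beam outputs.
    import numpy as np
    from scipy.signal import stft, istft

    C = 343.0      # speed of sound (m/s)
    D = 0.05       # assumed microphone spacing (m); the paper's value may differ
    FS = 16000     # assumed sampling rate (Hz)
    NFFT = 1024    # assumed STFT frame length

    def steering_vector(freqs, angle_deg):
        # Far-field steering vector for a two-microphone array;
        # microphone 0 is the phase reference.
        tau = D * np.sin(np.deg2rad(angle_deg)) / C
        delays = np.array([0.0, tau])
        return np.exp(-2j * np.pi * freqs[:, None] * delays[None, :])

    def mvdr_weights(Rxx, d):
        # w = R^-1 d / (d^H R^-1 d), evaluated independently per frequency bin.
        Ri = np.linalg.pinv(Rxx)                     # (F, 2, 2)
        num = np.einsum('fij,fj->fi', Ri, d)
        den = np.einsum('fi,fi->f', d.conj(), num)
        return num / den[:, None]

    def separate(x_left, x_right, angles=(-90.0, 90.0)):
        # STFT of both channels, stacked as (freq, channel, frame).
        f, _, XL = stft(x_left, fs=FS, nperseg=NFFT)
        _, _, XR = stft(x_right, fs=FS, nperseg=NFFT)
        X = np.stack([XL, XR], axis=1)
        # Spatial covariance per bin, estimated from the mixture itself,
        # with diagonal loading for robustness (cf. robust beamforming, ref. 5).
        Rxx = np.einsum('fct,fdt->fcd', X, X.conj()) / X.shape[-1]
        Rxx += 1e-6 * np.trace(Rxx, axis1=1, axis2=2).real[:, None, None] * np.eye(2)
        # One beam per side, to enlarge the inter-channel intensity difference.
        beams = [np.einsum('fc,fct->ft',
                           mvdr_weights(Rxx, steering_vector(f, a)).conj(), X)
                 for a in angles]
        BL, BR = beams
        # Simplified IID mask: keep each bin in the louder beam. The paper's
        # azimuth-dependent decision rule is more refined than this.
        mask = (np.abs(BL) > np.abs(BR)).astype(float)
        _, y_left = istft(BL * mask, fs=FS, nperseg=NFFT)
        _, y_right = istft(BR * (1.0 - mask), fs=FS, nperseg=NFFT)
        return y_left, y_right

The SDR, SIR, and SAR scores mentioned above follow the BSS Eval methodology used in SASSEC; for reproduction, the mir_eval package provides a reference implementation via mir_eval.separation.bss_eval_sources(reference_sources, estimated_sources).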


References

  1. A. Hyvarinen, "Survey on independent component analysis," Neural Computing Surveys, vol. 2, pp. 94-128, 1999.
  2. D. F. Rosenthal and H. G. Okuno, Computational Auditory Scene Analysis, LEA Publishers, Mahwah, NJ, 1998.
  3. P. Divenyi, Speech Separation by Humans and Machines, Kluwer Academic Publishers, Norwell, MA, 2005.
  4. D. Barry, B. Lawlor, and E. Coyle, "Sound source separation: azimuth discrimination and resynthesis," in Proc. of International Conference on Digital Audio Effects (DAFX-04), pp. 1-5, Oct. 2004.
  5. H. Cox, R. M. Zeskind, and M. M. Owen, "Robust adaptive beamforming," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 35, no. 10, pp. 1365-1375, Oct. 1987. https://doi.org/10.1109/TASSP.1987.1165054
  6. M. Brandstein and D. Ward, Microphone Arrays: Signal Processing Techniques and Applications, Springer, Berlin, Germany, 2001.
  7. J. Benesty, J. Chen, and Y. Huang, Microphone Array Signal Processing, Springer, Berlin, Germany, 2008.
  8. E. Vincent, H. Sawada, P. Bofill, S. Makino, and J. P. Rosca, "First stereo audio source separation evaluation campaign: data, algorithms and results," in Proc. of International Conference on Independent Component Analysis and Signal Separation, pp. 552-559, Feb. 2007.
  9. EBU Technical Document 3253, Sound Quality Assessment Material Recordings for Subjective Tests: Users' Handbook for the EBU-SQAM Compact Disc, 1988.