http://dx.doi.org/10.5909/JBE.2016.21.2.169

Non-uniform Linear Microphone Array Based Source Separation for Conversion from Channel-based to Object-based Audio Content  

Chun, Chan Jun (School of Electrical Engineering and Computer Science)
Kim, Hong Kook (School of Electrical Engineering and Computer Science)
Publication Information
Journal of Broadcast Engineering, vol. 21, no. 2, pp. 169-179, 2016
Abstract
Recently, MPEG-H has been standardized as a multimedia coder for UHDTV (Ultra-High-Definition TV). Accordingly, the demand for object-based audio content, in addition to channel-based audio content, is increasing, which calls for a technique that converts channel-based audio content into object-based content. In this paper, a source separation method based on a non-uniform linear microphone array is proposed to realize such a conversion. The proposed method first analyzes the differences in arrival time of the input audio sources at each microphone, and then estimates the spectral magnitude of each sound source along the horizontal directions based on the analyzed time differences. In order to demonstrate the effectiveness of the proposed method, its objective performance measures are compared with those of conventional methods such as an MVDR (Minimum Variance Distortionless Response) beamformer and an ICA (Independent Component Analysis) method. The results show that the proposed method achieves better separation performance than the conventional methods.
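The abstract only outlines the pipeline at a high level: analyze inter-microphone arrival-time differences, then estimate per-direction spectral magnitudes in the time-frequency domain. The Python sketch below is not the paper's implementation; it merely illustrates direction-based time-frequency masking with a non-uniform linear array under stated assumptions. The array geometry (MIC_POS), sampling rate (FS), and tolerance (phase_tol) are hypothetical values chosen for illustration.

```python
# Minimal sketch (not the authors' method) of direction-based time-frequency
# masking for a non-uniform linear microphone array.
import numpy as np
from scipy.signal import stft, istft

C = 343.0                                     # speed of sound (m/s)
FS = 48000                                    # sampling rate (Hz), hypothetical
MIC_POS = np.array([0.00, 0.04, 0.10, 0.22])  # non-uniform mic positions (m), hypothetical

def separate_by_direction(x, target_deg, phase_tol=0.5, nperseg=1024):
    """x: (num_mics, num_samples) mixture; returns an estimate of the
    source arriving from the horizontal direction target_deg (degrees)."""
    f, _, X = stft(x, fs=FS, nperseg=nperseg)   # X: (mics, freqs, frames)
    ref = X[0]                                  # reference microphone

    # Expected inter-channel delay for a plane wave from target_deg:
    # mic m lags mic 0 by d_m * cos(theta) / c.
    theta = np.deg2rad(target_deg)
    delays = MIC_POS * np.cos(theta) / C                        # (mics,)
    steer = np.exp(-2j * np.pi * f[None, :] * delays[:, None])  # (mics, freqs)

    # Residual phase after removing the expected phase pattern; it is close
    # to zero for time-frequency bins dominated by the target direction.
    resid = X * np.conj(ref)[None, :, :] * np.conj(steer)[:, :, None]
    err = np.mean(np.abs(np.angle(resid[1:])), axis=0)          # (freqs, frames)

    # Soft mask keeps bins whose phase matches the target direction.
    # (Large spacings alias at high frequencies; a real system must handle this.)
    mask = np.exp(-(err / phase_tol) ** 2)

    _, y = istft(ref * mask, fs=FS, nperseg=nperseg)
    return y
```

Run once per assumed source direction to obtain the individual object signals from the channel-based mixture; the MVDR and ICA baselines mentioned in the abstract would replace the masking step with beamforming or demixing-matrix estimation, respectively.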
Keywords
Sound source separation; channel-based audio content; object-based audio content; non-uniform linear microphone array; frequency-dependent source separation
References
1 J. Herre, J. Hilpert, A. Kuntz, and J. Plogsties, “MPEG-H 3D audio—the new standard for coding of immersive spatial audio,” IEEE Journal of Selected Topics in Signal Processing, vol. 9, no. 5, pp. 770-779, Aug. 2015.
2 J. Herre, J. Hilpert, A. Kuntz, and J. Plogsties, “MPEG-H audio—the new standard for universal spatial/3D audio coding,” Journal of the Audio Engineering Society, vol. 62, no. 12, pp. 821-830, Dec. 2014.
3 S. Makino, T.-W. Lee, and H. Sawada, Blind Speech Separation, Springer, Netherlands, 2007.
4 J. Benesty, J. Chen, and Y. Huang, Microphone Array Signal Processing, Springer, Berlin, Germany, 2008.
5 M. Brandstein and D. Ward, Microphone Arrays: Signal Processing Techniques and Applications, Springer, Berlin, Germany, 2001.
6 A. Hyvärinen and E. Oja, “A fast fixed-point algorithm for independent component analysis,” Neural Computation, vol. 9, no. 7, pp. 1483-1492, Oct. 1997.
7 J. Breebaart and C. Faller, Spatial Audio Processing: MPEG Surround and Other Applications, John Wiley & Sons, Ltd., Chichester, UK, 2007.
8 J. Dmochowski, J. Benesty, and S. Affes, “On spatial aliasing in microphone arrays,” IEEE Transactions on Signal Processing, vol. 57, no. 4, pp. 1383-1395, Apr. 2009.
9 D. F. Rosenthal and H. G. Okuno, Computational Auditory Scene Analysis, LEA Publishers, Mahwah, NJ, 1998.
10 A. Hyvärinen, J. Karhunen, and E. Oja, Independent Component Analysis, John Wiley & Sons, Inc., Canada, 2001.
11 H. Cox, R. M. Zeskind, and M. M. Owen, “Robust adaptive beamforming,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 35, no. 10, pp. 1365-1375, Oct. 1987.
12 O. Yilmaz and S. Rickard, “Blind separation of speech mixtures via time-frequency masking,” IEEE Transactions on Signal Processing, vol. 52, no. 7, pp. 1830-1847, July 2004.
13 H. Adel, M. Souad, A. Alaqeeli, and A. Hamid, “Beamforming techniques for multichannel audio signal separation,” International Journal of Digital Content Technology and its Applications, vol. 6, no. 20, pp. 659-667, Nov. 2012.
14 D. Barry, B. Lawlor, and E. Coyle, “Sound source separation: azimuth discrimination and resynthesis,” in Proceedings of the International Conference on Digital Audio Effects (DAFX-04), pp. 1-5, Naples, Italy, Oct. 2004.
15 C. J. Chun and H. K. Kim, “Sound source separation using interaural intensity difference in real environments,” in Proceedings of the 135th Audio Engineering Society (AES) Convention, Preprint 8976, New York, NY, Oct. 2013.
16 E. Vincent, R. Gribonval, and C. Fevotte, “Performance measurement in blind audio source separation,” IEEE Transactions on Audio, Speech, and Language Processing, vol. 14, no. 4, pp. 1462-1469, July 2006.
17 A. L. Casanovas, G. Monaci, P. Vandergheynst, and R. Gribonval, “Blind audiovisual source separation based on sparse redundant representations,” IEEE Transactions on Multimedia, vol. 12, no. 5, pp. 358-371, Aug. 2010.