http://dx.doi.org/10.7776/ASK.2010.29.8.509

Mono-To-Stereo Blind Upmix Using Non-Negative Matrix Factorization and Decorrelator  

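Choi, Keun-Woo (Acoustics Engineering Laboratory, Department of Electrical and Computer Engineering, Seoul National University)
Chon, Sang-Bae (Acoustics Engineering Laboratory, Department of Electrical and Computer Engineering, Seoul National University)
Lee, Seok-Jin (Acoustics Engineering Laboratory, Department of Electrical and Computer Engineering, Seoul National University)
Sung, Koeng-Mo (Acoustics Engineering Laboratory, Department of Electrical and Computer Engineering, Seoul National University)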
Abstract
This paper presents a new method for upmixing a mono signal to a stereo signal while ensuring high stereophonic image quality (SIQ) and large apparent source width (ASW). The proposed method consists of an analysis phase and a synthesis phase. In the analysis phase, a mono signal is first decomposed into multiple sound sources using high-rank non-negative matrix factorization (NMF). The sources are then clustered into two groups based on a tonality criterion. In the synthesis phase, one group is fed directly into the left and right channels, while the other group is decorrelated before being fed into each channel. Subjective tests reveal that the proposed method gives listeners high SIQ and large ASW while minimizing timbral distortion.
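The synthesis phase described above can be illustrated with a minimal sketch. It assumes the analysis phase has already produced two time-domain signals, here called `direct` and `ambient` (hypothetical names). The abstract does not say which tonality group is decorrelated or what decorrelator is used; this sketch swaps in a Schroeder all-pass filter with a different delay per channel, and the parameters `delay_left`, `delay_right`, and `gain` are illustrative assumptions, not values from the paper.

```python
def allpass(x, delay, gain):
    """Schroeder all-pass filter: y[n] = -g*x[n] + x[n-D] + g*y[n-D].

    It leaves the magnitude spectrum flat and only alters phase,
    which is why it is a common choice of decorrelator (assumed here).
    """
    y = []
    for n, xn in enumerate(x):
        x_d = x[n - delay] if n >= delay else 0.0
        y_d = y[n - delay] if n >= delay else 0.0
        y.append(-gain * xn + x_d + gain * y_d)
    return y


def upmix(direct, ambient, delay_left=3, delay_right=5, gain=0.5):
    """Route one group straight to both channels and decorrelate the other.

    direct  -- group fed unchanged into left and right
    ambient -- group passed through a different all-pass filter per channel
    """
    amb_left = allpass(ambient, delay_left, gain)
    amb_right = allpass(ambient, delay_right, gain)
    left = [d + a for d, a in zip(direct, amb_left)]
    right = [d + a for d, a in zip(direct, amb_right)]
    return left, right


if __name__ == "__main__":
    # Toy input: a short ramp as the direct group, an impulse as the other group.
    direct = [0.1 * n for n in range(8)]
    ambient = [1.0] + [0.0] * 7
    left, right = upmix(direct, ambient)
    print(left)
    print(right)
```

Because the directly routed group is identical in both channels, it stays centered in the stereo image, while the differently filtered copies of the other group lower the inter-channel correlation. Widening the image this way is one plausible route to a larger ASW.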
Keywords
NMF; Mono Upmix