Harmonic and Percussive Separation Based on NMF and Tonality Mask

  • Choi, Keunwoo (Broadcasting & Telecommunications Convergence Research Laboratory, ETRI)
  • Chon, Sang Bae (DMC Research Center, Samsung Electronics)
  • Kang, Kyeongok (Broadcasting & Telecommunications Convergence Research Laboratory, ETRI)
  • Received: 2012.03.07
  • Accepted: 2012.08.27
  • Published: 2012.12.31

Abstract

In this letter, we present a new algorithm for the harmonic and percussive separation of jazz music. The signal is transformed with a short-time Fourier transform and then decomposed by nonnegative matrix factorization (NMF) into as many components as the factorization rank. Each component is split into a harmonic part and a percussive part using masks calculated from its tonality. The masked parts are then summed across components to give the separated harmonic and percussive signals. We evaluate the algorithm on real audio examples using both objective and subjective assessments. The proposed algorithm performs well for the separation of harmonic and percussive parts of jazz excerpts.
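The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several parts are assumptions made for illustration: the input `V` is taken to be a nonnegative magnitude spectrogram (the STFT and the inverse transform are omitted), the NMF uses the standard Euclidean multiplicative updates, and tonality is approximated by one minus the spectral flatness of each basis vector, which stands in for whatever tonality measure the letter actually uses. The mask here is a single weight per component, which is a simplification. The function names `nmf`, `tonality`, and `separate` are hypothetical.

```python
import numpy as np


def nmf(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factor a nonnegative matrix V (freq x frames) as V ~= W @ H.

    Uses Euclidean-cost multiplicative updates; rank, n_iter and seed
    are illustrative defaults, not values taken from the letter.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def tonality(W, eps=1e-12):
    """Assumed tonality proxy: 1 - spectral flatness of each column of W.

    Values near 1 mean a peaky (tone-like) spectrum, values near 0 a
    flat (noise-like) spectrum.
    """
    Wp = W + eps
    geometric_mean = np.exp(np.mean(np.log(Wp), axis=0))
    arithmetic_mean = np.mean(Wp, axis=0)
    return np.clip(1.0 - geometric_mean / arithmetic_mean, 0.0, 1.0)


def separate(V, rank=8):
    """Split V into harmonic and percussive magnitude spectrograms."""
    W, H = nmf(V, rank)
    mask = tonality(W)                    # one weight per component
    harmonic = (W * mask) @ H             # sum of masked components
    percussive = (W * (1.0 - mask)) @ H   # sum of complementary parts
    return harmonic, percussive, mask


if __name__ == "__main__":
    # Synthetic spectrogram: one steady tone plus two broadband bursts.
    V = np.full((64, 40), 1e-3)
    V[10, :] += 1.0       # horizontal ridge (tone-like)
    V[:, 5] += 1.0        # vertical ridge (burst-like)
    V[:, 25] += 1.0
    harmonic, percussive, mask = separate(V, rank=2)
    print("per-component tonality weights:", mask)
```

Because the two masks sum to one for each component, the harmonic and percussive outputs add up to the NMF approximation `W @ H`. To obtain audio, each output would still have to be combined with the mixture phase and passed through an inverse STFT.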
