http://dx.doi.org/10.5909/JBE.2015.20.3.408

Vocal Separation Using Selective Frequency Subtraction Considering with Energies and Phases  

Kim, Hyuntae (Dept. of Multimedia Engineering, Dongeui University)
Park, Jangsik (Dept. of Electronics Engineering, Kyungsung University)
Publication Information
Journal of Broadcast Engineering / v.20, no.3, 2015, pp. 408-413
Abstract
Recently, as interest in original-sound karaoke machines has grown, manufacturers of MIDI-type karaoke systems have been looking for a cheaper alternative to original recording. One such approach creates an original-sound accompaniment by removing only the singer's voice from the singer's album recording. This paper proposes a system that separates vocal components from the musical accompaniment in stereo recordings. The proposed system consists of two stages. The first stage is vocal detection: it classifies the input into vocal and non-vocal portions using a support vector machine (SVM) with mel-frequency cepstral coefficient (MFCC) features. In the second stage, selective frequency subtraction is performed at each frequency bin in the vocal portions. The subtraction decision considers not only the energy of each frequency bin but also the phase of that bin in each channel signal. In a listening test, music with the vocals removed by the proposed system received relatively high satisfaction ratings.
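The second stage can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the decision rule, so the rule used here (treat a bin as vocal when both channels have similar energy and a small inter-channel phase difference, then subtract the component common to both channels) is an assumption, as are the function name `selective_subtract`, the thresholds `energy_tol_db` and `phase_tol`, and the `is_vocal` flag standing in for the output of the first-stage SVM classifier. The input is one STFT frame per channel, given as lists of complex bins.

```python
import cmath
import math


def selective_subtract(left, right, energy_tol_db=3.0, phase_tol=0.2, is_vocal=True):
    """Process one stereo STFT frame (lists of complex bins).

    Hypothetical rule, not taken from the paper: a bin is treated as
    vocal-dominated when the two channels have nearly equal energy AND
    nearly equal phase, as a centre-panned source would.
    """
    # Frames classified as non-vocal by stage 1 pass through unchanged.
    if not is_vocal:
        return list(left), list(right)

    out_l, out_r = [], []
    for l, r in zip(left, right):
        e_l, e_r = abs(l) ** 2, abs(r) ** 2
        if e_l == 0.0 or e_r == 0.0:
            # No energy in one channel: cannot be a centre-panned source.
            out_l.append(l)
            out_r.append(r)
            continue

        # Inter-channel level difference in dB and phase difference in radians.
        level_diff_db = abs(10.0 * math.log10(e_l / e_r))
        phase_diff = abs(cmath.phase(l * r.conjugate()))

        if level_diff_db <= energy_tol_db and phase_diff <= phase_tol:
            # Subtract the component common to both channels in this bin.
            mid = 0.5 * (l + r)
            out_l.append(l - mid)
            out_r.append(r - mid)
        else:
            # Keep bins that look like accompaniment.
            out_l.append(l)
            out_r.append(r)
    return out_l, out_r


# Bin 0: identical in both channels; bin 1: louder on the left;
# bin 2: equal energy but 90 degrees out of phase.
L = [1 + 0j, 1 + 0j, 1 + 0j]
R = [1 + 0j, 0.1 + 0j, 1j]
out_left, out_right = selective_subtract(L, R)
```

With these inputs only bin 0 satisfies both the energy and the phase condition, so only that bin is cancelled. Checking phase as well as energy is what keeps bin 2 intact, even though its two channels have equal energy.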
Keywords
MFCC; SVM; Vocal Remover; Selective Frequency Subtraction; Inter-Channel Phase Difference;
Citations & Related Records
Times Cited By KSCI: 4