http://dx.doi.org/10.5909/JBE.2019.24.6.1024

Investigation of Timbre-related Music Feature Learning using Separated Vocal Signals  

Lee, Seungjin (SK Telecom)
Publication Information
Journal of Broadcast Engineering, vol. 24, no. 6, 2019, pp. 1024-1034
Abstract
Preference for music is determined by a variety of factors, and identifying characteristics that reflect specific factors is important for music recommendation. In this paper, we propose a method for extracting singing-voice-related music features that reflect various musical characteristics, using a model trained for singer identification. The model can be trained on music sources that contain background accompaniment, but this may degrade singer identification performance. To mitigate this problem, we first separate the background accompaniment and build a dataset of separated vocals using a model structure whose performance was demonstrated in SiSEC, the Signal Separation and Evaluation Campaign. Finally, we use the separated vocals to learn singing-voice-related music features that reflect the singer's voice, and we compare the effect of source separation against existing methods that use music sources without source separation.
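To make the pipeline concrete, the sketch below illustrates one plausible realization of the final step: a log mel-spectrogram of a separated vocal excerpt is fed to a CNN trained for singer identification, and the activations of an intermediate layer are taken as the timbre-related feature. This is a minimal sketch, not the paper's exact implementation; the use of librosa and Keras, the file names singer_id_cnn.h5 and separated_vocal.wav, the 3-second excerpt length, and the choice of the penultimate layer are all illustrative assumptions.

# Minimal sketch (assumed pipeline): timbre-related feature extraction from a
# separated vocal track via an intermediate layer of a singer-identification CNN.
import librosa
import numpy as np
import tensorflow as tf

def log_mel(path, sr=22050, n_mels=128, duration=3.0):
    """Load a (separated) vocal excerpt and compute a log mel-spectrogram."""
    y, _ = librosa.load(path, sr=sr, duration=duration)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)

# Hypothetical singer-identification CNN, saved after supervised training
# on separated vocal spectrograms with singer labels.
model = tf.keras.models.load_model("singer_id_cnn.h5")

# Reuse the network up to its penultimate layer as an embedding extractor.
embedder = tf.keras.Model(inputs=model.input,
                          outputs=model.layers[-2].output)

spec = log_mel("separated_vocal.wav")          # shape: (n_mels, frames)
spec = spec[np.newaxis, ..., np.newaxis]       # add batch and channel axes
feature = embedder.predict(spec).squeeze()     # timbre-related feature vector

Under this assumed setup, the resulting feature vector can then be compared across tracks (e.g., by cosine similarity) for timbre-based music similarity or recommendation, which is the downstream use the paper targets.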
Keywords
Singer identification; Music recommendation; Music representation learning; Timbre-related music similarity; Deep learning