http://dx.doi.org/10.5909/JBE.2019.24.6.1024

Investigation of Timbre-related Music Feature Learning using Separated Vocal Signals  

Lee, Seungjin (SK Telecom)
Publication Information
Journal of Broadcast Engineering, vol. 24, no. 6, 2019, pp. 1024-1034
Abstract
Preference for music is determined by a variety of factors, and identifying characteristics that reflect specific factors is important for music recommendation. In this paper, we propose a method for extracting singing-voice-related music features that reflect various musical characteristics, using a model trained for singer identification. The model can be trained on music sources that contain background accompaniment, but this may degrade singer identification performance. To mitigate this problem, we first separate the background accompaniment and build a dataset of separated vocals using a model structure whose performance was demonstrated in SiSEC, the Signal Separation and Evaluation Campaign. Finally, we use the separated vocals to learn singing-voice-related music features that reflect the singer's voice, and we compare the effect of source separation against existing methods that use music sources without source separation.
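To make the pipeline concrete, the sketch below illustrates one plausible realization of the final step: a log mel-spectrogram of a separated vocal excerpt is fed to a CNN trained for singer identification, and the activations of an intermediate layer are taken as the timbre-related feature. This is a minimal sketch, not the paper's exact implementation; the use of librosa and Keras, the file names singer_id_cnn.h5 and separated_vocal.wav, the 3-second excerpt length, and the choice of the penultimate layer are all illustrative assumptions.

# Minimal sketch (assumed pipeline): timbre-related feature extraction from a
# separated vocal track via an intermediate layer of a singer-identification CNN.
import librosa
import numpy as np
import tensorflow as tf

def log_mel(path, sr=22050, n_mels=128, duration=3.0):
    """Load a (separated) vocal excerpt and compute a log mel-spectrogram."""
    y, _ = librosa.load(path, sr=sr, duration=duration)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    return librosa.power_to_db(mel, ref=np.max)

# Hypothetical singer-identification CNN, saved after supervised training
# on separated vocal spectrograms with singer labels.
model = tf.keras.models.load_model("singer_id_cnn.h5")

# Reuse the network up to its penultimate layer as an embedding extractor.
embedder = tf.keras.Model(inputs=model.input,
                          outputs=model.layers[-2].output)

spec = log_mel("separated_vocal.wav")          # shape: (n_mels, frames)
spec = spec[np.newaxis, ..., np.newaxis]       # add batch and channel axes
feature = embedder.predict(spec).squeeze()     # timbre-related feature vector

Under this assumed setup, the resulting feature vector can then be compared across tracks (e.g., by cosine similarity) for timbre-based music similarity or recommendation, which is the downstream use the paper targets.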
Keywords
Singer identification; Music recommendation; Music representation learning; Timbre-related music similarity; Deep learning