http://dx.doi.org/10.5909/JBE.2017.22.6.693

Music Genre Classification using Spikegram and Deep Neural Network  

Jang, Woo-Jin (Dept. of Electronics Engineering, Kwangwoon University)
Yun, Ho-Won (Dept. of Electronics Engineering, Kwangwoon University)
Shin, Seong-Hyeon (Dept. of Electronics Engineering, Kwangwoon University)
Cho, Hyo-Jin (Dept. of Electronics Engineering, Kwangwoon University)
Jang, Won (Dept. of Electronics Engineering, Kwangwoon University)
Park, Hochong (Dept. of Electronics Engineering, Kwangwoon University)
Publication Information
Journal of Broadcast Engineering, Vol. 22, No. 6, 2017, pp. 693-701
Abstract
In this paper, we propose a new method for music genre classification using the spikegram and a deep neural network. The human auditory system encodes an input sound in the time-frequency domain so as to maximize the amount of sound information delivered to the brain while using minimal energy and resources. The spikegram is a method of analyzing a waveform based on this encoding function of the auditory system. In the proposed method, we analyze the signal using the spikegram and extract a feature vector composed of the key information for genre classification, which is then used as the input to the neural network. We measure music genre classification performance on the GTZAN dataset, which consists of 10 music genres, and confirm that the proposed method provides good performance with a low-dimensional feature vector compared to current state-of-the-art methods.
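As background for readers unfamiliar with the spikegram: the auditory-inspired sparse coding it builds on (Smith and Lewicki, reference 4) represents a waveform as a list of "spikes," each a gammatone kernel placed at a time with an amplitude, found by greedy matching pursuit. The sketch below is a minimal illustration of that idea, not the authors' implementation; the kernel parameters, spike budget, and helper names are assumptions.

```python
import numpy as np

def gammatone(fc, fs=16000, dur=0.03, order=4, b=1.019):
    """Unit-norm gammatone kernel at center frequency fc (ERB bandwidth
    approximation); parameter values are illustrative assumptions."""
    t = np.arange(int(dur * fs)) / fs
    erb = 24.7 * (4.37 * fc / 1000.0 + 1.0)
    g = t ** (order - 1) * np.exp(-2 * np.pi * b * erb * t) * np.cos(2 * np.pi * fc * t)
    return g / np.linalg.norm(g)

def spikegram(x, kernels, n_spikes=100):
    """Greedy matching pursuit: repeatedly pick the (kernel, shift) pair with
    the largest correlation against the residual and subtract its projection.
    Returns the spike list [(kernel index, time, amplitude)] and the residual."""
    residual = np.asarray(x, dtype=float).copy()
    spikes = []
    for _ in range(n_spikes):
        best = None
        for k, g in enumerate(kernels):
            corr = np.correlate(residual, g, mode='valid')  # inner products at all shifts
            i = int(np.argmax(np.abs(corr)))
            if best is None or abs(corr[i]) > abs(best[2]):
                best = (k, i, corr[i])
        k, t0, a = best
        residual[t0:t0 + len(kernels[k])] -= a * kernels[k]
        spikes.append((k, t0, a))
    return spikes, residual
```

The resulting spike list is the time-frequency code from which a low-dimensional feature vector for the classifier could be derived, e.g. by aggregating spike counts and amplitudes per kernel.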
Keywords
music genre; genre classification; spikegram; deep neural network;
  • References
1. M. Patil and U. Nemade, "Music Genre Classification Using MFCC, K-NN and SVM Classifier," International Journal of Computer Engineering in Research Trends, Vol. 4, No. 2, pp. 43-47, Feb. 2017.
2. P. Manzagol, T. Bertin-Mahieux and D. Eck, "On the Use of Sparse Time-Relative Auditory Codes for Music," Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), pp. 603-608, Sep. 2008.
3. G. Tzanetakis and P. Cook, "Musical Genre Classification of Audio Signals," IEEE Transactions on Speech and Audio Processing, Vol. 10, No. 5, pp. 293-302, July 2002.
4. E. Smith and M. Lewicki, "Efficient Auditory Coding," Nature, Vol. 439, No. 7079, pp. 978-982, Feb. 2006.
5. G. Mather, Foundations of Perception, Psychology Press, 2006.
6. J. Tropp and A. Gilbert, "Signal Recovery from Random Measurements via Orthogonal Matching Pursuit," IEEE Transactions on Information Theory, Vol. 53, No. 12, Dec. 2007.
7. N. Srivastava, G. Hinton, A. Krizhevsky and R. Salakhutdinov, "Dropout: A Simple Way to Prevent Neural Networks from Overfitting," Journal of Machine Learning Research, Vol. 15, No. 1, pp. 1929-1958, June 2014.
8. M. Henaff, K. Jarrett, K. Kavukcuoglu and Y. LeCun, "Unsupervised Learning of Sparse Features for Scalable Audio Classification," Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), pp. 681-686, Sep. 2011.
9. S. H. Kim, D. S. Kim and B. W. Suh, "Music Genre Classification Using Multimodal Deep Learning," Proceedings of Human Computer Interaction Korea, pp. 389-395, Jan. 2016.
10. D. Bhalke, B. Rajesh and D. Bormane, "Automatic Genre Classification Using Fractional Fourier Transform Based Mel Frequency Cepstral Coefficient and Timbral Features," Archives of Acoustics, Vol. 42, No. 2, pp. 213-222, 2017.