A Study on the Signal Processing for Content-Based Audio Genre Classification  

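윤원중 (Dankook University, Department of Computer Science and Statistics)
이강규 (Dankook University, Department of Computer Science and Statistics)
박규식 (Dankook University, Department of Computer Science and Statistics)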
Abstract
In this paper, we propose a content-based audio genre classification algorithm that uses digital signal processing to automatically classify a query audio clip into one of five genres: Classic, Hiphop, Jazz, Rock, and Speech. A 20-second query audio file is segmented into 23 ms frames with non-overlapping Hamming windows, and a 54-dimensional feature vector, including Spectral Centroid, Rolloff, Flux, LPC, and MFCC, is extracted from each query. k-NN, Gaussian, and GMM classifiers are used for classification. To choose the optimum features from the 54-dimensional feature vector, the SFS (Sequential Forward Selection) method is applied to select a 10-dimensional optimum feature set, which is then used for genre classification. The experimental results verify the superior performance of the proposed method, which achieves a success rate of nearly 90% for genre classification, an improvement of 10% to 20% over previous methods. To reflect an actual user environment, the feature vector is also extracted from a random interval of the query audio; in this case the overall success rate is 80%, except in the extreme cases where the interval falls at the beginning or ending portion of the query audio file.
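The feature-selection step can be illustrated with a minimal sketch of Sequential Forward Selection wrapped around a k-NN classifier. The abstract does not state the selection criterion, the value of k, or the distance measure, so the leave-one-out accuracy score, k = 3, and Euclidean distance used here are assumptions, as are all function names. The data is synthetic (6 features, 3 classes) rather than the paper's 54 audio features and 5 genres.

```python
import math
import random


def knn_predict(train_x, train_y, query, k=3):
    """Return the majority label among the k nearest training points."""
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = {}
    for _, label in nearest[:k]:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def loo_accuracy(data, labels, feats, k=3):
    """Leave-one-out k-NN accuracy using only the feature indices in feats.

    Assumed selection criterion; the paper may score subsets differently.
    """
    proj = [[row[j] for j in feats] for row in data]
    hits = 0
    for i in range(len(proj)):
        train_x = proj[:i] + proj[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        hits += knn_predict(train_x, train_y, proj[i], k) == labels[i]
    return hits / len(proj)


def sfs(data, labels, n_select, k=3):
    """Sequential Forward Selection: start from an empty set and greedily
    add the single feature that gives the best score with those already chosen."""
    selected = []
    remaining = list(range(len(data[0])))
    while len(selected) < n_select and remaining:
        best = max(remaining,
                   key=lambda j: loo_accuracy(data, labels, selected + [j], k))
        selected.append(best)
        remaining.remove(best)
    return selected


# Synthetic stand-in for the feature matrix (hypothetical data):
# column 0 separates class "A" from the rest, column 3 separates class "C",
# and the other four columns are pure noise.
rng = random.Random(0)
data, labels = [], []
for label, c0, c3 in [("A", 0.0, 0.0), ("B", 5.0, 0.0), ("C", 5.0, 5.0)]:
    for _ in range(30):
        row = [rng.random() for _ in range(6)]
        row[0] = rng.gauss(c0, 0.3)
        row[3] = rng.gauss(c3, 0.3)
        data.append(row)
        labels.append(label)

selected = sfs(data, labels, n_select=2)
print(sorted(selected))
```

In the setting described above, the same loop would run over the 54 extracted features with `n_select=10`. Because SFS is greedy, it never revisits an earlier choice, so it can miss feature combinations that are only useful jointly.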
Keywords
Audio Genre Classification; Audio Information Retrieval; Audio Signal Processing; SFS; k-NN