[KSCI] Korea Science Citation Index Service

A Study on the Signal Processing for Content-Based Audio Genre Classification

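윤원중 (Department of Computer Science and Statistics, Dankook University)
이강규 (Department of Computer Science and Statistics, Dankook University)
박규식 (Department of Computer Science and Statistics, Dankook University)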

Publication Information

Journal of the Institute of Electronics Engineers of Korea SP, v.41, no.6, 2004, pp. 271-278

Abstract

In this paper, we propose a content-based audio genre classification algorithm that automatically classifies a query audio clip into one of five genres (Classic, Hiphop, Jazz, Rock, and Speech) using a digital signal processing approach. A 20-second query audio file is segmented into 23 ms frames with a non-overlapping Hamming window, and a 54-dimensional feature vector, including Spectral Centroid, Rolloff, Flux, LPC, and MFCC, is extracted from each query. The k-NN, Gaussian, and GMM classifiers are used for classification. To choose the optimum features from the 54-dimensional feature vector, the SFS (Sequential Forward Selection) method is applied to select a 10-dimensional optimum feature set, which is then used for genre classification. The experimental results verify the superior performance of the proposed method, which achieves a success rate of nearly 90% for genre classification, a 10% to 20% improvement over previous methods. To reflect an actual user environment, the feature vector is also extracted from a random interval of the query audio; this yields an overall success rate of 80%, except in the extreme cases where the interval falls in the beginning or ending portion of the query audio file.
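The pipeline described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes NumPy only, covers three of the listed features (Spectral Centroid, Rolloff, Flux) and omits LPC and MFCC, uses an 85% rolloff threshold, and scores SFS candidates by the leave-one-out accuracy of a 3-NN classifier. The abstract does not specify these choices, and all function names here are hypothetical.

```python
import numpy as np

def frame_signal(x, sr, frame_ms=23.0):
    """Split x into non-overlapping frames and apply a Hamming window."""
    n = int(round(sr * frame_ms / 1000.0))    # samples per frame
    n_frames = len(x) // n                    # incomplete tail is dropped
    frames = x[: n_frames * n].reshape(n_frames, n)
    return frames * np.hamming(n)

def spectral_features(frames, sr, rolloff_pct=0.85):
    """Per-frame spectral centroid (Hz), rolloff (Hz), and flux."""
    mag = np.abs(np.fft.rfft(frames, axis=1))
    freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
    total = mag.sum(axis=1) + 1e-12           # guard against silent frames
    centroid = (mag * freqs).sum(axis=1) / total
    # Rolloff: lowest frequency below which rolloff_pct of the magnitude lies.
    cum = np.cumsum(mag, axis=1)
    idx = (cum >= rolloff_pct * total[:, None]).argmax(axis=1)
    rolloff = freqs[idx]
    # Flux: squared change between successive normalized spectra (assumed form).
    norm = mag / total[:, None]
    flux = np.r_[0.0, (np.diff(norm, axis=0) ** 2).sum(axis=1)]
    return np.column_stack([centroid, rolloff, flux])

def knn_loo_accuracy(X, y, k=3):
    """Leave-one-out accuracy of k-NN with Euclidean distance.

    y must hold non-negative integer class labels.
    """
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)               # a sample cannot vote for itself
    nn = np.argsort(d, axis=1)[:, :k]
    pred = np.array([np.bincount(y[row]).argmax() for row in nn])
    return float((pred == y).mean())

def sfs(X, y, n_select, k=3):
    """Sequential Forward Selection: greedily add the best-scoring feature."""
    selected = []
    remaining = list(range(X.shape[1]))
    while len(selected) < n_select:
        scores = [knn_loo_accuracy(X[:, selected + [j]], y, k) for j in remaining]
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected
```

In this sketch, a clip-level vector could be formed by taking, for example, the mean and standard deviation of each per-frame feature. Calling `sfs(X, y, 10)` on a matrix `X` of 54-dimensional clip vectors with integer genre labels `y` would then return the indices of 10 selected features, mirroring the reduction from 54 to 10 dimensions described above.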

Keywords

Audio Genre Classification; Audio Information Retrieval; Audio Signal Processing; SFS; k-NN

Citations & Related Records

References

1	G. Tzanetakis and P. Cook, 'Musical Genre Classification of Audio Signals', IEEE Transactions on Speech and Audio Processing, 2002
2	A. Ghias, J. Logan, D. Chamberlin, and B. Smith, 'Query by Humming: Musical Information Retrieval in an Audio Database', ACM Multimedia, pp. 213-236, 1995
3	M. Melucci and N. Orio, 'Musical Information Retrieval using Melodic Surface', Proceedings of the Fourth ACM Conference on Digital Libraries, pp. 152-160, August 1999
4	R. J. McNab, L. Smith, I. H. Witten, and C. L. Henderson, 'Tune Retrieval in the Multimedia Library', Multimedia Tools and Applications, Vol. 10, pp. 113-132, 2000
5	L. Prechelt and R. Typke, 'An Interface for Melody Input', ACM Transactions on Computer-Human Interaction, Vol. 8, No. 2, pp. 133-149, June 2001
6	S. R. Subramanya, A. Youssef, B. Narahari, and R. Simha, 'Automated Classification of Audio Data and Retrieval Based on Audio Classes', International Conference on Computers and Their Applications (ISCA), Cancun, Mexico, April 1999
7	J. M. Grey, 'An Exploration of Musical Timbre', PhD thesis, Dept. of Psychology, Stanford University, 1975
8	M. J. Carey, E. S. Parris, and H. Lloyd-Thomas, 'A comparison of features for speech, music discrimination', Proc. ICASSP, pp. 1432-1436, March 1999
9	J. Makhoul, 'Linear prediction: A tutorial overview', Proceedings of the IEEE, April 1975
10	M. Slaney, 'A critique of pure audition', Computational Auditory Scene Analysis, 1997
11	E. Wold, T. Blum, D. Keislar, and J. Wheaton, 'Content-based classification, search and retrieval of audio', IEEE Multimedia, 3(2), 1996
12	G. Tzanetakis and P. Cook, 'Multifeature audio segmentation for browsing and annotation', Proc. Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), New Paltz, NY, IEEE, 1999
13	T. Zhang and C.-C. Jay Kuo, 'Hierarchical System for Content-based Audio Classification and Retrieval', Proceedings of SPIE's Conference on Multimedia Storage and Archiving Systems III, SPIE Vol. 3527, pp. 398-409, Boston, November 1998
