http://dx.doi.org/10.7776/ASK.2015.34.3.247

A Study on the Performance of Music Retrieval Based on the Emotion Recognition  

Seo, Jin Soo (Department of Electronic Engineering, Gangneung-Wonju National University)
Abstract
This paper presents a study on the performance of music search based on automatically recognized music-emotion labels. As with other media data, such as speech, images, and video, a song can evoke certain emotions in its listeners. When people look for songs to listen to, the emotions evoked by those songs can be an important consideration. However, little work has been done on how music-emotion labels contribute to music search. In this paper, we utilize the three axes of human music perception (valence, activity, tension) and the five basic emotion labels (happiness, sadness, tenderness, anger, fear) in measuring music similarity for music search. Experiments were conducted on both genre and singer datasets. The search accuracy of the proposed emotion-based music search reached up to 75 % of that of the conventional feature-based music search. By combining the proposed emotion-based method with the feature-based method, we achieved up to a 14 % improvement in search accuracy.
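To make the retrieval scheme concrete, the following minimal Python sketch scores the similarity of two songs from an eight-dimensional emotion descriptor (the three perceptual axes plus the five basic-emotion scores) and fuses it with a conventional feature-based distance through a weighted sum. The descriptor layout, the distance measures, and the fusion weight alpha are illustrative assumptions, not the paper's exact formulation.

import numpy as np

# Hypothetical emotion descriptor: three perceptual axes (valence,
# activity, tension) followed by five basic-emotion scores (happiness,
# sadness, tenderness, anger, fear), i.e. an 8-dimensional vector.
def emotion_distance(e1: np.ndarray, e2: np.ndarray) -> float:
    """Euclidean distance between two 8-D emotion descriptors."""
    return float(np.linalg.norm(e1 - e2))

def combined_distance(e1: np.ndarray, e2: np.ndarray,
                      f1: np.ndarray, f2: np.ndarray,
                      alpha: float = 0.5) -> float:
    """Weighted fusion of emotion-based and feature-based distances.

    f1 and f2 stand for conventional acoustic feature summaries
    (e.g. MFCC statistics); alpha trades off the two cues.
    """
    d_emotion = emotion_distance(e1, e2)
    d_feature = float(np.linalg.norm(f1 - f2))
    return alpha * d_emotion + (1.0 - alpha) * d_feature

# Example: rank a small catalogue against a query song.
rng = np.random.default_rng(0)
query_e, query_f = rng.random(8), rng.random(20)
songs = [(rng.random(8), rng.random(20)) for _ in range(5)]
ranked = sorted(range(len(songs)), key=lambda i: combined_distance(
    query_e, songs[i][0], query_f, songs[i][1]))

Sweeping alpha between 0 and 1 would correspond to moving between purely feature-based and purely emotion-based search, which is the comparison the experiments above report.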
Keywords
Music retrieval; Music emotion recognition; Music similarity