http://dx.doi.org/10.7776/ASK.2009.28.2.112

Salient Region Detection Algorithm for Music Video Browsing  

Kim, Hyoung-Gook (Department of Radio Engineering, Kwangwoon University)
Shin, Dong (Department of Radio Engineering, Kwangwoon University)
Abstract
This paper proposes a fast salient region detection algorithm for music video browsing systems, which can be applied to mobile devices and digital video recorders (DVRs). The input music video is decomposed into its music and video tracks. For the music track, the music highlight, which includes the chorus, is detected by structure analysis using energy-based peak position detection. Using emotion models generated by an SVM-AdaBoost learning algorithm, the music signal of each music video is automatically classified into one of the predefined emotion classes. For the video track, face scenes showing the singer or an actor/actress are detected with a boosted cascade of simple features. Finally, the salient region is generated by aligning the boundaries of the music highlight and the visual face scene. Users first select their favorite music videos from those stored on the mobile device or DVR, guided by each video's emotion information, and can then quickly browse the 30-second salient region produced by the proposed algorithm. A mean opinion score (MOS) test with a database of 200 music videos is conducted to compare the detected salient region with a predefined manual part. The MOS test results show that the salient region detected by the proposed method performed much better than the predefined manual part, which involves no audiovisual processing.
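The abstract does not state how the boundaries of the music highlight and the face scenes are aligned, so the following is only a minimal sketch of one plausible rule, not the authors' method. It assumes the highlight and the face scenes are already available as (start, end) intervals in seconds, takes the highlight start and every face scene start inside the highlight as candidate boundaries, and keeps the 30-second window that overlaps most with both. The names `overlap` and `salient_region`, the scoring rule, and the example timings are all hypothetical.

```python
from typing import List, Optional, Tuple

Interval = Tuple[float, float]  # (start, end) in seconds


def overlap(a: Interval, b: Interval) -> float:
    """Length in seconds of the intersection of two intervals."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def salient_region(
    highlight: Interval,
    face_scenes: List[Interval],
    length: float = 30.0,
    duration: Optional[float] = None,
) -> Interval:
    """Pick a fixed-length window aligned to a highlight or face scene boundary.

    Assumed rule (not from the paper): score each candidate window by its
    overlap with the music highlight plus its total overlap with face scenes.
    """
    # Candidate start points: the highlight start, plus every face scene
    # start that falls inside the highlight.
    starts = [highlight[0]]
    starts += [s for s, _ in face_scenes if highlight[0] <= s <= highlight[1]]

    best_window: Optional[Interval] = None
    best_score = -1.0
    for s in sorted(starts):
        if duration is not None:
            # Keep the window inside the video.
            s = max(0.0, min(s, duration - length))
        window = (s, s + length)
        score = overlap(window, highlight)
        score += sum(overlap(window, f) for f in face_scenes)
        if score > best_score:  # strict '>' keeps the earliest window on ties
            best_window, best_score = window, score
    return best_window


# Hypothetical timings: chorus at 60-95 s, faces at 40-50 s and 70-110 s.
region = salient_region((60.0, 95.0), [(40.0, 50.0), (70.0, 110.0)])
print(region)
```

With these example timings the window starting at the second face scene wins, because it trades five seconds of highlight for ten more seconds of on-screen face. A real system would also need a policy for videos in which no face scene intersects the highlight; this sketch simply falls back to the highlight start.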
Keywords
Salient region detection; Highlight detection; Automatic music emotion classification; Face detection; Music video browsing