http://dx.doi.org/10.5909/JBE.2013.18.2.185

Spatiotemporal Saliency-Based Video Summarization on a Smartphone  

Lee, Won Beom (School of Information and Communication Engineering, Inha University)
Williem, Williem (School of Information and Communication Engineering, Inha University)
Park, In Kyu (School of Information and Communication Engineering, Inha University)
Publication Information
Journal of Broadcast Engineering, v.18, no.2, 2013, pp. 185-195
Abstract
In this paper, we propose a video summarization technique for smartphones based on spatiotemporal saliency. The proposed technique detects scene changes by computing the difference between color histograms, which is robust to camera and object motion. Then, the similarity between adjacent frames, the face regions, and the frame saliency are computed to analyze the spatiotemporal saliency of a video clip. An over-segmented hierarchical tree is created from the scene changes and is updated iteratively using the mergence and maintenance energies computed during the analysis procedure. In the updated hierarchical tree, segmented frames are extracted by applying a greedy algorithm to nodes with high saliency, subject to the reduction ratio and the minimum interval requested by the user. Experimental results show that the proposed method summarizes a 2-minute video in about 10 seconds on a commercial smartphone. The summarization quality is superior to that of the commercial video editing software Muvee.
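Two of the steps named in the abstract, histogram-based scene change detection and greedy segment selection, can be illustrated with a minimal sketch. This is not the authors' implementation: the bin count, the half-L1 histogram distance, the threshold, and the reading of "minimum interval" as a minimum segment length are all assumptions made for illustration, and the function names are hypothetical. The hierarchical tree and the mergence and maintenance energies are not modeled here.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]   # 8-bit (R, G, B) values, assumed
Frame = Sequence[Pixel]        # a frame as a flat sequence of pixels


def color_histogram(frame: Frame, bins: int = 4) -> List[float]:
    """Normalized joint RGB histogram with `bins` bins per channel."""
    def bin_of(value: int) -> int:
        # Clamp so that any bin count maps 0..255 into 0..bins-1.
        return min(value * bins // 256, bins - 1)

    hist = [0.0] * (bins ** 3)
    for r, g, b in frame:
        hist[(bin_of(r) * bins + bin_of(g)) * bins + bin_of(b)] += 1.0
    total = float(len(frame)) or 1.0
    return [h / total for h in hist]


def histogram_distance(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Half the L1 distance between normalized histograms, in [0, 1]."""
    return 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


def detect_scene_changes(frames: Sequence[Frame], threshold: float = 0.5) -> List[int]:
    """Indices i at which frame i differs strongly from frame i - 1."""
    hists = [color_histogram(f) for f in frames]
    return [i for i in range(1, len(hists))
            if histogram_distance(hists[i - 1], hists[i]) > threshold]


def greedy_select(segments: Sequence[Tuple[int, int, float]],
                  total_frames: int,
                  reduction_ratio: float,
                  min_length: int) -> List[Tuple[int, int]]:
    """Pick segments (start, end, saliency), end exclusive, by descending
    saliency until the frame budget total_frames * reduction_ratio is used.
    Segments shorter than min_length are skipped (assumed interpretation)."""
    budget = int(total_frames * reduction_ratio)
    chosen, used = [], 0
    for start, end, _ in sorted(segments, key=lambda s: s[2], reverse=True):
        length = end - start
        if length < min_length or used + length > budget:
            continue
        chosen.append((start, end))
        used += length
    return sorted(chosen)  # restore temporal order


# Toy clip: three all-red frames followed by two all-blue frames.
red = [(255, 0, 0)] * 16
blue = [(0, 0, 255)] * 16
clip = [red, red, red, blue, blue]
cuts = detect_scene_changes(clip)                       # [3]
summary = greedy_select([(0, 3, 0.2), (3, 5, 0.9)],
                        total_frames=5, reduction_ratio=0.5, min_length=1)
# summary == [(3, 5)]
```

A histogram discards pixel positions, which is why moderate camera or object motion within a scene changes the distance only slightly, while a cut to different content changes it sharply.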
Keywords
video summary; video segmentation; scene change detection; saliency; smartphone