[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.3837/tiis.2019.10.017

Football match intelligent editing system based on deep learning

Wang, Bin (Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Joint International Research Laboratory of Specialty Fiber Optics and Advanced Communication, Shanghai Institute of Advanced Communication and Data Science, Shanghai University)
Shen, Wei (Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Joint International Research Laboratory of Specialty Fiber Optics and Advanced Communication, Shanghai Institute of Advanced Communication and Data Science, Shanghai University)
Chen, FanSheng (Key Laboratory of Intelligent Infrared Perception, Chinese Academy of Sciences)
Zeng, Dan (Key Laboratory of Specialty Fiber Optics and Optical Access Networks, Joint International Research Laboratory of Specialty Fiber Optics and Advanced Communication, Shanghai Institute of Advanced Communication and Data Science, Shanghai University)

Publication Information

KSII Transactions on Internet and Information Systems (TIIS) / v.13, no.10, 2019, pp. 5130-5143

Abstract

Football (soccer) is one of the most popular sports in the world, and a huge number of people watch live matches on TV or over the Internet. A match lasts 90 minutes, but many viewers want to watch only a few highlights to save time. As far as we know, no product in practical use performs intelligent highlight extraction from live football matches. In this paper, we propose an intelligent editing system for live football matches. The system automatically extracts a series of highlights, such as goals, shots, corner kicks, red and yellow cards, and appearances of star players, from the live stream of a match. It was integrated into live streaming platforms during the 2018 FIFA World Cup and performed fairly well.

Keywords

live football match; intelligent editing system; highlight extraction; deep learning; object detection;

Citations & Related Records

Reference

1	Boreczky J S, Rowe L A, "Comparison of Video Shot Boundary Detection Techniques," Journal of Electronic Imaging, 5(2), 32-38, 1996.
2	H. Ueda, T. Miyatake, and S. Yoshizawa, "IMPACT: an interactive natural-motion-picture dedicated multimedia authoring system," in Proc. of CHI, ACM, New York, pp. 343-350, 1991.
3	Tian, Z., Huang, W., He, T., He, P., & Qiao, Y., "Detecting text in natural image with connectionist text proposal network," in Proc. of European Conference on Computer Vision, 56-72, 2016.
4	Smith R, "An Overview of the Tesseract OCR Engine," in Proc. of International Conference on Document Analysis and Recognition. IEEE Computer Society, 629-633, 2007.
5	Deng J, Guo J, Zafeiriou S, "ArcFace: Additive Angular Margin Loss for Deep Face Recognition," 2018.
6	Zhang K, Zhang Z, Li Z, et al., "Joint Face Detection and Alignment Using Multitask Cascaded Convolutional Networks," IEEE Signal Processing Letters, 23(10), 1499-1503, 2016.
7	Liu W, Anguelov D, Erhan D, et al., "SSD: Single Shot MultiBox Detector," in Proc. of European Conference on Computer Vision. Springer International Publishing, 21-37, 2016.
8	Carreira J, Zisserman A, "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset," Computer Vision and Pattern Recognition. IEEE, 4724-4733, 2017.
9	Hu W, Xie N, Li L, et al., "A Survey on Visual Content-Based Video Indexing and Retrieval," IEEE Transactions on Systems Man and Cybernetics Part C (Applications and Reviews), 41(6), 797-819, 2011.
10	Datta R, Joshi D, Li J, et al., "Image retrieval: Ideas, influences, and trends of the new age," ACM Computing Surveys, 40(2), 1-60, 2008.
11	Lew M S, "Content-based Multimedia Information Retrieval: State of the art and challenges," ACM Transactions on Multimedia Computing Communications & Applications, 2(1), 1-19, 2006.
12	A. Ekin and M. Tekalp, "Generic play-break event detection for summarization and hierarchical sports video analysis," in Proc. of 2003 International Conference on Multimedia and Expo (ICME '03), Baltimore, MD, USA, pp. 169-172, 2003.
13	Pan H, Van Beek P, Sezan M I, "Detection of slow-motion replay segments in sports video for highlights generation," in Proc. of IEEE International Conference on Acoustics, Speech, and Signal Processing. IEEE Computer Society, 2001.
14	Pan H, Li B, Sezan M I, "Automatic detection of replay segments in broadcast sports programs by detection of logos in scene transitions," in Proc. of IEEE International Conference on Acoustics. IEEE, 2002.
15	Ekin A, Tekalp A M, Mehrotra R, "Automatic soccer video analysis and summarization," IEEE Transactions on Image Processing, 12(7), 796-807, 2003.
16	Ancona, N., Cicirelli, G., Branca, A., Distante, A, "Goal detection in football by using support vector machines for classification," in Proc. of International Joint Conference on Neural Networks, Vol. 1, 15-19, 2001.
17	Zawbaa H M, El-Bendary N, Ella Hassanien A, Kim T, "Event detection based approach for soccer video summarization using machine learning," International Journal of Multimedia and Ubiquitous Engineering, Vol. 7, no. 2, pp 63-80, April 2012.
18	Fendri E, Ben-Abdallah H, Ben Hamadou A, "A novel approach for soccer video summarization," in Proc. of Second International Conference on Multimedia and Information Technology (MMIT 2010), Kaifeng, China, pp. 138-141, April 24 - 25, 2010.
19	Lotfi E, Pourreza H R, "Event detection and automatic summarization in soccer video," in Proc. of 4th Iranian Conference on Machine Vision and Image Processing (MVIP07), Mashhad, Iran, 2007.
20	Schoeffmann K, Hopfgartner F, Marques O, et al., "Video browsing interfaces and applications: a review," SPIE Reviews, 1(1), 018004, 2010.
21	Tabii Y, Oulad Haj Thami R, "A new method for soccer video summarizing based on shot detection, classification and finite state machine," in Proc. of 5th International Conference: Sciences of Electronic, Technologies of Information and Telecommunications (SETIT 2009), Hammamet, Tunisia, pp. 7-11, March 22-26, 2009.
22	Girshick R, Donahue J, Darrell T, et al., "Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation," in Proc. of 2014 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, 2014.
23	Redmon J, Farhadi A, "YOLO9000: Better, Faster, Stronger," in Proc. of 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, 6517-6525, 2017.
24	He K, Zhang X, Ren S, et al., "Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition," IEEE Transactions on Pattern Analysis & Machine Intelligence, 37(9), 1904-1916, 2015.
25	Ren S, He K, Girshick R, et al., "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks," IEEE Transactions on Pattern Analysis & Machine Intelligence, 39(6), 1137-1149, 2017.
26	Redmon J, Divvala S, Girshick R, et al., "You Only Look Once: Unified, Real-Time Object Detection," 2015.
27	Redmon J, Farhadi A, "YOLOv3: An Incremental Improvement," 2018.
28	Wang H, Schmid C, "Action recognition with improved trajectories," in Proc. of the IEEE international conference on computer vision, 3551-3558, 2013.
29	Simonyan K, Zisserman A, "Two-stream convolutional networks for action recognition in videos," Advances in neural information processing systems, 568-576, 2014.
30	Feichtenhofer C, Pinz A, Zisserman A, "Convolutional two-stream network fusion for video action recognition," in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, 1933-1941, 2016.
31	Tran D, Bourdev L, Fergus R, et al., "Learning spatiotemporal features with 3d convolutional networks," in Proc. of the IEEE international conference on computer vision, 4489-4497, 2015.
32	Yan, R., Tang, J., Shu, X., Li, Z., & Tian, Q, "Participation-Contributed Temporal Dynamic Model for Group Activity Recognition," in Proc. of the 26th ACM international conference on Multimedia, 1292-1300, 2018.
33	Qiu Z, Yao T, Mei T, "Learning spatio-temporal representation with pseudo-3d residual networks," in Proc. of 2017 IEEE International Conference on Computer Vision (ICCV), IEEE, 5534-5542, 2017.