http://dx.doi.org/10.5909/JBE.2020.25.2.143

Video Highlight Prediction Using GAN and Multiple Time-Interval Information of Audio and Image  

Lee, Hansol (Dept. of Media IT Engineering, Graduate School, Seoul National University of Science and Technology)
Lee, Gyemin (Dept. of Media IT Engineering, Graduate School, Seoul National University of Science and Technology)
Publication Information
Journal of Broadcast Engineering / v.25, no.2, 2020, pp. 143-150
Abstract
A huge amount of content is uploaded to streaming platforms every day, and game and sports videos account for a large portion of it. Broadcasting companies sometimes create and provide highlight videos, but doing so is time-consuming and costly. In this paper, we propose models that automatically predict highlights in game and sports matches. While most previous approaches rely exclusively on visual information, our models use both audio and visual information and present a way to capture the short-term and long-term flows of videos. We also describe models that incorporate a GAN to find better highlight features. The proposed models are evaluated on e-sports and baseball videos.
Keywords
Video highlight; Multimodal model; GAN; Multiple time-interval model; Audio information
Citations & Related Records
Times Cited By KSCI: 4