A Best View Selection Method in Videos of Interested Player Captured by Multiple Cameras

  • Ho-Tak Hong (Dept. of Computer Science and Engineering, Sogang University)
  • Gi-Mun Um (Smart Media Research Group, ETRI)
  • Jong-Ho Nang (Dept. of Computer Science and Engineering, Sogang University)
  • Received : 2017.06.05
  • Accepted : 2017.08.22
  • Published : 2017.12.15

Abstract

In recent years, the number of cameras used to record and broadcast live sporting events has increased, which makes it difficult to instantly pick the best shot from the many available views. Automatic best view selection for sports footage has been studied, but existing approaches assume that the background of the video is fixed; this paper therefore proposes a best view selection method for footage in which the background moves. A player of interest is tracked in the video from each camera, and within the tracked player region four criteria are quantified for every frame: the player's amount of activity, the visibility of the player's face, the degree of overlap with other players, and the degree of image blur. The best view is then selected on the basis of these quantified values. For evaluation, 20 non-specialist viewers were each asked to pick the best and worst views, and their choices were compared with the views selected by the proposed method. Agreement with the most frequently chosen best views was 54.5%, while agreement with the most frequently chosen worst views was only 9%, showing that the proposed method rarely selects views that viewers clearly dislike.
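
The abstract above summarizes the selection procedure: for every frame, the tracked player-of-interest region in each camera view is reduced to four quantified criteria, and the view with the best combined value is chosen. Below is a minimal sketch of that per-frame scoring and selection step. It assumes the four criteria have already been measured and normalized for each frame; the FrameScores fields, the weights, and the function names are illustrative placeholders rather than the paper's actual formulation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FrameScores:
    """Hypothetical per-frame measurements of the tracked player region in one camera view."""
    activity: float         # amount of motion of the player of interest (higher is better)
    face_visibility: float  # visibility of the player's face, 0..1 (higher is better)
    occlusion: float        # overlap with other players, 0..1 (lower is better)
    blur: float             # degree of image blur, 0..1 (lower is better)


def view_score(s: FrameScores,
               w_activity: float = 1.0, w_face: float = 1.0,
               w_occlusion: float = 1.0, w_blur: float = 1.0) -> float:
    """Combine the four criteria into one score; the weights are illustrative, not from the paper."""
    return (w_activity * s.activity + w_face * s.face_visibility
            - w_occlusion * s.occlusion - w_blur * s.blur)


def best_view_per_frame(per_camera: List[List[FrameScores]]) -> List[int]:
    """Return, for each frame index, the camera whose view scores highest.

    per_camera[c][t] holds the measurements of camera c at frame t; all cameras
    are assumed to be synchronized and to have the same number of frames.
    """
    num_frames = len(per_camera[0])
    best = []
    for t in range(num_frames):
        scores = [view_score(cam[t]) for cam in per_camera]
        best.append(max(range(len(scores)), key=scores.__getitem__))
    return best


if __name__ == "__main__":
    # Two synchronized cameras, two frames of made-up measurements.
    cam0 = [FrameScores(0.8, 0.9, 0.1, 0.2), FrameScores(0.4, 0.2, 0.6, 0.5)]
    cam1 = [FrameScores(0.5, 0.3, 0.4, 0.1), FrameScores(0.7, 0.8, 0.1, 0.1)]
    print(best_view_per_frame([cam0, cam1]))  # -> [0, 1]
```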

Keywords

Funding Information

Research project : Development of a multi-modality based media application framework that enables reconfiguration through connection, sharing, and combination of personal media

Funding agency : Institute for Information & Communications Technology Promotion (IITP)
