Structural similarity based efficient keyframes extraction from multi-view videos

Hussain, Tanveer; Khan, Salman; Muhammad, Khan; Lee, Mi Young; Baik, Sung Wook

The Journal of Korean Institute of Next Generation Computing (한국차세대컴퓨팅학회논문지)

Volume 14 Issue 6 / Pages 7-14 / 2018 / 1975-681X (pISSN)

Korean Institute of Next Generation Computing (한국차세대컴퓨팅학회)

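Hussain, Tanveer (Sejong University) ;
Khan, Salman (Sejong University) ;
Muhammad, Khan (Sejong University) ;
Lee, Mi Young (Sejong University) ;
Baik, Sung Wook (Sejong University)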

Received : 2018.09.19
Accepted : 2018.12.13
Published : 2018.12.31

Abstract

Extracting salient information from multi-view videos is challenging because of inter-view and intra-view correlations and high computational complexity. Several techniques have been developed for keyframe extraction from multi-view videos, but they have very high computational complexity. In this paper, we present a keyframe extraction approach for multi-view videos that uses the entropy and complexity information present within each frame. In the first step, we extract representative shots of the whole video from each view based on the difference in the structural similarity index measure (SSIM) between frames. In the second step, entropy and complexity scores are computed for all frames of the shots in the different views. Finally, the frames with the highest entropy and complexity scores are selected as keyframes. The proposed system is subjectively evaluated on the available Office benchmark dataset, and the results are satisfactory in terms of accuracy and time complexity.
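The steps described in the abstract can be sketched for a single view as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the SSIM window, the shot threshold, the definition of "complexity", or how the two scores are combined. Here SSIM is computed over the whole frame in one window, `threshold=0.5` is a hypothetical cut-off, complexity is assumed to be the mean gradient magnitude, and the two scores are simply summed. All function names are illustrative.

```python
import numpy as np

def global_ssim(a, b, data_range=255.0):
    """SSIM computed over the whole frame as one window (a simplification
    of the usual sliding-window SSIM)."""
    a = a.astype(np.float64)
    b = b.astype(np.float64)
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    num = (2 * mu_a * mu_b + c1) * (2 * cov + c2)
    den = (mu_a ** 2 + mu_b ** 2 + c1) * (a.var() + b.var() + c2)
    return num / den

def split_into_shots(frames, threshold=0.5):
    """Group frame indices into shots; a new shot starts when the SSIM
    between consecutive frames falls below the (assumed) threshold."""
    if len(frames) == 0:
        return []
    shots, current = [], [0]
    for i in range(1, len(frames)):
        if global_ssim(frames[i - 1], frames[i]) < threshold:
            shots.append(current)
            current = []
        current.append(i)
    shots.append(current)
    return shots

def entropy(frame):
    """Shannon entropy (bits) of the 8-bit grayscale histogram."""
    hist = np.bincount(frame.ravel(), minlength=256).astype(np.float64)
    p = hist[hist > 0] / hist.sum()
    return float(-(p * np.log2(p)).sum())

def complexity(frame):
    """Assumed complexity measure: mean gradient magnitude."""
    gy, gx = np.gradient(frame.astype(np.float64))
    return float(np.hypot(gx, gy).mean())

def keyframes(frames, threshold=0.5):
    """Return one frame index per shot: the frame with the highest
    entropy + complexity score (assumed combination)."""
    score = lambda i: entropy(frames[i]) + complexity(frames[i])
    return [max(shot, key=score) for shot in split_into_shots(frames, threshold)]
```

Each frame is expected to be a 2-D `uint8` grayscale array. For a multi-view recording, `keyframes` would be called once per view; how candidates from different views are compared or merged is not specified in the abstract and is left out of this sketch.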

Acknowledgement

Supported by : National Research Foundation of Korea (NRF)
