Study on the Video Stabilizer based on a Triplet CNN and Training Dataset Synthesis

  • Received : 2020.03.16
  • Accepted : 2020.04.13
  • Published : 2020.05.30

Abstract

Jitter in digital video lowers visibility and degrades the efficiency of image processing and image compression. Deep learning is now widely applied in digital image processing, but its application to video stabilization is still at an early stage. In this paper, we propose a triplet CNN-based video stabilizer architecture designed to reduce wobbling distortion, together with a method for synthesizing the training data needed to train the stabilizer. Compared with a conventional deep-learning-based video stabilizer, the proposed stabilizer showed less wobbling distortion and more stable training.
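The abstract does not describe how the training data are synthesized, so the following is only a minimal sketch of the general idea of building (shaky, stable) training pairs from a stable clip. It assumes a purely translational jitter model with independent random integer shifts per frame, which is an assumption made for illustration and not the method of the paper. The names `crop`, `synthesize_shaky_clip`, `max_shift`, and `seed` are hypothetical.

```python
import random


def crop(frame, top, left, height, width):
    """Return a height x width window of a frame stored as a list of rows."""
    return [row[left:left + width] for row in frame[top:top + height]]


def synthesize_shaky_clip(stable, max_shift=4, seed=0):
    """Build (shaky, target, offsets) from a stable clip.

    `stable` is a list of frames; each frame is a list of rows of pixel values.
    ASSUMPTION: jitter is modeled as an independent random integer translation
    per frame, which is far simpler than real hand-held camera motion.
    """
    rng = random.Random(seed)
    height, width = len(stable[0]), len(stable[0][0])
    # Outputs shrink by the margin so every shifted window stays in bounds.
    out_h, out_w = height - 2 * max_shift, width - 2 * max_shift

    shaky, target, offsets = [], [], []
    for frame in stable:
        dy = rng.randint(-max_shift, max_shift)  # inclusive on both ends
        dx = rng.randint(-max_shift, max_shift)
        # Shaky input: window displaced by the random offset.
        shaky.append(crop(frame, max_shift + dy, max_shift + dx, out_h, out_w))
        # Stable target: centered window of the same frame.
        target.append(crop(frame, max_shift, max_shift, out_h, out_w))
        offsets.append((dy, dx))
    return shaky, target, offsets
```

Each returned shaky frame is paired with a stable target of the same size, and the recorded offsets give the ground-truth displacement. A more realistic synthesizer would likely add rotation, scaling, and temporally correlated motion, none of which is modeled here.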
