Data Augmentation Scheme for Semi-Supervised Video Object Segmentation

  • Hojin Kim (Department of Information and Communication Engineering, Daegu Gyeongbuk Institute of Science and Technology);
  • Donghyun Kim (Department of Information and Communication Engineering, Daegu Gyeongbuk Institute of Science and Technology);
  • Jeonghoon Kim (Department of Information and Communication Engineering, Daegu Gyeongbuk Institute of Science and Technology);
  • Sunghoon Im (Department of Information and Communication Engineering, Daegu Gyeongbuk Institute of Science and Technology)
  • Received: 2021.11.22
  • Accepted: 2021.12.29
  • Published: 2022.01.30

Abstract

The Video Object Segmentation (VOS) task requires a large amount of labeled sequence data, which limits the performance of current VOS methods trained on public datasets. To address this problem, we propose two simple yet effective data augmentation schemes for VOS. The first scheme replaces the background of a frame, i.e., everything outside the object, with the background of another image; the second reverses the temporal order of a training sequence with random probability, so that the model also learns from clips played backwards. The two schemes enable predictions that are robust to background information and improve the performance of existing VOS models without any additional data.
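Both schemes reduce to a few lines of array manipulation. The following is a minimal NumPy sketch of the two augmentations, assuming boolean object masks and uint8 frames; the function names and the flip probability p are illustrative choices, not details taken from the paper.

    import numpy as np

    def swap_background(frame, mask, bg_frame):
        # First scheme (sketch): keep the pixels on the object mask and take
        # every other pixel from a different image, so the model cannot rely
        # on background appearance cues.
        # frame, bg_frame: (H, W, 3) uint8; mask: (H, W) bool, True on the object.
        return np.where(mask[..., None], frame, bg_frame)

    def maybe_reverse(frames, masks, p=0.5, rng=None):
        # Second scheme (sketch): with probability p, play the clip backwards,
        # reversing frames and labels together so the pairing is preserved.
        rng = rng or np.random.default_rng()
        if rng.random() < p:
            return frames[::-1], masks[::-1]
        return frames, masks

    # Toy usage on random stand-in data.
    T, H, W = 5, 64, 64
    frames = np.random.randint(0, 256, (T, H, W, 3), dtype=np.uint8)
    masks = np.zeros((T, H, W), dtype=bool)
    masks[:, 16:48, 16:48] = True  # dummy square "object"
    bg = np.random.randint(0, 256, (H, W, 3), dtype=np.uint8)

    frames = np.stack([swap_background(f, m, bg) for f, m in zip(frames, masks)])
    frames, masks = maybe_reverse(frames, masks, p=0.5)

In practice the replacement background would be drawn from another training video rather than random noise, and the same temporal flip is applied to every frame-label pair of a clip.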

Acknowledgement

This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2014-3-00123, Development of High Performance Visual BigData Discovery Platform for Large-Scale Realtime Data Analysis).
