Object Segmentation Using ESRGAN and Semantic Soft Segmentation

  • Dongsik Yoon (Department of Electronics and Electrical Engineering, Korea University);
  • Noyoon Kwak (Division of Computer Engineering, Baekseok University)
  • Received: 2022.12.04
  • Accepted: 2023.01.08
  • Published: 2023.02.28

Abstract

This paper addresses object segmentation using ESRGAN (Enhanced Super-Resolution GAN) and SSS (Semantic Soft Segmentation). The object segmentation method previously proposed by the authors, which combines Mask R-CNN and SSS, generally performs well, but its segmentation performance degrades when objects are relatively small. This paper aims to resolve that problem. When the size of an object detected by Mask R-CNN is at or below a given threshold, the proposed method first performs super-resolution with ESRGAN and then applies SSS, with the aim of improving the segmentation of small objects. The proposed method was confirmed to improve the segmentation of small objects more effectively than the previous method.
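The control flow described in the abstract can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: object size is assumed to be the bounding-box area, the threshold `min_area` is a hypothetical parameter, and the ESRGAN and SSS stages are passed in as callables (`super_resolve`, `soft_segment`) and replaced here by toy stand-ins so that the sketch runs without any trained model.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]        # rows of pixel values; stand-in for a real image array
Box = Tuple[int, int, int, int]  # (left, top, right, bottom), right/bottom exclusive


@dataclass
class Detection:
    """One detected object; in the paper this would come from Mask R-CNN."""
    box: Box


@dataclass
class SegmentationResult:
    detection: Detection
    upscaled: bool  # True if the patch was super-resolved before matting
    matte: Image    # soft mask produced by the segmentation stage


def crop(image: Image, box: Box) -> Image:
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def box_area(box: Box) -> int:
    left, top, right, bottom = box
    return max(0, right - left) * max(0, bottom - top)


def segment_objects(
    image: Image,
    detections: Sequence[Detection],
    super_resolve: Callable[[Image], Image],  # assumed ESRGAN wrapper
    soft_segment: Callable[[Image], Image],   # assumed SSS wrapper
    min_area: int,                            # hypothetical size threshold
) -> List[SegmentationResult]:
    """Super-resolve small objects first, then run soft segmentation on every object."""
    results = []
    for det in detections:
        patch = crop(image, det.box)
        is_small = box_area(det.box) <= min_area
        if is_small:
            patch = super_resolve(patch)
        results.append(SegmentationResult(det, is_small, soft_segment(patch)))
    return results


def nearest_upscale_2x(patch: Image) -> Image:
    """Toy stand-in for ESRGAN: 2x nearest-neighbour upscaling."""
    out = []
    for row in patch:
        wide = [value for value in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def passthrough_matte(patch: Image) -> Image:
    """Toy stand-in for SSS: returns a copy of the patch as the 'soft mask'."""
    return [list(row) for row in patch]
```

With these stand-ins, a call such as `segment_objects(image, detections, nearest_upscale_2x, passthrough_matte, min_area=16)` routes only the detections whose box area is at most 16 pixels through the upscaling step. The matte of an upscaled patch is at the higher resolution; how it is mapped back to the original image is not described in the abstract and is left out of this sketch.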

Acknowledgement

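This work was carried out as a research project (2021RIS-004) of the local government-university cooperation-based Regional Innovation project, supported by the National Research Foundation of Korea with funding from the Ministry of Education in 2022.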
