A Study on Various Attention for Improving Performance in Single Image Super Resolution

  • Hwanbok Moon (Department of Computer Science, Kookmin University)
  • Sang Min Yoon (College of Computer Science, Kookmin University)
  • Received : 2020.09.17
  • Accepted : 2020.10.21
  • Published : 2020.11.30

Abstract

Single-image super-resolution has long been studied in computer vision because of its wide range of applications. Various deep learning-based super-resolution algorithms have recently been introduced to improve performance by reducing artifacts such as blurring and staircase effects. Most deep learning-based approaches have focused on the design of the network architecture, the loss function, and the training strategy. Meanwhile, several approaches using attention modules, which emphasize the extracted features, have been introduced to enhance network performance without additional layers. An attention module emphasizes or rescales the feature map to suit the purpose of the network from various perspectives. In this paper, we propose various channel attention and spatial attention designs for single-image super-resolution and analyze the results and performance according to the architecture of the attention module. We also explore the design of multi-attention modules that emphasize features efficiently from various perspectives.
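
The paper itself does not include source code, so the following is a minimal PyTorch sketch of the squeeze-and-excitation style of channel attention the abstract refers to. The class name ChannelAttention and the reduction ratio of 16 are illustrative assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class ChannelAttention(nn.Module):
        """SE-style channel attention (an assumed, illustrative design):
        global-average-pool each channel, pass the result through a small
        bottleneck MLP, and rescale the feature map channel by channel."""
        def __init__(self, channels: int, reduction: int = 16):
            super().__init__()
            self.pool = nn.AdaptiveAvgPool2d(1)  # squeeze: (B, C, H, W) -> (B, C, 1, 1)
            self.mlp = nn.Sequential(            # excitation: bottleneck MLP
                nn.Conv2d(channels, channels // reduction, kernel_size=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels // reduction, channels, kernel_size=1),
                nn.Sigmoid(),                    # per-channel weights in (0, 1)
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return x * self.mlp(self.pool(x))    # channel-wise rescaling of x

Applied to a (B, C, H, W) feature map, the module returns a tensor of the same shape with each channel scaled by a learned weight in (0, 1), e.g. ChannelAttention(64)(torch.randn(1, 64, 48, 48)).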

Because of the importance and extensibility of single-image super-resolution in computer vision, it has been studied extensively in related fields, and with the recent surge of interest in deep learning, deep learning-based single-image super-resolution research has been actively pursued. Most deep learning-based single-image super-resolution studies have focused on the network architecture, loss function, and training method to improve restoration performance. Meanwhile, attention modules that emphasize extracted feature maps have been applied in various fields to improve super-resolution performance without stacking deeper networks. An attention module emphasizes and rescales the feature information relevant to the network's purpose from various perspectives. In this paper, we design channel attention and spatial attention modules of various structures on top of a super-resolution network, design multi-attention module structures that emphasize feature maps from various perspectives, and analyze and compare their performance.
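
To make the spatial and multi-attention ideas concrete, the sketch below combines a CBAM-style spatial attention with the ChannelAttention class from the sketch above inside a residual block. The serial channel-then-spatial ordering, the 7x7 kernel, and the block layout are assumptions chosen for illustration, not the specific designs compared in the paper.

    class SpatialAttention(nn.Module):
        """CBAM-style spatial attention (assumed design): summarize the
        channels with average- and max-pooling, then predict one weight
        per spatial position with a single convolution."""
        def __init__(self, kernel_size: int = 7):
            super().__init__()
            self.conv = nn.Conv2d(2, 1, kernel_size, padding=kernel_size // 2)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            avg = x.mean(dim=1, keepdim=True)          # (B, 1, H, W)
            mx = x.max(dim=1, keepdim=True).values     # (B, 1, H, W)
            weights = torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))
            return x * weights                         # per-pixel rescaling

    class MultiAttentionBlock(nn.Module):
        """One possible multi-attention layout: two 3x3 convolutions,
        channel attention, then spatial attention, with a residual
        (skip) connection around the whole block."""
        def __init__(self, channels: int):
            super().__init__()
            self.body = nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, channels, 3, padding=1),
            )
            self.ca = ChannelAttention(channels)  # from the sketch above
            self.sa = SpatialAttention()

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return x + self.sa(self.ca(self.body(x)))  # residual connection

Stacking such blocks in a residual backbone and varying where each attention type is applied is one way to realize the "various perspectives" the abstract describes.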

Acknowledgement

This work was supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. 2020-0-01826, AI 기반 선도적 실전문제해결 연구인재 양성). This work was also supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIP) (No. 2015R1A5A7037615). This research was also a result of a study on the "HPC Support" project, supported by the Ministry of Science and ICT and NIPA.
