CAttNet: A Compound Attention Network for Depth Estimation of Light Field Images

  • Dingkang Hua (School of Information, Mechanical and Electrical Engineering, Shanghai Normal University)
  • Qian Zhang (School of Information, Mechanical and Electrical Engineering, Shanghai Normal University)
  • Wan Liao (School of Information, Mechanical and Electrical Engineering, Shanghai Normal University)
  • Bin Wang (School of Information, Mechanical and Electrical Engineering, Shanghai Normal University)
  • Tao Yan (School of Mechanical, Electrical & Information Engineering, Putian University)
  • Received: 2022.08.10
  • Accepted: 2023.02.26
  • Published: 2023.08.31

Abstract

Depth estimation is one of the most challenging problems in light field processing. In this paper, a compound attention convolutional neural network (CAttNet) is proposed to extract depth maps from light field images. To make more effective use of the sub-aperture images (SAIs) of a light field and to reduce the redundancy among them, we apply a compound attention mechanism that weights both the channel and spatial dimensions of the feature map after the primary features have been extracted, so that the network can more efficiently select the required views and the important regions within each view. We also modify several feature-extraction layers so that features are extracted more efficiently and effectively without adding parameters. Guided by the characteristics of light fields, we increase the network depth and optimize the network structure to reduce the adverse impact of the added depth. CAttNet can efficiently exploit the correlations and features of different SAIs to generate a high-quality light field depth map. Experimental results show that CAttNet has advantages in both accuracy and computation time.
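
As a rough illustration of what weighting "the channel and space of the feature map" can mean, the sketch below applies a channel gate followed by a spatial gate to a toy feature map. It is not the authors' implementation: the abstract does not specify the layers, so the learned components that attention modules such as CBAM normally use (a shared MLP and a convolution) are replaced here by parameter-free average pooling plus a sigmoid. All function names and the toy data are assumptions made for this example.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_attention(fmap):
    """Scale each channel by a gate derived from its global average.

    fmap is a nested list indexed as [channel][row][col].
    Assumption: a real module would pass the pooled values through
    learned layers before the sigmoid; this sketch skips that step.
    """
    gates = []
    for channel in fmap:
        values = [v for row in channel for v in row]
        gates.append(sigmoid(sum(values) / len(values)))
    scaled = [[[v * g for v in row] for row in channel]
              for channel, g in zip(fmap, gates)]
    return scaled, gates


def spatial_attention(fmap):
    """Scale each pixel by a gate derived from its mean across channels."""
    n_ch, height, width = len(fmap), len(fmap[0]), len(fmap[0][0])
    gates = [[sigmoid(sum(fmap[c][y][x] for c in range(n_ch)) / n_ch)
              for x in range(width)] for y in range(height)]
    scaled = [[[fmap[c][y][x] * gates[y][x] for x in range(width)]
               for y in range(height)] for c in range(n_ch)]
    return scaled, gates


def compound_attention(fmap):
    """Apply channel attention first, then spatial attention."""
    after_channel, channel_gates = channel_attention(fmap)
    after_spatial, spatial_gates = spatial_attention(after_channel)
    return after_spatial, channel_gates, spatial_gates


# Toy feature map: 2 channels (loosely, two views), each 2x2 pixels.
features = [
    [[2.0, 2.0], [2.0, 2.0]],      # strongly activated channel
    [[-2.0, -2.0], [-2.0, -2.0]],  # weakly activated channel
]
out, ch_gates, sp_gates = compound_attention(features)
print("channel gates:", ch_gates)
print("spatial gates:", sp_gates)
```

In this toy case the strongly activated channel receives the larger gate, which mirrors the idea of emphasizing the more useful views, while the spatial gates play the analogous role for regions within a view.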

Acknowledgement

This research was jointly sponsored by the Natural Science Foundation of Fujian Province (No. 2019J01816), the Putian Science and Technology Bureau (No. 2021G2001-8) and New Century Excellent Talents in Fujian Province University (No. 2018JY7RC(PU), Yantao).
