Artificial Neural Network Method Based on Convolution to Efficiently Extract the DoF Embodied in Images

  • Received : 2021.02.05
  • Accepted : 2021.03.22
  • Published : 2021.03.31

Abstract

In this paper, we propose a method that uses an efficient convolutional neural network to find the DoF (depth of field) region, i.e., the region of an image blurred by the camera's focusing and out-of-focus effects. Our approach uses an RGB channel-based cross-correlation filter to efficiently classify the DoF region in the image and to build training data for the convolutional neural network: each training sample pairs an image with its DoF weight map. The weight maps extracted by the cross-correlation filter are additionally smoothed to increase the convergence rate during network training. At test time, the predicted DoF weight map stably localizes the DoF region in the input image. Consequently, the proposed method can treat the DoF region as the user's ROI (region of interest) for applications such as NPR (non-photorealistic rendering) and object detection.
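
The pipeline above can be illustrated with a minimal sketch. The abstract does not give the exact form of the RGB channel-based cross-correlation filter, so the high-pass kernel, the smoothing sigma, the map polarity, and the `dof_weight_map` helper below are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.ndimage import correlate, gaussian_filter

def dof_weight_map(rgb, sigma=3.0):
    """rgb: float array in [0, 1], shape (H, W, 3).
    Returns a weight map in [0, 1]; high values mark the blurred DoF region."""
    # Hypothetical 3x3 high-pass kernel standing in for the paper's
    # RGB channel-based cross-correlation filter: sharp (in-focus) pixels
    # respond strongly, blurred (out-of-focus) pixels respond weakly.
    kernel = np.array([[-1., -1., -1.],
                       [-1.,  8., -1.],
                       [-1., -1., -1.]])
    # Cross-correlate each color channel and accumulate the absolute response.
    response = np.zeros(rgb.shape[:2])
    for c in range(3):
        response += np.abs(correlate(rgb[..., c], kernel, mode="reflect"))
    # Smoothing pass, mirroring the extra smoothing the paper applies to the
    # weight maps to improve convergence during network training.
    response = gaussian_filter(response, sigma=sigma)
    # Invert so that high weights mark the blurred DoF region (an assumption
    # about the polarity of the weight map).
    return 1.0 - response / (response.max() + 1e-8)
```

Each image and its smoothed weight map would then form one (input, target) pair for supervising the convolutional network.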
