Deep Image Retrieval using Attention and Semantic Segmentation Map

Deep Learning-Based Image Retrieval Using Region-of-Interest Extraction and Image Segmentation Maps

  • Received : 2023.01.20
  • Accepted : 2023.03.30
  • Published : 2023.03.30

Abstract

Autonomous driving is a core technology of the Fourth Industrial Revolution and can be applied to a wide range of platforms, including cars, drones, and robots. Localization, which identifies the position of an object or user using GPS, sensors, and maps, is one of the key technologies for implementing autonomous driving. Localization can be performed with sensors such as GPS or LiDAR, but these approaches require expensive and heavy equipment, and precise position estimation is difficult in places subject to radio interference, such as underground areas and tunnels. To compensate for these drawbacks, this paper proposes an image retrieval method that takes color images acquired with a low-cost vision camera as input and uses an attention module together with image segmentation maps.
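
The abstract names the components of the method but gives no implementation details, so the following is only a minimal sketch of how attention weights and a segmentation map could be combined for retrieval, not the authors' network. It assumes a precomputed feature map, a per-pixel attention map, and a per-pixel class map; the function names `describe` and `retrieve`, the `static_classes` parameter, and the weighted-average pooling are all hypothetical choices made for illustration.

```python
import numpy as np

def describe(features, attention, seg_map, static_classes):
    """Pool a (C, H, W) feature map into one L2-normalized global descriptor.

    Assumption: pixels are weighted by an attention map and restricted to
    semantic classes considered stable for localization (e.g. buildings).
    """
    # Binary mask of pixels whose class label is in the assumed static set.
    mask = np.isin(seg_map, list(static_classes)).astype(features.dtype)
    weights = attention * mask                 # shape (H, W)
    total = weights.sum()
    if total == 0:                             # no usable pixel: plain average pooling
        weights = np.ones_like(weights)
        total = weights.sum()
    # Weighted average over the spatial axes gives a C-dimensional vector.
    desc = (features * weights).sum(axis=(1, 2)) / total
    return desc / (np.linalg.norm(desc) + 1e-12)

def retrieve(query_desc, db_descs, top_k=1):
    """Return indices of the top_k database descriptors by cosine similarity."""
    sims = db_descs @ query_desc               # descriptors are unit length
    return np.argsort(-sims)[:top_k]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    C, H, W = 8, 4, 4
    static = {0, 1}                            # hypothetical "static" class ids
    db = np.stack([
        describe(rng.random((C, H, W)), rng.random((H, W)),
                 rng.integers(0, 3, (H, W)), static)
        for _ in range(5)
    ])
    query = describe(rng.random((C, H, W)), rng.random((H, W)),
                     rng.integers(0, 3, (H, W)), static)
    print(retrieve(query, db, top_k=3))
```

In such a setup, the position attached to the best-matching database image would serve as the location estimate for the query image.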

Acknowledgement

This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (NRF-2021R1C1C2005202).
