Comparative Analysis of Self-supervised Deephashing Models for Efficient Image Retrieval System

  • Received : 2023.08.24
  • Accepted : 2023.11.28
  • Published : 2023.12.31

Abstract

In hashing-based image retrieval, the hash code of a manipulated image differs from that of the original, making it difficult to retrieve the same image. This paper proposes and evaluates a self-supervised deep hashing model that generates perceptual hash codes from feature information such as the texture, shape, and color of images. The comparison models are autoencoder-based variational inference models whose encoders are built from fully connected layers, convolutional neural networks, and transformer modules. The proposed model is a variational inference model that includes a SimAM module for extracting geometric patterns and positional relationships within images. The SimAM module learns latent vectors that highlight objects or local regions through an energy function computed from the activation value of each neuron and its neighboring neurons. The proposed method is a representation learning model that generates low-dimensional latent vectors from high-dimensional input images, and the latent vectors are binarized into distinguishable hash codes. Experimental results on public datasets such as CIFAR-10, ImageNet, and NUS-WIDE show that the proposed model outperforms the comparison models and achieves performance comparable to supervised deep hashing models. The proposed model can be used in application systems that require low-dimensional representations of images, such as image search or copyright-image determination.
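The two mechanisms the abstract describes, SimAM's parameter-free energy-based attention and the binarization of latent vectors into hash codes, can be sketched as below. This is an illustrative NumPy approximation based on the published SimAM energy formulation, not the authors' implementation; the regularizer `lam` and the sign-thresholding at zero are assumptions.

```python
import numpy as np

def simam(x, lam=1e-4):
    """Parameter-free SimAM-style attention on a feature map x of shape (C, H, W).

    Each neuron's importance is the inverse of its minimal energy, which grows
    with its squared deviation from the channel mean, so neurons in salient
    (object or local) regions receive larger gating weights.
    """
    _, h, w = x.shape
    n = h * w - 1
    mu = x.mean(axis=(1, 2), keepdims=True)          # per-channel mean
    d = (x - mu) ** 2                                # squared deviation per neuron
    var = d.sum(axis=(1, 2), keepdims=True) / n      # per-channel variance
    inv_energy = d / (4 * (var + lam)) + 0.5         # 1 / e_t* from the SimAM paper
    gate = 1.0 / (1.0 + np.exp(-inv_energy))         # sigmoid gating in (0, 1)
    return x * gate

def binarize(z):
    """Turn a real-valued latent vector into a binary hash code by sign thresholding."""
    return (z > 0).astype(np.uint8)
```

In retrieval, codes produced this way are compared by Hamming distance, so two perceptually similar images should map to codes that differ in only a few bits.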

Acknowledgement

This research was supported by the Korea Creative Content Agency (KOCCA) with funding from the Ministry of Culture, Sports and Tourism of the Korean government in 2023 (No.2021-ec-9500S2, Development of AI-based technology for detecting suspected copyright-infringement elements in educational content and recommending alternative-material content).
