KNN-based Image Annotation by Collectively Mining Visual and Semantic Similarities

  • Ji, Qian (School of Computer Science and Engineering, Nanjing University of Science and Technology) ;
  • Zhang, Liyan (School of Computer Science, Nanjing University of Aeronautics and Astronautics) ;
  • Li, Zechao (School of Computer Science and Engineering, Nanjing University of Science and Technology)
  • Received : 2016.11.26
  • Accepted : 2017.05.28
  • Published : 2017.09.30

Abstract

The aim of image annotation is to determine labels that accurately describe the semantic information of images. Many approaches have been proposed to automate the image annotation task and achieve good performance. However, in most cases, the semantic similarities of images are ignored. To this end, we propose a novel Visual-Semantic Nearest Neighbor (VS-KNN) method that collectively explores visual and semantic similarities for image annotation. First, for each label, the visual nearest neighbors of a given test image are constructed from the training images associated with this label. Second, each neighboring subset is determined by mining both the semantic similarity and the visual similarity. Finally, the relevance between images and labels is determined by maximum a posteriori estimation. Extensive experiments were conducted on three widely used image datasets. The experimental results show the effectiveness of the proposed method in comparison with state-of-the-art methods.
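The three steps in the abstract can be illustrated with a small sketch. This is not the VS-KNN formulation from the paper, which the abstract does not spell out. It is a simplified stand-in under stated assumptions: visual similarity is taken as exp(-Euclidean distance), semantic similarity is the average cosine similarity between a candidate's label vector and those of the other candidates, and a normalised weighted vote stands in for the posterior estimate. The function name `annotate`, the parameters `k` and `alpha`, and the toy data are all hypothetical.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def annotate(test_feat, train_feats, train_labels, k=2, alpha=0.5):
    """Score every label for one test image (assumed simplification)."""
    n_labels = len(train_labels[0])

    # Step 1: for each label, keep the k visually nearest training
    # images among those annotated with that label.
    candidates = set()
    for label in range(n_labels):
        holders = [i for i, y in enumerate(train_labels) if y[label]]
        holders.sort(key=lambda i: euclidean(test_feat, train_feats[i]))
        candidates.update(holders[:k])
    candidates = sorted(candidates)

    # Step 2: weight each candidate by a mix of visual similarity to the
    # test image and semantic similarity to the other candidates
    # (both similarity definitions are assumptions of this sketch).
    weights = {}
    for i in candidates:
        visual = math.exp(-euclidean(test_feat, train_feats[i]))
        others = [j for j in candidates if j != i]
        semantic = 0.0
        if others:
            semantic = sum(cosine(train_labels[i], train_labels[j])
                           for j in others) / len(others)
        weights[i] = alpha * visual + (1 - alpha) * semantic

    # Step 3: normalised weighted vote per label, used here as a rough
    # proxy for the posterior relevance of each label.
    total = sum(weights.values())
    if total == 0:
        return [0.0] * n_labels
    return [sum(weights[i] * train_labels[i][label] for i in candidates) / total
            for label in range(n_labels)]

# Toy data: four training images with 2-D features and three labels.
train_feats = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
train_labels = [[1, 0, 1], [1, 0, 0], [0, 1, 0], [0, 1, 1]]
scores = annotate([0.05, 0.0], train_feats, train_labels)
best_label = max(range(len(scores)), key=lambda l: scores[l])
```

In this toy setup the test image lies next to the two training images that carry label 0, so that label receives the highest score. Annotating an image would then amount to keeping the few top-scoring labels.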
