http://dx.doi.org/10.3837/tiis.2016.07.022

Enhanced VLAD  

Wei, Benchang (School of Computer Science and Technology, Huazhong University of Science & Technology)
Guan, Tao (School of Computer Science and Technology, Huazhong University of Science & Technology)
Luo, Yawei (School of Computer Science and Technology, Huazhong University of Science & Technology)
Duan, Liya (Institute of Oceanographic Instrumentation, Shandong Academy of Sciences)
Yu, Junqing (School of Computer Science and Technology, Huazhong University of Science & Technology)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.10, no.7, 2016, pp. 3272-3285
Abstract
Vector of Locally Aggregated Descriptors (VLAD) was recently proposed for indexing images with compact representations. It encodes powerful local descriptors and significantly improves search performance while using less memory than previous state-of-the-art methods. However, its performance relies heavily on the size of the codebook used to generate the VLAD representation: better accuracy requires a higher-dimensional representation, and therefore more memory. In this paper, we enhance the VLAD image representation by using two-level hierarchical codebooks, which provide more accurate search while keeping the VLAD size unchanged. In addition, the hierarchical codebooks are used to construct multiple inverted files for more accurate non-exhaustive search. Experimental results show that our method significantly improves both the VLAD image representation and non-exhaustive search.
Keywords
hierarchical-codebook; enhanced VLAD; projected residual vector quantization; multiple inverted files
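
To make the memory trade-off described in the abstract concrete, the following is a minimal sketch of the standard (baseline) VLAD aggregation, not the enhanced two-level method proposed in the paper. The function name `vlad`, the random stand-in descriptors, and the random codebooks are assumptions for illustration only; in practice the codebook would come from k-means on real local descriptors such as SIFT.

```python
import numpy as np

def vlad(descriptors: np.ndarray, codebook: np.ndarray) -> np.ndarray:
    """Baseline VLAD: aggregate (n, d) descriptors against a (k, d) codebook
    into one L2-normalized vector of length k * d."""
    k, d = codebook.shape
    # Squared Euclidean distance from every descriptor to every centroid: (n, k).
    dists = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)  # index of the closest centroid per descriptor
    v = np.zeros((k, d))
    for i in range(k):
        assigned = descriptors[nearest == i]
        if len(assigned):
            # Sum of residuals (descriptor minus its centroid) for this cell.
            v[i] = (assigned - codebook[i]).sum(axis=0)
    v = v.ravel()
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

# Hypothetical data: random vectors standing in for 128-D local descriptors.
rng = np.random.default_rng(0)
descriptors = rng.normal(size=(500, 128))
small = rng.normal(size=(16, 128))   # assumed codebook with 16 centroids
large = rng.normal(size=(64, 128))   # assumed codebook with 64 centroids

print(vlad(descriptors, small).shape)  # (2048,)
print(vlad(descriptors, large).shape)  # (8192,)
```

The output length is the codebook size times the descriptor dimension, so quadrupling the codebook quadruples the representation. This is the growth that the paper aims to avoid by using hierarchical codebooks while keeping the VLAD size fixed.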