http://dx.doi.org/10.3837/tiis.2016.01.021

Object Classification based on Weakly Supervised E2LSH and Saliency map Weighting

Zhao, Yongwei (China National Digital Switching System Engineering and Technological R&D Center)
Li, Bicheng (China National Digital Switching System Engineering and Technological R&D Center)
Liu, Xin (China National Digital Switching System Engineering and Technological R&D Center)
Ke, Shengcai (China National Digital Switching System Engineering and Technological R&D Center)

Publication Information

KSII Transactions on Internet and Information Systems (TIIS), v.10, no.1, 2016, pp. 364-380

Abstract

The most popular approach to object classification is based on the bag of visual words model, which has several fundamental problems that restrict its performance: low time efficiency, the synonymy and polysemy of visual words, and the lack of spatial information between visual words. To address these problems, an object classification method based on weakly supervised E2LSH and saliency map weighting is proposed. First, E2LSH (Exact Euclidean Locality Sensitive Hashing) is employed to generate a group of weakly randomized visual dictionaries by clustering the SIFT features of the training dataset; inspired by random forests, the selection of hash functions is supervised to reduce the randomness of E2LSH. Second, the graph-based visual saliency (GBVS) algorithm is applied to compute the saliency map of each image, and the visual words are weighted according to the saliency prior. Finally, a saliency-map-weighted visual language model is used to perform object classification. Experimental results on the Pascal 2007 and Caltech-256 datasets indicate that the distinguishability of objects is effectively improved and that the proposed method outperforms state-of-the-art object classification methods.
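The two building blocks named in the abstract, E2LSH quantization and saliency weighting of visual words, can be illustrated with a minimal sketch. It is not the authors' implementation: it uses the standard p-stable hash h(v) = floor((a . v + b) / w) with Gaussian projections, omits the weakly supervised selection of hash functions and the visual language model, and treats each hash bucket as one visual word. The function names, the bucket width `w`, the toy 4-D descriptors, and the saliency values are all assumptions made for illustration.

```python
import math
import random
from collections import defaultdict


def make_hash_functions(dim, k, w, seed=0):
    """Draw k hash functions of the form h(v) = floor((a . v + b) / w)."""
    rng = random.Random(seed)
    funcs = []
    for _ in range(k):
        a = [rng.gauss(0.0, 1.0) for _ in range(dim)]  # 2-stable (Gaussian) projection
        b = rng.uniform(0.0, w)                         # random offset drawn from [0, w]
        funcs.append((a, b))
    return funcs


def bucket_key(v, funcs, w):
    """Concatenate the k hash values into one bucket key (used here as a visual word)."""
    return tuple(
        math.floor((sum(ai * vi for ai, vi in zip(a, v)) + b) / w)
        for a, b in funcs
    )


def saliency_weighted_histogram(descriptors, saliencies, funcs, w):
    """Accumulate each descriptor's saliency into the bucket it hashes to."""
    hist = defaultdict(float)
    for v, s in zip(descriptors, saliencies):
        hist[bucket_key(v, funcs, w)] += s
    total = sum(hist.values())
    # Normalize so the weights sum to 1 (skipped if every saliency is 0).
    if total > 0:
        return {key: val / total for key, val in hist.items()}
    return dict(hist)


# Toy 4-D descriptors standing in for 128-D SIFT features (assumed data).
descriptors = [[0.1, 0.2, 0.3, 0.4], [0.1, 0.2, 0.3, 0.4], [5.0, -3.0, 2.0, 7.0]]
saliencies = [0.9, 0.6, 0.1]  # hypothetical saliency-map values at each keypoint
funcs = make_hash_functions(dim=4, k=3, w=4.0, seed=42)
hist = saliency_weighted_histogram(descriptors, saliencies, funcs, w=4.0)
```

In this sketch, identical descriptors always fall into the same bucket, and a keypoint in a salient region contributes more to its visual word than one in the background, which is the intuition behind weighting by the saliency prior.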

Keywords

Object Classification; Bag of Visual Words; E2LSH; Graph-based Visual Saliency; Visual Language Model Method
