http://dx.doi.org/10.3837/tiis.2019.12.012

Feature Voting for Object Localization via Density Ratio Estimation

Wang, Liantao (College of Internet of Things Engineering, Hohai University)
Deng, Dong (College of Internet of Things Engineering, Hohai University)
Chen, Chunlei (School of Computer Engineering, Weifang University)

Publication Information

KSII Transactions on Internet and Information Systems (TIIS) / v.13, no.12, 2019, pp. 6009-6027

Abstract

Support vector machine (SVM) classifiers have been widely used for object detection. These methods usually locate the object by finding the image region with the maximal score. With a bag-of-features representation, the SVM score of an image region can be written as the sum of the weights of the features inside it. As a result, the search can be executed efficiently with strategies such as branch-and-bound. However, feature weights derived by optimizing region classification do not necessarily reflect the category membership of individual feature points, which can degrade localization. In this paper, we represent an image region by a collection of local feature points and locate the object as the region with the maximum posterior probability of belonging to the object class. Based on Bayes' theorem and Naive-Bayes assumptions, the posterior probability is reformulated as a sum of feature scores, where each feature score takes the form of the logarithm of a probability ratio. Instead of estimating the numerator and denominator probabilities separately, we apply density ratio estimation techniques to estimate the ratio directly, which overcomes the above limitation. Experiments on a car dataset and the PASCAL VOC 2007 dataset validate the effectiveness of our method compared with the baselines. In addition, the performance can be further improved by using recently developed deep convolutional neural network features.
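
The feature-voting idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the density-ratio values are hypothetical placeholders standing in for the output of some density ratio estimator, the function names `vote` and `max_sum_rectangle` are invented for illustration, and the region search is a plain exhaustive maximum-sum rectangle scan (Kadane's algorithm over row bands) rather than the branch-and-bound strategy the abstract mentions. Each feature point votes with the logarithm of its ratio, so ratios above 1 support the object class and ratios below 1 oppose it.

```python
import numpy as np


def vote(points, ratios, shape):
    """Accumulate log density-ratio scores of feature points on a grid.

    points: iterable of (row, col) grid positions (hypothetical features).
    ratios: estimated density ratios, assumed strictly positive.
    """
    score_map = np.zeros(shape, dtype=float)
    for (row, col), ratio in zip(points, ratios):
        score_map[row, col] += np.log(ratio)
    return score_map


def max_sum_rectangle(score_map):
    """Exhaustively find the axis-aligned rectangle with the largest sum.

    Returns (best_sum, (top, left, bottom, right)) with inclusive bounds.
    """
    score_map = np.asarray(score_map, dtype=float)
    n_rows, n_cols = score_map.shape
    best_sum, best_box = -np.inf, None
    for top in range(n_rows):
        col_sums = np.zeros(n_cols)
        for bottom in range(top, n_rows):
            # Column sums of the band of rows top..bottom.
            col_sums += score_map[bottom]
            # 1-D Kadane scan over the columns of this band.
            running, left = 0.0, 0
            for right in range(n_cols):
                if running <= 0.0:
                    running, left = col_sums[right], right
                else:
                    running += col_sums[right]
                if running > best_sum:
                    best_sum = running
                    best_box = (top, left, bottom, right)
    return best_sum, best_box


# Toy data: three object-like features clustered together, two background ones.
points = [(1, 1), (1, 2), (2, 1), (0, 3), (3, 0)]
ratios = [4.0, 3.0, 2.0, 0.1, 0.5]  # hypothetical estimator outputs
score_map = vote(points, ratios, shape=(4, 4))
best_sum, best_box = max_sum_rectangle(score_map)
print(best_sum, best_box)
```

Because the region score is a sum of per-feature scores, any maximum-sum subwindow search can be substituted for the exhaustive scan used here.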

Keywords

Naive-Bayes; feature-voting; object localization; density ratio estimation; feature-scoring
