[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.3837/tiis.2021.11.012

Efficient Visual Place Recognition by Adaptive CNN Landmark Matching

Chen, Yutian (Institute of Field Engineering, Army Engineering University of PLA)
Gan, Wenyan (Institute of Field Engineering, Army Engineering University of PLA)
Zhu, Yi (Institute of Field Engineering, Army Engineering University of PLA)
Tian, Hui (Institute of Field Engineering, Army Engineering University of PLA)
Wang, Cong (Institute of Field Engineering, Army Engineering University of PLA)
Ma, Wenfeng (Institute of Field Engineering, Army Engineering University of PLA)
Li, Yunbo (Institute of Command and Control Engineering, Army Engineering University of PLA)
Wang, Dong (Institute of Field Engineering, Army Engineering University of PLA)
He, Jixian (Changsha Vocational and Technical College)

Publication Information

KSII Transactions on Internet and Information Systems (TIIS) / v.15, no.11, 2021, pp. 4084-4104

Abstract

Visual place recognition (VPR) is a fundamental yet challenging task in mobile robot navigation and localization. Existing VPR methods usually rely on pairwise similarity of image descriptors, which makes them sensitive to changes in visual appearance and computationally expensive. This paper proposes a simple yet effective four-step method that achieves adaptive convolutional neural network (CNN) landmark matching for VPR. First, based on features extracted from existing CNN models, the regions with higher significance scores are selected as landmarks. Then, according to the coordinate positions of potential landmarks, landmark matching is improved by removing mismatched landmark pairs. Finally, using the significance scores obtained in the first step, robust image retrieval is performed based on adaptive landmark matching, which gives more weight to matched landmark pairs with higher significance scores. To verify the efficiency and robustness of the proposed method, evaluations are conducted on standard benchmark datasets. The experimental results indicate that the proposed method reduces the feature representation space of place images by more than 75% with negligible loss in recognition precision. It also achieves fast matching in similarity calculation, satisfying the real-time requirement.
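
The abstract gives no formulas, so the following is only a minimal sketch of how the stages it names (significance-based landmark selection, position-based removal of mismatched pairs, and significance-weighted similarity) could fit together. It is not the authors' implementation. All function names, the mutual nearest-neighbour matching rule, the median-displacement filter, the query-only weighting, and the default values of keep_ratio and max_shift are assumptions made for illustration.

```python
import numpy as np


def select_landmarks(scores, keep_ratio=0.25):
    """Return indices of the regions with the highest significance scores.

    keep_ratio is a hypothetical parameter, not a value from the paper.
    """
    k = max(1, int(round(len(scores) * keep_ratio)))
    return np.argsort(scores)[::-1][:k]


def match_landmarks(feat_a, feat_b):
    """Mutual nearest-neighbour matching on cosine similarity (assumed rule)."""
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    sim = a @ b.T
    best_b = sim.argmax(axis=1)  # best match in B for each landmark in A
    best_a = sim.argmax(axis=0)  # best match in A for each landmark in B
    pairs = [(i, int(j)) for i, j in enumerate(best_b) if best_a[j] == i]
    return pairs, sim


def filter_by_position(pairs, pos_a, pos_b, max_shift=0.2):
    """Drop pairs whose displacement is far from the median displacement.

    Positions are assumed to be normalised (x, y) region centres; the
    median-displacement test is a stand-in for the paper's mismatch removal.
    """
    if not pairs:
        return []
    shifts = np.array([pos_b[j] - pos_a[i] for i, j in pairs])
    median = np.median(shifts, axis=0)
    keep = np.linalg.norm(shifts - median, axis=1) <= max_shift
    return [p for p, ok in zip(pairs, keep) if ok]


def weighted_similarity(pairs, sim, score_a):
    """Significance-weighted image similarity.

    Each surviving pair contributes its cosine similarity weighted by the
    query landmark's significance score (an assumed weighting scheme).
    """
    if not pairs:
        return 0.0
    total = sum(score_a[i] * sim[i, j] for i, j in pairs)
    return float(total / np.sum(score_a))


if __name__ == "__main__":
    feats = np.eye(3)                       # three toy landmark descriptors
    pos_a = np.array([[0.1, 0.1], [0.5, 0.5], [0.9, 0.9]])
    pos_b = np.array([[0.1, 0.1], [0.5, 0.5], [0.1, 0.9]])
    score_a = np.array([0.5, 0.3, 0.2])

    pairs, sim = match_landmarks(feats, feats)
    pairs = filter_by_position(pairs, pos_a, pos_b)
    print(pairs, weighted_similarity(pairs, sim, score_a))
```

In the toy run, the third pair matches in descriptor space but sits at an inconsistent image position, so it is discarded before the weighted score is computed. A real pipeline would take the descriptors and significance scores from a CNN; here they are hand-written arrays.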

Keywords

visual place recognition; CNN; adaptive; landmark; matching
