http://dx.doi.org/10.3837/tiis.2019.04.003

A method based on Multi-Convolution layers Joint and Generative Adversarial Networks for Vehicle Detection  

Han, Guang (Engineering Research Center of Wideband Wireless Communication Technique, Ministry of Education, Nanjing University of Posts and Telecommunications)
Su, Jinpeng (Engineering Research Center of Wideband Wireless Communication Technique, Ministry of Education, Nanjing University of Posts and Telecommunications)
Zhang, Chengwei (Engineering Research Center of Wideband Wireless Communication Technique, Ministry of Education, Nanjing University of Posts and Telecommunications)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.13, no.4, 2019, pp. 1795-1811
Abstract
To detect vehicles rapidly and accurately in complex traffic conditions, we propose a novel vehicle detection method. First, our Joint Feature Network (JFN) obtains more contextual and small-object vehicle information. Second, our Evolved Region Proposal Network (EPRN) generates initial anchor boxes using an improved version of the region proposal network and, at the same time, filters out a large number of false vehicle boxes with soft Non-Maximum Suppression (soft-NMS). Then, our Mask Network (MaskN) generates examples that include vehicle occlusion; its generator and discriminator learn from each other to further improve vehicle detection capability. Finally, the Fine-Tuning Network (FTN) refines the candidate vehicle detection boxes into the final vehicle detection boxes. In evaluation experiments on the DETRAC benchmark dataset, our method exceeds Faster-RCNN by 11.15%, YOLO by 11.88%, and EB by 1.64% in terms of mAP. In addition, our algorithm ranks in the top two for the vehicle category on the KITTI dataset when compared with MS-CNN, YOLO-v3, RefineNet, RetinaNet, Faster-RCNN, DSSD, and YOLO-v2.
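The soft-NMS filtering step named in the abstract can be illustrated with a minimal sketch. The abstract does not state which soft-NMS variant or parameters the method uses, so the code below assumes the Gaussian score-decay variant with `sigma=0.5` and a score cutoff of `0.001`; the function names `iou_one_to_many` and `soft_nms` are hypothetical and not taken from the paper. Boxes are assumed to be `[x1, y1, x2, y2]` with positive area.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array of boxes, all [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian soft-NMS (assumed variant): decay the scores of overlapping
    boxes instead of discarding them outright.

    Returns the kept box indices, in selection order, and their final scores.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()
    remaining = np.arange(len(scores))
    keep, kept_scores = [], []
    while remaining.size > 0:
        # Select the highest-scoring box among those still in play.
        best = np.argmax(scores[remaining])
        idx = remaining[best]
        keep.append(int(idx))
        kept_scores.append(float(scores[idx]))
        remaining = np.delete(remaining, best)
        if remaining.size == 0:
            break
        # Decay the remaining scores by their overlap with the selected box.
        overlaps = iou_one_to_many(boxes[idx], boxes[remaining])
        scores[remaining] *= np.exp(-(overlaps ** 2) / sigma)
        # Drop boxes whose decayed score fell below the cutoff.
        remaining = remaining[scores[remaining] > score_thresh]
    return keep, kept_scores
```

Unlike hard NMS, a box that overlaps heavily with a stronger detection keeps a reduced score rather than being removed, which can help when one vehicle partially occludes another.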
Keywords
Vehicle detection; non-maximum suppression; generative adversarial networks; joint feature map; mask occlusion