http://dx.doi.org/10.9708/jksci.2019.24.03.019

Improving Performance of YOLO Network Using Multi-layer Overlapped Windows for Detecting Correct Position of Small Dense Objects  

Yu, Jae-Hyoung (School of Electronic Engineering, Soongsil University)
Han, Youngjoon (Department of Smart Systems Software, Soongsil University)
Hahn, Hernsoo (School of Electronic Engineering, Soongsil University)
Abstract
This paper proposes a method that uses multi-layer overlapped windows to improve the performance of the YOLO network, which has difficulty detecting small, densely packed objects. Specifically, the method applies the YOLO network over multi-layer overlapped windows to track small, dense vehicles approaching from a long distance, and it improves the accuracy of the detected location and size of those vehicles. The crossing area shared by two multi-layer overlapped windows allows a moving vehicle to be tracked continuously as it travels from long range to short range. The YOLO network is also optimized to reduce the additional GPU computation time that the multi-layer overlapped windows introduce. Experiments on images captured by road surveillance cameras show that the proposed algorithm outperforms the baseline.
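The abstract does not specify the window layout or how detections from neighbouring windows are combined, so the following is only a minimal sketch of the general idea under stated assumptions. It tiles a frame with overlapping windows, maps a per-window detection back to frame coordinates, and merges duplicates found in the overlap. The function names (`make_windows`, `to_frame`, `merge_detections`), the square window shape, and the use of greedy non-maximum suppression for merging are all illustrative choices, not the authors' implementation.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in pixels


def make_windows(width: int, height: int, size: int, stride: int) -> List[Box]:
    """Tile a frame with square windows; stride < size makes neighbours overlap.

    Assumes stride >= 1. A "multi-layer" layout could call this once per
    window size (hypothetical reading of the abstract).
    """
    def starts(limit: int) -> List[int]:
        last = max(limit - size, 0)
        pts = list(range(0, last + 1, stride))
        if pts[-1] != last:  # make sure the final window reaches the border
            pts.append(last)
        return pts

    return [(x, y, min(x + size, width), min(y + size, height))
            for y in starts(height) for x in starts(width)]


def to_frame(box: Box, window: Box) -> Box:
    """Shift a box from window-local coordinates to full-frame coordinates."""
    ox, oy = window[0], window[1]
    return (box[0] + ox, box[1] + oy, box[2] + ox, box[3] + oy)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union


def merge_detections(dets: List[Tuple[Box, float]],
                     iou_thr: float = 0.5) -> List[Tuple[Box, float]]:
    """Greedy non-maximum suppression over (box, score) pairs.

    Assumption: duplicates from overlapping windows are merged with plain NMS.
    """
    kept: List[Tuple[Box, float]] = []
    for box, score in sorted(dets, key=lambda d: d[1], reverse=True):
        if all(iou(box, k[0]) < iou_thr for k in kept):
            kept.append((box, score))
    return kept


# Usage: the same vehicle is seen by two neighbouring windows.
windows = make_windows(width=100, height=50, size=50, stride=25)
seen_a = to_frame((30, 10, 40, 20), windows[0])  # local box in first window
seen_b = to_frame((5, 10, 15, 20), windows[1])   # local box in second window
merged = merge_detections([(seen_a, 0.9), (seen_b, 0.8)])
```

In this sketch each window would be passed to the detector separately, which is why a small vehicle occupies more of the network input than it would in the full frame. The cost is one forward pass per window, which is the extra GPU time the abstract says the method is optimized to reduce.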
Keywords
Multi-layer Overlapped Window; YOLO network; Small Dense Objects; Crossing Area; Small Vehicle Tracking