http://dx.doi.org/10.5909/JBE.2019.24.3.495

Object Tracking Method using Deep Learning and Kalman Filter  

Kim, Gicheol (Information of Departments, Hanbat National University)
Son, Sohee (Information of Departments, Hanbat National University)
Kim, Minseop (Information of Departments, Hanbat National University)
Jeon, Jinwoo (ETRI)
Lee, Injae (ETRI)
Cha, Jihun (ETRI)
Choi, Haechul (Information of Departments, Hanbat National University)
Publication Information
Journal of Broadcast Engineering / v.24, no.3, 2019, pp. 495-505
Abstract
Typical deep learning algorithms include CNNs (Convolutional Neural Networks), which are used mainly for image recognition, and RNNs (Recurrent Neural Networks), which are used mainly for speech recognition and natural language processing. Of these, a CNN learns the filters that generate its feature maps, so features are learned automatically from data; this has made CNNs the mainstream approach, with excellent performance in image recognition. Since then, various algorithms such as R-CNN have appeared in object detection to improve on CNN performance, and algorithms such as YOLO (You Only Look Once) and SSD (Single Shot Multi-box Detector) have been proposed more recently. However, because these deep learning-based detection algorithms decide detection success on still images, stable object tracking and detection in video require a separate tracking capability. This paper therefore proposes a method that combines a Kalman filter with a deep learning-based detection network to improve object tracking and detection performance in video. The detection network is YOLO v2, which is capable of real-time processing. The proposed method improved IoU performance by 7.7% over the existing YOLO v2 network and ran at 20 fps on FHD images.
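The idea of pairing a per-frame detector with a Kalman filter can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the state model, the noise parameters, or how detections are matched to tracks, so the constant-velocity model, the values of q and r, and the names iou, ScalarKalman and BoxTracker are all assumptions made for illustration. The detector (YOLO v2 in the paper) is represented only by the boxes passed in, with None standing for a frame where detection failed.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


class ScalarKalman:
    """Constant-velocity Kalman filter for one coordinate; state is [position, velocity]."""

    def __init__(self, pos, q=1.0, r=4.0):
        self.x = [float(pos), 0.0]          # state estimate
        self.P = [[r, 0.0], [0.0, 100.0]]   # covariance; velocity starts out very uncertain
        self.q, self.r = q, r               # process / measurement noise (assumed values)

    def predict(self):
        # x <- F x with F = [[1, 1], [0, 1]] (time step of one frame)
        self.x = [self.x[0] + self.x[1], self.x[1]]
        (a, b), (c, d) = self.P
        # P <- F P F^T + Q with Q = diag(q, q)
        self.P = [[a + b + c + d + self.q, b + d], [c + d, d + self.q]]
        return self.x[0]

    def update(self, z):
        # Measurement model H = [1, 0]: only the position is observed.
        (a, b), (c, d) = self.P
        s = a + self.r                      # innovation covariance
        k0, k1 = a / s, c / s               # Kalman gain
        y = z - self.x[0]                   # innovation
        self.x = [self.x[0] + k0 * y, self.x[1] + k1 * y]
        # P <- (I - K H) P
        self.P = [[(1 - k0) * a, (1 - k0) * b], [c - k1 * a, d - k1 * b]]
        return self.x[0]


class BoxTracker:
    """Tracks one box by filtering each of x1, y1, x2, y2 independently (an assumption)."""

    def __init__(self, box):
        self.filters = [ScalarKalman(v) for v in box]

    def step(self, detection=None):
        predicted = [f.predict() for f in self.filters]
        if detection is None:               # detector missed: fall back on the prediction
            return tuple(predicted)
        return tuple(f.update(z) for f, z in zip(self.filters, detection))


if __name__ == "__main__":
    # Hypothetical sequence: a 50x50 box moving 5 px per frame, with frame 6 undetected.
    tracker = BoxTracker((0, 0, 50, 50))
    for k in range(1, 9):
        truth = (5 * k, 0, 50 + 5 * k, 50)
        estimate = tracker.step(None if k == 6 else truth)
        print(k, round(iou(estimate, truth), 3))
```

In this sketch a missed detection still yields a box, because the filter extrapolates from the estimated velocity; that is the kind of frame-to-frame stability a still-image detector alone does not provide. A practical tracker would also need to associate detections with tracks when several objects are present, which is left out here.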
Keywords
YOLO; Kalman filter; Object tracking; CNN; Deep learning;