http://dx.doi.org/10.9708/jksci.2022.27.10.019

A Self-Supervised Detector Scheduler for Efficient Tracking-by-Detection Mechanism  

Park, Dae-Hyeon (Vision & Learning Laboratory, Inha University)
Lee, Seong-Ho (Vision & Learning Laboratory, Inha University)
Bae, Seung-Hwan (Dept. of Computer Engineering, Inha University)
Abstract
In this paper, we propose the Detector Scheduler, which determines the best tracking-by-detection (TBD) mechanism for real-time, highly accurate multi-object tracking (MOT). The Detector Scheduler decides whether to run a detector by measuring the dissimilarity of features between different frames. Because it is difficult to generate ground truth (GT) for training the Detector Scheduler, we further propose a self-supervision method that learns it from tracking results. The self-supervision method generates pseudo labels for whether to run a detector according to increases in the dissimilarity of object cardinality or appearance between frames. To this end, we propose the Detector Scheduling Loss for training the Detector Scheduler. As a result, the proposed method achieves real-time, highly accurate multi-object tracking by boosting overall tracking speed while preserving tracking accuracy as much as possible.
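The scheduling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the dissimilarity measure, the thresholds, or the form of the Detector Scheduling Loss. The sketch assumes cosine dissimilarity between per-frame feature vectors, a hypothetical appearance threshold of 0.3, and plain binary cross-entropy as a stand-in loss. All function names are illustrative.

```python
import math


def cosine_dissimilarity(a, b):
    """1 - cosine similarity of two equal-length feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0  # treat a zero vector as maximally dissimilar
    return 1.0 - dot / (norm_a * norm_b)


def pseudo_label(prev_count, curr_count, prev_feat, curr_feat,
                 appearance_threshold=0.3):
    """Return 1 ("run the detector") if the tracked object count changed or
    the appearance dissimilarity exceeds the (hypothetical) threshold, else 0."""
    cardinality_changed = prev_count != curr_count
    appearance_changed = (
        cosine_dissimilarity(prev_feat, curr_feat) > appearance_threshold
    )
    return 1 if (cardinality_changed or appearance_changed) else 0


def binary_cross_entropy(score, label, eps=1e-7):
    """Stand-in for the Detector Scheduling Loss (its actual form is not
    given in the abstract); score is the scheduler output in (0, 1)."""
    p = min(max(score, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def should_run_detector(score, threshold=0.5):
    """Inference-time decision: run the detector when the score is high enough."""
    return score >= threshold
```

Under these assumptions, a frame pair whose object count changes from 3 to 4 receives the pseudo label 1, and a scheduler score of 0.9 for that pair incurs a smaller loss than a score of 0.1.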
Keywords
Multi-Object Tracking; Tracking-by-Detection Scheduling; Dissimilarity Learning; Self-Supervised Learning; Quality Measure