http://dx.doi.org/10.9708/jksci.2022.27.10.019

A Self-Supervised Detector Scheduler for Efficient Tracking-by-Detection Mechanism

Park, Dae-Hyeon (Vision & Learning Laboratory, Inha University)
Lee, Seong-Ho (Vision & Learning Laboratory, Inha University)
Bae, Seung-Hwan (Dept. of Computer Engineering, Inha University)

Publication Information

Journal of the Korea Society of Computer and Information / v.27, no.10, 2022, pp. 19-28

Abstract

In this paper, we propose the Detector Scheduler, which determines the best tracking-by-detection (TBD) mechanism for real-time, highly accurate multi-object tracking (MOT). The Detector Scheduler decides whether to run a detector on a given frame by measuring the dissimilarity of features between different frames. Because it is difficult to generate ground truth (GT) for training the Detector Scheduler, we also propose a self-supervision method that trains it from tracking results. This method generates a pseudo label indicating that the detector should run when the dissimilarity of the object cardinality or appearance between frames increases. To train the Detector Scheduler with these pseudo labels, we propose the Detector Scheduling Loss. As a result, the proposed method achieves real-time, highly accurate multi-object tracking by boosting the overall tracking speed while preserving tracking accuracy as much as possible.
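
The scheduling decision and the pseudo-label rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the dissimilarity measure, the thresholds, or the form of the Detector Scheduling Loss. The sketch assumes cosine dissimilarity, fixed thresholds, and a simple "count changed" test for object cardinality. All function names and parameters below are hypothetical.

```python
import math
from typing import Sequence


def cosine_dissimilarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity (assumed measure, not from the paper)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Treat an all-zero feature vector as maximally dissimilar.
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def should_run_detector(prev_feat: Sequence[float],
                        curr_feat: Sequence[float],
                        threshold: float = 0.2) -> bool:
    """Run the detector only when frame features differ enough.

    `threshold` is a hypothetical hyperparameter chosen for illustration.
    """
    return cosine_dissimilarity(prev_feat, curr_feat) > threshold


def pseudo_label(prev_count: int, curr_count: int,
                 prev_appearance: Sequence[float],
                 curr_appearance: Sequence[float],
                 appearance_threshold: float = 0.2) -> int:
    """Return 1 (run detector) if cardinality or appearance changed, else 0.

    The counts and appearance features are assumed to come from tracking
    results, so no manually annotated ground truth is needed.
    """
    cardinality_changed = prev_count != curr_count
    appearance_changed = (
        cosine_dissimilarity(prev_appearance, curr_appearance)
        > appearance_threshold
    )
    return int(cardinality_changed or appearance_changed)
```

In this sketch, frames whose features barely change skip the detector and rely on the tracker alone, which is where the speed gain would come from. The pseudo labels could then serve as binary targets for training the scheduler.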

Keywords

Multi-Object Tracking; Tracking-by-Detection Scheduling; Dissimilarity Learning; Self-Supervised Learning; Quality Measure
