http://dx.doi.org/10.12815/kits.2022.21.5.274

Training of a Siamese Network to Build a Tracker without Using Tracking Labels  

Kang, Jungyu (Autonomous Driving Intelligence Research Section, ETRI)
Song, Yoo-Seung (Autonomous Driving Intelligence Research Section, ETRI)
Min, Kyoung-Wook (Autonomous Driving Intelligence Research Section, ETRI)
Choi, Jeong Dan (Intelligent Robotics Research Division, ETRI)
Publication Information
The Journal of The Korea Institute of Intelligent Transport Systems / v.21, no.5, 2022, pp. 274-286
Abstract
Multi-object tracking is a long-studied problem in computer vision and plays a critical role in applications such as autonomous driving and driver assistance. A multi-object tracking system generally consists of a detector that detects objects and a tracker that tracks the detected objects. Many publicly available datasets make it easy to train a detector model. However, relatively few public datasets exist for training a tracker model, and building a custom tracking dataset takes much longer than building a detection dataset. Hence, the detector is often developed separately from the tracker module. However, a separately developed tracker must be readjusted whenever the detector model it depends on is changed. This study proposes a system that trains a single model to perform detection and tracking simultaneously using only detector training datasets. In particular, a Siamese network with data augmentation is used to compose the detector and tracker. Experiments on public datasets verify that the proposed algorithm yields a real-time multi-object tracker comparable to state-of-the-art tracker models.
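The abstract does not spell out how augmentation replaces tracking labels, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes that a second "frame" can be imitated by applying one random scale and shift to the boxes of a single detection-labeled image, so that each box's list index acts as a pseudo track ID shared by both views. The function name `make_pseudo_pair` and its parameters `max_shift` and `max_scale` are hypothetical.

```python
import random


def make_pseudo_pair(boxes, img_w, img_h, max_shift=0.1, max_scale=0.1, rng=None):
    """Build two views of one detection-labeled image (illustrative only).

    boxes: list of (x1, y1, x2, y2) pixel boxes from a detection dataset.
    Returns (boxes_a, boxes_b, ids). Entry k of boxes_a and boxes_b is the
    same object, and ids[k] is its index in the input list, which serves
    as a pseudo track ID without any tracking annotation.
    """
    rng = rng or random.Random()
    # One global transform imitates apparent motion between two frames.
    scale = 1.0 + rng.uniform(-max_scale, max_scale)
    dx = rng.uniform(-max_shift, max_shift) * img_w
    dy = rng.uniform(-max_shift, max_shift) * img_h

    boxes_a, boxes_b, ids = [], [], []
    for i, (x1, y1, x2, y2) in enumerate(boxes):
        # Transform the box, then clip it to the image bounds.
        nx1 = min(max(x1 * scale + dx, 0.0), img_w)
        ny1 = min(max(y1 * scale + dy, 0.0), img_h)
        nx2 = min(max(x2 * scale + dx, 0.0), img_w)
        ny2 = min(max(y2 * scale + dy, 0.0), img_h)
        # Skip objects that have (almost) left the augmented view.
        if nx2 - nx1 < 1.0 or ny2 - ny1 < 1.0:
            continue
        boxes_a.append((x1, y1, x2, y2))
        boxes_b.append((nx1, ny1, nx2, ny2))
        ids.append(i)
    return boxes_a, boxes_b, ids


if __name__ == "__main__":
    labels = [(10, 20, 110, 220), (300, 100, 400, 300)]
    view_a, view_b, track_ids = make_pseudo_pair(labels, 640, 480, rng=random.Random(0))
    for tid, a, b in zip(track_ids, view_a, view_b):
        print(tid, a, b)
```

In such a setup, the two views would be fed to the two branches of a Siamese network, with the shared IDs supervising the association between them. A real pipeline would also have to warp the image pixels with the same transform.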
Keywords
Autonomous driving; Multi-object detection; Multi-object tracking; Deep learning