
http://dx.doi.org/10.3837/tiis.2020.12.009

EER-ASSL: Combining Rollback Learning and Deep Learning for Rapid Adaptive Object Detection

Ahmed, Minhaz Uddin (Department of Computer Engineering, Inha University)
Kim, Yeong Hyeon (Department of Computer Engineering, Inha University)
Rhee, Phill Kyu (Department of Computer Engineering, Inha University)

Publication Information

KSII Transactions on Internet and Information Systems (TIIS) / v.14, no.12, 2020, pp. 4776-4794

Abstract

We propose EER-ASSL, a rapid adaptive learning framework for streaming object detection. The method combines expected error reduction (EER)-dependent rollback learning with active semi-supervised learning (ASSL) to build a rapidly adapting CNN detector. Most CNN object detectors are built on the assumption of a static data distribution. In real-world environments, however, images are often noisy and biased, and the data distribution is imbalanced. The proposed method consists of collaborative sampling and EER-ASSL. EER-ASSL uses active learning (AL) and rollback-based semi-supervised learning (SSL). AL selects the more informative and representative samples by measuring uncertainty and diversity. SSL divides the selected streaming image samples into bins; each bin repeatedly transfers the discriminative knowledge of the EER and CNN models to the next bin, incorporating the EER rollback learning algorithm, until convergence is reached. The EER models provide rapid, short-term (myopic) adaptation, and the CNN models provide incremental, long-term performance improvement. EER-ASSL can overcome noisy and biased labels under a varying data distribution. Extensive experiments show that EER-ASSL achieves 70.9 mAP in comparison with state-of-the-art detectors such as Faster RCNN, SSD300, and YOLOv2.
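The two ideas named in the abstract, sample selection by uncertainty and diversity and bin-wise learning with rollback, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`entropy`, `select_samples`, `rollback_learning`), the use of Shannon entropy as the uncertainty measure, Euclidean distance as the diversity measure, and the "keep an update only if the validation score does not drop" rule are all assumptions chosen for illustration, and a toy numeric "model" stands in for the EER and CNN models.

```python
import math


def entropy(probs):
    """Shannon entropy of a class-probability vector (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_samples(probs, feats, k):
    """Greedily pick up to k sample indices (assumed selection rule).

    Starts from the most uncertain sample, then repeatedly adds the sample
    that maximizes: entropy * distance to the nearest already-chosen sample.
    """
    if not probs or k <= 0:
        return []
    unc = [entropy(p) for p in probs]
    chosen = [max(range(len(probs)), key=lambda i: unc[i])]
    while len(chosen) < min(k, len(probs)):
        rest = [i for i in range(len(probs)) if i not in chosen]

        def score(i):
            nearest = min(math.dist(feats[i], feats[j]) for j in chosen)
            return unc[i] * nearest

        chosen.append(max(rest, key=score))
    return chosen


def rollback_learning(model, bins, update, evaluate):
    """Process bins in order; keep an update only if the score does not drop.

    `update(model, bin)` returns a candidate model and `evaluate(model)`
    returns a validation score; both are hypothetical callables.
    """
    best = evaluate(model)
    history = []
    for b in bins:
        candidate = update(model, b)
        score = evaluate(candidate)
        if score >= best:
            model, best = candidate, score
            history.append("kept")
        else:
            history.append("rolled back")  # discard the candidate model
    return model, history


# Toy selection: samples 0 and 2 are equally uncertain but nearly identical,
# so only one of them is picked.
probs = [[0.5, 0.5], [0.9, 0.1], [0.5, 0.5], [0.6, 0.4]]
feats = [[0.0, 0.0], [1.0, 1.0], [0.0, 0.1], [3.0, 0.0]]
picked = select_samples(probs, feats, k=2)  # [0, 3]

# Toy rollback: the "model" is a scalar estimate of a target value of 10.0,
# and the second bin is deliberately noisy.
TARGET = 10.0
final_model, history = rollback_learning(
    model=0.0,
    bins=[[8.0, 12.0], [100.0, 120.0], [9.0, 11.0]],
    update=lambda m, b: (m + sum(b) / len(b)) / 2,
    evaluate=lambda m: -abs(m - TARGET),
)
# history == ["kept", "rolled back", "kept"], final_model == 7.5
```

In the toy run, the noisy second bin would push the estimate far from the target, so that update is discarded and learning continues from the previous model. A real detector would replace the scalar with model checkpoints and the score with a validation metric such as mAP.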

Keywords

Object Detection; Active Learning; Semi-Supervised Learning; Convolutional Neural Network
