http://dx.doi.org/10.3745/KTSDE.2021.10.9.349

Expanded Object Localization Learning Data Generation Using CAM and Selective Search and Its Retraining to Improve WSOL Performance  

Go, Sooyeon (Department of Computer Science, Sookmyung Women's University)
Choi, Yeongwoo (Department of Computer Science, Sookmyung Women's University)
Publication Information
KIPS Transactions on Software and Data Engineering / v.10, no.9, 2021, pp. 349-358
Abstract
Finding the attention (localization) area of an object in an image using CAM (Class Activation Map)[1] has recently been studied in various ways as an approach to WSOL (Weakly Supervised Object Localization). Extracting the attention area from a CAM object heat map has a drawback: because the map focuses mainly on the part of the object where features are most concentrated, it cannot cover the entire object. To address this, we use CAM together with Selective Search[6]: we first expand the attention area in the heat map, then apply Gaussian smoothing to the expanded area to generate retraining data. Finally, we train on this data to expand the attention area of the objects. The proposed method requires retraining only once, and the time needed to find a localization area is greatly reduced because Selective Search is not needed at this stage. In experiments, the attention area was expanded relative to the existing CAM heat maps, and the IoU (Intersection over Union) between the bounding box of the expanded attention area and the ground truth improved by about 58% compared with the existing CAM.
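The expansion and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `iou` and `expand_box`, the `min_iou` threshold, and the rule "merge every region proposal whose IoU with the CAM box reaches the threshold" are assumptions made for illustration only, and the Gaussian smoothing and retraining steps are omitted. Boxes are `(x1, y1, x2, y2)` tuples in pixel coordinates.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def expand_box(cam_box, proposals, min_iou=0.3):
    """Grow cam_box to enclose every proposal that overlaps it enough.

    Assumption: a proposal is merged when its IoU with the CAM box is at
    least min_iou. The paper's actual selection rule may differ.
    """
    x1, y1, x2, y2 = cam_box
    for p in proposals:
        if iou(cam_box, p) >= min_iou:
            x1, y1 = min(x1, p[0]), min(y1, p[1])
            x2, y2 = max(x2, p[2]), max(y2, p[3])
    return (x1, y1, x2, y2)


# Toy data (hypothetical): a small CAM box inside a larger ground-truth box,
# plus three region proposals such as Selective Search might return.
cam_box = (4, 4, 8, 8)
ground_truth = (1, 1, 10, 10)
proposals = [(2, 2, 9, 9), (0, 0, 3, 3), (3, 4, 9, 8)]

expanded = expand_box(cam_box, proposals)
print(expanded)                                  # (2, 2, 9, 9)
print(f"{iou(cam_box, ground_truth):.3f}")       # 0.198
print(f"{iou(expanded, ground_truth):.3f}")      # 0.605
```

In this toy case the CAM box covers only the most discriminative part of the object, and merging the overlapping proposals raises its IoU with the ground truth, which is the effect the abstract reports for the expanded attention areas.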
Keywords
WSOL (Weakly Supervised Object Localization); CAM (Class Activation Map); Selective Search; Localization
References
1 K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
2 J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You only look once: Unified, real-time object detection," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
3 X. Zhang, Y. Wei, J. Feng, Y. Yang, and T. Huang, "Adversarial complementary learning for weakly supervised object localization," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.1325-1334, 2018.
4 L. Bazzani, A. Bergamo, D. Anguelov, and L. Torresani, "Self-taught object localization with deep networks," 2016 IEEE Winter Conference on Applications of Computer Vision (WACV), IEEE, 2016.
5 D. Li, J. B. Huang, Y. Li, S. Wang, and M. H. Yang, "Weakly supervised object localization with progressive domain adaptation," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016.
6 B. Zhou, A. Khosla, A. Lapedriza, A. Oliva, and A. Torralba, "Learning deep features for discriminative localization," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp.2921-2929, 2016.
7 K. K. Singh and Y. J. Lee, "Hide-and-seek: Forcing a network to be meticulous for weakly-supervised object and action localization," arXiv preprint arXiv:1704.04232, 2017.
8 S. Yun, D. Han, S. J. Oh, S. Chun, J. Choe, and Y. Yoo, "CutMix: Regularization strategy to train strong classifiers with localizable features," International Conference on Computer Vision, pp.6022-6031, 2019.
9 X. Zhang, Y. Wei, Y. Yang, and F. Wu, "Rethinking localization map: Towards accurate object perception with self-enhancement maps," Computer Vision and Pattern Recognition preprint, 2020.
10 J. Uijlings, K. van de Sande, T. Gevers, and A. Smeulders, "Selective search for object recognition," International Journal of Computer Vision, Vol.104, pp.154-171, 2013.
11 P. Felzenszwalb and D. Huttenlocher, "Efficient graph-based image segmentation," International Journal of Computer Vision, Vol.59, No.2, Sep. 2004.
12 A. Kolesnikov and C. H. Lampert, "Seed, expand and constrain: Three principles for weakly-supervised image segmentation," European Conference on Computer Vision, pp.695-711, 2016.
13 J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "Imagenet: A large-scale hierarchical image database," Conference on Computer Vision and Pattern Recognition, pp.248-255, 2009.
14 S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards real-time object detection with region proposal networks," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol.39, No.6, pp.1137-1149, 2017.
15 A. J. Bency, H. Kwon, H. Lee, S. Karthikeyan, and B. Manjunath, "Weakly supervised localization using deep feature maps," European Conference on Computer Vision, pp.714-731, Springer, 2016.