References
- J. Redmon and A. Farhadi, "YOLOv3: An Incremental Improvement," University of Washington, Washington: WA, Technical Report, 2018.
- K. Xu, J. Ba, R. Kiros, K. Cho, and A. Courville, "Show, attend and tell: Neural image caption generation with visual attention," in International conference on machine learning, France: FR, pp. 2048-2057, 2015.
- H. Nam, J. Ha, and J. Kim, "Dual attention networks for multimodal reasoning and matching," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Hawaii: HI, pp. 299-307, 2017.
- F. Wang, M. Jiang, C. Qian, S. Yang, C. Li, H. Zhang, and X. Tang, "Residual attention network for image classification," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Hawaii: HI, pp. 3156-3164, 2017.
- J. Hu, L. Shen, and G. Sun, "Squeeze-and-excitation networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Utah: UT, pp. 7132-7141, 2018.
- S. Woo, J. Park, J. Lee, and K. So, "Convolutional block attention module," in Proceedings of the European conference on computer vision (ECCV), Germany: DE, pp. 3-19, 2018.
- Q. Wang, B. Wu, P. Zhu, P. Li, W. Zuo, and Q. Hu, "ECA-net: Efficient channel attention for deep convolutional neural networks," in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Virtual, pp. 11534-11542, 2020.
- Z. Zheng, P. Wang, W. Liu, J. Li, R. Ye, and D. Ren, "Distance-IoU Loss: Faster and Better Learning for Bounding Box Regression," in Proceeding of the AAAI Conference on Artificial Intelligence, New York: NY, vol. 34, no. 7, pp. 12993-13000, 2020.
- K. He, G. Gkioxari, P. Dollar, and R. Girshick, "Mask r-cnn," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Hawaii: HI, pp. 2961-2969, 2017.
- H. Qassim, A. Verma, and D. Feinzimer, "Compressed residual-VGG16 CNN model for big data places image recognition," in 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC), Nevada: NV, pp. 169-175, 2018.
- K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Nevada: NV, pp. 770-778, 2016.
- G. Huang, Z. Liu, L. V. D. Maaten, and K. Q. Weinberger, "Densely connected convolutional networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Hawaii: HI, pp. 4700-4708, 2017.
- T. Y. Lin, P. Goyal, R. Girshick, K. He, and P. Dollar, "Focal loss for dense object detection," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Hawaii: HI, pp. 2980-2988, 2017.
- H. Rezatofighi, N. Tsoi, J. Gwak, A. Sadeghian, I. Reid, and S. Savarese, "Generalized intersection over union: A metric and a loss for bounding box regression," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, California: CA, pp. 658-666, 2019.
- T. Dozat, "Incorporating nesterov momentum into adam," in ICLR 2016 workshop submission, Puerto Rico: PR, 2016.
- D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," in Proceedings of the 3rd International Conference on Learning Representations (ICLR), California: CA, pp. 1-15, 2015.
- A. Mittal, A. Zisserman, and P. Torr. Hand Dataset [Internet]. Available: http://www.robots.ox.ac.uk/-vgg/data/hands/.