References
- D. H. Hubel, T. N. Wiesel, Receptive fields and functional architecture of monkey striate cortex, The Journal of Physiology (1968), pp. 215-243.
- K. Fukushima, S. Miyake, Neocognitron: A self-organizing neural network model for a mechanism of visual pattern recognition, in: Competition and cooperation in neural nets, 1982, pp. 267-285.
- Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel, Handwritten digit recognition with a back-propagation network, in: Proceedings of the Advances in Neural Information Processing Systems (NIPS), 1989, pp. 396-404.
- Y. LeCun, L. Bottou, Y. Bengio, P. Haffner, Gradient-based learning applied to document recognition, Proceedings of the IEEE 86 (11) (1998), pp. 2278-2324.
- O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, et al., ImageNet large scale visual recognition challenge, International Journal of Computer Vision (IJCV) 115 (3) (2015), pp. 211-252. https://doi.org/10.1007/s11263-015-0816-y
- A. Krizhevsky, I. Sutskever, and G. E. Hinton, ImageNet classification with deep convolutional neural networks. In NIPS, pp. 1106-1114, 2012.
- K. Simonyan, A. Zisserman, Very deep convolutional networks for large-scale image recognition, in: Proceedings of the International Conference on Learning Representations (ICLR), 2015.
- C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, A. Rabinovich, Going deeper with convolutions, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 1-9.
- K. He, X. Zhang, S. Ren, J. Sun, Deep residual learning for image recognition, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770-778.
- G. Huang, Z. Liu, and K. Q. Weinberger. Densely connected convolutional networks. arXiv preprint arXiv:1608.06993, 2016.
- M. Egmont-Petersen, D. de Ridder, H. Handels, Image processing with neural networks - a review, Pattern Recognition 35 (10) (2002), pp. 2279-2301. https://doi.org/10.1016/S0031-3203(01)00178-9
- K. Nogueira, O. A. Penatti, J. A. dos Santos, Towards better exploiting convolutional neural networks for remote sensing scene classification, Pattern Recognition 61 (2017), pp. 539-556. https://doi.org/10.1016/j.patcog.2016.07.001
- Z. Zuo, G. Wang, B. Shuai, L. Zhao, Q. Yang, Exemplar based deep discriminative and shareable feature learning for scene image classification, Pattern Recognition 48 (10) (2015), pp. 3004-3015. https://doi.org/10.1016/j.patcog.2015.02.003
- A. T. Lopes, E. de Aguiar, A. F. De Souza, T. Oliveira-Santos, Facial expression recognition with convolutional neural networks: Coping with few data and the training sample order, Pattern Recognition 61 (2017), pp. 610-628. https://doi.org/10.1016/j.patcog.2016.07.026
- N. Srivastava, R. R. Salakhutdinov, Discriminative transfer learning with tree-based priors, in: Proceedings of the Advances in Neural Information Processing Systems (NIPS), 2013, pp. 2094-2102.
- Z. Wang, X. Wang, G. Wang, Learning fine-grained features via a CNN tree for large-scale classification, CoRR abs/1511.04534.
- T. Xiao, J. Zhang, K. Yang, Y. Peng, Z. Zhang, Error-driven incremental learning in deep convolutional neural network for large-scale image classification, in: Proceedings of the ACM Multimedia Conference, 2014, pp. 177-186.
- Z. Yan, V. Jagadeesh, D. DeCoste, W. Di, R. Piramuthu, HD-CNN: Hierarchical deep convolutional neural network for image classification, in: Proceedings of the International Conference on Computer Vision (ICCV), 2015, pp. 2740-2748.
- T. Berg, J. Liu, S. W. Lee, M. L. Alexander, D. W. Jacobs, P. N. Belhumeur, Birdsnap: Large-scale fine-grained visual categorization of birds, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 2019-2026.
- A. Khosla, N. Jayadevaprakash, B. Yao, F.-F. Li, Novel dataset for fine-grained image categorization: Stanford dogs, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, Vol. 2, 2011.
- L. Yang, P. Luo, C. C. Loy, X. Tang, A large-scale car dataset for fine-grained categorization and verification, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 3973-3981.
- M. Minervini, A. Fischbach, H. Scharr, S. A. Tsaftaris, Finely-grained annotated datasets for image-based plant phenotyping, Pattern Recognition Letters 81 (2016), pp. 80-89. https://doi.org/10.1016/j.patrec.2015.10.013
- G.-S. Xie, X.-Y. Zhang, W. Yang, M.-L. Xu, S. Yan, C.-L. Liu, LG-CNN: From local parts to global discrimination for fine-grained recognition, Pattern Recognition 71 (2017), pp. 118-131. https://doi.org/10.1016/j.patcog.2017.06.002
- S. Branson, G. Van Horn, P. Perona, S. Belongie, Improved bird species recognition using pose normalized deep convolutional nets, in: Proceedings of the British Machine Vision Conference (BMVC), 2014.
- R. Girshick, F. Iandola, T. Darrell, J. Malik, Deformable part models are convolutional neural networks, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 437-446.
- S. J. Nowlan, J. C. Platt, A convolutional neural network hand tracker, in: Proceedings of the Advances in Neural Information Processing Systems (NIPS), 1994, pp. 901-908.
- R. Vaillant, C. Monrocq, Y. Le Cun, Original approach for the localisation of objects in images, IEE Proceedings - Vision, Image and Signal Processing 141 (4) (1994), pp. 245-250. https://doi.org/10.1049/ip-vis:19941301
- M. Everingham, S. A. Eslami, L. Van Gool, C. K. Williams, J. Winn, A. Zisserman, The PASCAL visual object classes challenge: A retrospective, International Journal of Computer Vision (IJCV) 111 (1) (2015), pp. 98-136. https://doi.org/10.1007/s11263-014-0733-5
- T.-Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, P. Dollar, C. L. Zitnick, Microsoft COCO: Common objects in context, in: Proceedings of the European Conference on Computer Vision (ECCV), 2014, pp. 740-755.
- P. Sermanet, D. Eigen, X. Zhang, M. Mathieu, R. Fergus, Y. LeCun, OverFeat: Integrated recognition, localization and detection using convolutional networks, in: Proceedings of the International Conference on Learning Representations (ICLR), 2014.
- L. Gomez, D. Karatzas, Text proposals: a text-specific selective search algorithm for word spotting in the wild, Pattern Recognition 70 (2017), pp. 60-74. https://doi.org/10.1016/j.patcog.2017.04.027
- R. Girshick, J. Donahue, T. Darrell, J. Malik, Rich feature hierarchies for accurate object detection and semantic segmentation, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014, pp. 580-587.
- K. He, X. Zhang, S. Ren, J. Sun, Spatial pyramid pooling in deep convolutional networks for visual recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 37 (9) (2015), pp. 1904-1916. https://doi.org/10.1109/TPAMI.2015.2389824
- R. Girshick, Fast R-CNN, CoRR, abs/1504.08083, 2015.
- S. Ren, K. He, R. Girshick, J. Sun, Faster R-CNN: Towards real-time object detection with region proposal networks, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 39 (6) (2017), pp. 1137-1149. https://doi.org/10.1109/TPAMI.2016.2577031
- J. Redmon, S. Divvala, R. Girshick, A. Farhadi, You only look once: Unified, real-time object detection, in: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 779-788.
- W. Liu, D. Anguelov, D. Erhan, C. Szegedy, S. Reed, SSD: Single shot multibox detector, in: Proceedings of the European Conference on Computer Vision (ECCV), 2016, pp. 21-37.
- C.-Y. Fu, W. Liu, A. Ranga, A. Tyagi, A. C. Berg, DSSD: Deconvolutional single shot detector, arXiv preprint arXiv:1701.06659, 2017.
- A. Shrivastava, R. Sukthankar, J. Malik, A. Gupta, Beyond skip connections: Top-down modulation for object detection, arXiv preprint arXiv:1612.06851, 2016.
- J. Redmon and A. Farhadi. YOLO9000: Better, faster, stronger. In CVPR, 2017.
- K.-S. Fu, J. Mui, A survey on image segmentation, Pattern Recognition 13 (1) (1981), pp. 3-16. https://doi.org/10.1016/0031-3203(81)90028-5
- Q. Zhou, B. Zheng, W. Zhu, L. J. Latecki, Multi-scale context for scene labeling via flexible segmentation graph, Pattern Recognition 59 (2016), pp. 312-324. https://doi.org/10.1016/j.patcog.2016.03.023
- F. Liu, G. Lin, C. Shen, CRF learning with CNN features for image segmentation, Pattern Recognition 48 (10) (2015), pp. 2983-2992. https://doi.org/10.1016/j.patcog.2015.04.019
- S. Bu, P. Han, Z. Liu, J. Han, Scene parsing using inference embedded deep networks, Pattern Recognition 59 (2016), pp. 188-198. https://doi.org/10.1016/j.patcog.2016.01.027
- B. Peng, L. Zhang, D. Zhang, A survey of graph theoretical approaches to image segmentation, Pattern Recognition 46 (3) (2013), pp. 1020-1038. https://doi.org/10.1016/j.patcog.2012.09.015
- J. Long, E. Shelhamer, T. Darrell, Fully convolutional networks for semantic segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI) 39 (4) (2017), pp. 640-651. https://doi.org/10.1109/TPAMI.2016.2572683
- L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, A. L. Yuille, Semantic image segmentation with deep convolutional nets and fully connected crfs, in: Proceedings of the International Conference on Learning Representations (ICLR), 2015.
- K. He, G. Gkioxari, P. Dollar, and R. Girshick. Mask R-CNN. In ICCV, 2017.
- A. Frome, G. S. Corrado, J. Shlens, S. Bengio, J. Dean, T. Mikolov, et al. DeViSE: A deep visual-semantic embedding model. In NIPS, 2013.
- A. Karpathy, A. Joulin, and L. Fei-Fei. Deep fragment embeddings for bidirectional image sentence mapping. arXiv preprint arXiv:1406.5679, 2014.
- J. Johnson, B. Hariharan, L. van der Maaten, L. Fei-Fei, C. L. Zitnick, and R. Girshick. CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning. In CVPR, 2017.
- J. Johnson, B. Hariharan, L. van der Maaten, J. Hoffman, L. Fei-Fei, C. L. Zitnick, and R. Girshick. Inferring and executing programs for visual reasoning. Technical report, Stanford, 2017.