Parallel Injection Method for Improving Descriptive Performance of Bi-GRU Image Captions
Lee, Jun Hee (Dept. of Electrical and Electronics Engineering, Korea Maritime and Ocean University)
Lee, Soo Hwan (Dept. of Electrical and Electronics Engineering, Korea Maritime and Ocean University)
Tae, Soo Ho (Dept. of Electrical and Electronics Engineering, Korea Maritime and Ocean University)
Seo, Dong Hoan (Div. of Electronics and Electrical Information Engineering, Korea Maritime and Ocean University)
1 | E. Maggiori, Y. Tarabalka, G. Charpiat, and P. Alliez, "Convolutional Neural Networks for Large-scale Remote-sensing Image Classification," IEEE Transactions on Geoscience and Remote Sensing, Vol. 55, No. 2, pp. 645-657, 2017.
2 | D.H. Kim, J.E. Kim, J.H. Song, Y.J. Shin, and S.S. Hwang, "Image-based Intelligent Surveillance System Using Unmanned Aircraft," Journal of Korea Multimedia Society, Vol. 20, No. 3, pp. 437-445, 2017.
3 | S. Yu, S. Jia, and C. Xu, "Convolutional Neural Networks for Hyperspectral Image Classification," Neurocomputing, Vol. 219, pp. 88-98, 2017.
4 | P. Morales-Alvarez, A. Perez-Suay, R. Molina, and G. Camps-Valls, "Remote Sensing Image Classification with Large-scale Gaussian Processes," IEEE Transactions on Geoscience and Remote Sensing, Vol. 56, No. 2, pp. 1103-1114, 2018.
5 | B. Gecer, G. Azzopardi, and N. Petkov, "Colorblob-based COSFIRE Filters for Object Recognition," Image and Vision Computing, Vol. 57, pp. 165-174, 2017.
6 | L.C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A.L. Yuille, "DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs," IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 40, No. 4, pp. 834-848, 2018.
7 | K. Grm, V. Štruc, A. Artiges, M. Caron, and H.K. Ekenel, "Strengths and Weaknesses of Deep Learning Models for Face Recognition Against Image Degradations," IET Biometrics, Vol. 7, No. 1, pp. 81-89, 2017.
8 | J. Cleveland, D. Thakur, P. Dames, C. Phillips, T. Kientz, K. Daniilidis, et al., "Automated System for Semantic Object Labeling with Soft-object Recognition and Dynamic Programming Segmentation," IEEE Transactions on Automation Science and Engineering, Vol. 14, No. 2, pp. 820-833, 2017.
9 | X. Yang, W. Wu, K. Liu, P.W. Kim, A.K. Sangaiah, and G. Jeon, "Long-distance Object Recognition with Image Super Resolution: A Comparative Study," IEEE Access, Vol. 6, pp. 13429-13438, 2018.
10 | D. Marmanis, K. Schindler, J.D. Wegner, S. Galliani, M. Datcu, and U. Stilla, "Classification with an Edge: Improving Semantic Image Segmentation with Boundary Detection," ISPRS Journal of Photogrammetry and Remote Sensing, Vol. 135, pp. 158-172, 2018.
11 | K. Simonyan and A. Zisserman, "Very Deep Convolutional Networks for Large-scale Image Recognition," arXiv Preprint arXiv:1409.1556, 2014.
12 | C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, et al., "Going Deeper with Convolutions," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1-9, 2015.
13 | K. He, X. Zhang, S. Ren, and J. Sun, "Deep Residual Learning for Image Recognition," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770-778, 2016.
14 | J. Redmon, S. Divvala, R. Girshick, and A. Farhadi, "You Only Look Once: Unified, Real-time Object Detection," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 779-788, 2016.
15 | S. Ren, K. He, R. Girshick, and J. Sun, "Faster R-CNN: Towards Real-time Object Detection with Region Proposal Networks," Advances in Neural Information Processing Systems, pp. 91-99, 2015.
16 | O. Vinyals, A. Toshev, S. Bengio, and D. Erhan, "Show and Tell: A Neural Image Caption Generator," Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3156-3164, 2015.
17 | J. Chung, C. Gulcehre, K. Cho, and Y. Bengio, "Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling," arXiv Preprint arXiv:1412.3555, 2014.
18 | K. Xu, J. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov, et al., "Show, Attend and Tell: Neural Image Caption Generation with Visual Attention," Proceedings of the International Conference on Machine Learning, pp. 2048-2057, 2015.
19 | J. Mao, W. Xu, Y. Yang, J. Wang, Z. Huang, and A. Yuille, "Deep Captioning with Multimodal Recurrent Neural Networks (m-RNN)," arXiv Preprint arXiv:1412.6632, 2014.
20 | M. Schuster and K.K. Paliwal, "Bidirectional Recurrent Neural Networks," IEEE Transactions on Signal Processing, Vol. 45, No. 11, pp. 2673-2681, 1997.
21 | M. Hodosh, P. Young, and J. Hockenmaier, "Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics," Journal of Artificial Intelligence Research, Vol. 47, pp. 853-899, 2013.
22 | P. Young, A. Lai, M. Hodosh, and J. Hockenmaier, "From Image Descriptions to Visual Denotations: New Similarity Metrics for Semantic Inference over Event Descriptions," Transactions of the Association for Computational Linguistics, Vol. 2, pp. 67-78, 2014.
23 | J. Guan and E. Wang, "Repeated Review Based Image Captioning for Image Evidence Review," Signal Processing: Image Communication, Vol. 63, pp. 141-148, 2018.
24 | T.Y. Lin, M. Maire, S. Belongie, J. Hays, P. Perona, D. Ramanan, et al., "Microsoft COCO: Common Objects in Context," arXiv Preprint arXiv:1405.0312, 2014.
25 | K. Papineni, S. Roukos, T. Ward, and W.J. Zhu, "BLEU: a Method for Automatic Evaluation of Machine Translation," Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics, pp. 311-318, 2002.
26 | S. Banerjee and A. Lavie, "METEOR: An Automatic Metric for MT Evaluation with Improved Correlation with Human Judgments," Proceedings of the Association for Computational Linguistics Workshop on Intrinsic and Extrinsic Evaluation Measures for Machine Translation and/or Summarization, pp. 65-72, 2005.