http://dx.doi.org/10.3837/tiis.2020.11.010

DA-Res2Net: a novel Densely connected residual Attention network for image semantic segmentation  

Zhao, Xiaopin (Institute of Information Science, Beijing Jiaotong University)
Liu, Weibin (Institute of Information Science, Beijing Jiaotong University)
Xing, Weiwei (School of Software Engineering, Beijing Jiaotong University)
Wei, Xiang (School of Software Engineering, Beijing Jiaotong University)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.14, no.11, 2020, pp. 4426-4442
Abstract
Scene segmentation has become an active research topic in autonomous driving and medical image analysis, and researchers continue to try new methods to improve segmentation accuracy. At present, the main issues in image semantic segmentation are intra-class inconsistency and inter-class indistinction. Our analysis attributes them to two causes: a lack of global information and a lack of macroscopic discrimination of the object. In this paper, we propose a Densely connected residual Attention network (DA-Res2Net), which consists of a dense residual network and a channel attention guidance module, to address these problems and improve the accuracy of image segmentation. Specifically, to give the extracted features stronger multi-scale characteristics, we propose a densely connected residual network as the feature extractor. Furthermore, to improve the representativeness of each channel feature, we design a Channel-Attention-Guide module that makes the model focus on high-level semantic features and low-level location features simultaneously. Experimental results show that the method performs well on several datasets. In comparison with other state-of-the-art methods, the proposed method reaches a mean IoU of 83.2% on PASCAL VOC 2012 and 79.7% on Cityscapes.
Keywords
Semantic segmentation; Densely connected; Attention network; Channel-Attention-Guide module; Feature fusion
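
The abstract reports results as mean IoU (intersection over union averaged over classes). As a reminder of how that metric is commonly computed, here is a minimal sketch in plain Python. It is not the authors' evaluation code. The function names `confusion_matrix` and `mean_iou` and the toy labels are assumptions made for illustration, and the sketch does not handle the "ignore" label that benchmark toolkits usually exclude.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int) -> List[List[int]]:
    """Count pixels; rows are ground-truth classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def mean_iou(y_true: Sequence[int], y_pred: Sequence[int],
             num_classes: int) -> float:
    """Average per-class IoU = TP / (TP + FP + FN).

    Classes that appear in neither the ground truth nor the prediction
    are skipped (an assumption of this sketch, not a benchmark rule).
    """
    matrix = confusion_matrix(y_true, y_pred, num_classes)
    ious = []
    for c in range(num_classes):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                                  # missed pixels of class c
        fp = sum(matrix[r][c] for r in range(num_classes)) - tp   # pixels wrongly labeled c
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy data: a flattened 2x3 label map with three classes.
ground_truth = [0, 0, 1, 1, 2, 2]
prediction = [0, 1, 1, 1, 2, 0]
score = mean_iou(ground_truth, prediction, num_classes=3)
print(f"mean IoU = {score:.3f}")
```

In the toy example the per-class IoUs are 1/3, 2/3 and 1/2, so the mean is about 0.5. Figures such as 83.2% are this same average taken over all pixels of a benchmark's test set.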