http://dx.doi.org/10.15207/JKCS.2021.12.10.055

A Study on Attention Mechanism in DeepLabv3+ for Deep Learning-based Semantic Segmentation  

Shin, SeokYong (Department of Plasma Bio Display, Kwangwoon University)
Lee, SangHun (Ingenium College of Liberal Arts, Kwangwoon University)
Han, HyunHo (College of General Education, University of Ulsan)
Publication Information
Journal of the Korea Convergence Society / v.12, no.10, 2021, pp. 55-61
Abstract
In this paper, we propose a DeepLabv3+-based encoder-decoder model that uses an attention mechanism for precise semantic segmentation. DeepLabv3+ is a deep learning-based semantic segmentation method used mainly in applications such as autonomous vehicles and infrared image analysis. The conventional DeepLabv3+ makes little use of the encoder's intermediate feature maps in the decoder, which causes loss during the restoration process, and this restoration loss reduces segmentation accuracy. The proposed method therefore first minimizes the restoration loss by using one additional intermediate feature map. To use this map effectively, we fuse the feature maps hierarchically, starting from the smallest one. Finally, we apply an attention mechanism to the decoder to maximize its ability to fuse the intermediate feature maps. We evaluated the proposed method on the Cityscapes dataset, which is commonly used in street scene image segmentation research. Experimental results showed that the proposed method improved segmentation results compared with the conventional DeepLabv3+. The proposed method can be used in applications that require high accuracy.
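The decoder idea described in the abstract (fusing feature maps hierarchically from the smallest one, with attention applied during fusion) can be illustrated with a minimal sketch. The abstract does not specify the attention type, the fusion operator, or the feature-map resolutions, so everything below is an assumption made for illustration: each successive map is taken to have twice the spatial resolution of the previous one, fusion is channel concatenation after nearest-neighbour upsampling, and the attention is a parameter-free channel gate (a learned block would be used in practice). The function names `upsample2x`, `channel_attention`, `fuse`, and `decode` are hypothetical and are not taken from the paper.

```python
import math

def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a feature map stored as [C][H][W] lists."""
    out = []
    for channel in fmap:
        rows = []
        for row in channel:
            wide = [v for v in row for _ in range(2)]  # repeat each column twice
            rows.append(wide)
            rows.append(list(wide))                    # repeat each row twice
        out.append(rows)
    return out

def channel_attention(fmap):
    """Scale each channel by sigmoid(global average of that channel).

    Assumption: a stand-in for a learned attention block; it has no weights.
    """
    gated = []
    for channel in fmap:
        count = sum(len(row) for row in channel)
        mean = sum(sum(row) for row in channel) / count
        gate = 1.0 / (1.0 + math.exp(-mean))
        gated.append([[v * gate for v in row] for row in channel])
    return gated

def fuse(small, large):
    """Upsample the smaller map, concatenate channels with the larger map, apply attention."""
    up = upsample2x(small)
    assert len(up[0]) == len(large[0]) and len(up[0][0]) == len(large[0][0]), \
        "maps must differ by exactly 2x in height and width"
    return channel_attention(up + large)  # list concatenation = channel concat

def decode(features):
    """Fuse maps hierarchically; `features` is ordered from smallest to largest."""
    fused = features[0]
    for larger in features[1:]:
        fused = fuse(fused, larger)
    return fused

# Toy input: three single-channel maps of size 1x1, 2x2, and 4x4 (hypothetical values).
small = [[[2.0]]]
mid = [[[0.0, 0.0], [0.0, 0.0]]]
large = [[[0.0] * 4 for _ in range(4)]]
out = decode([small, mid, large])
print(len(out), len(out[0]), len(out[0][0]))  # channels, height, width
```

In this toy run the smallest map is merged into the middle map first, and that result is then merged into the largest map, so coarse information is carried upward one level at a time rather than being upsampled to full size in a single step.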
Keywords
Attention mechanism; DeepLab; Deep learning; Convergence image processing; Encoder-decoder; Semantic segmentation