[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.15207/JKCS.2021.12.10.055

A Study on Attention Mechanism in DeepLabv3+ for Deep Learning-based Semantic Segmentation

Shin, SeokYong (Department of Plasma Bio Display, Kwangwoon University)
Lee, SangHun (Ingenium College of Liberal Arts, Kwangwoon University)
Han, HyunHo (College of General Education, University of Ulsan)

Publication Information

Journal of the Korea Convergence Society, v.12, no.10, 2021, pp. 55-61

Abstract

In this paper, we propose a DeepLabv3+-based encoder-decoder model that uses an attention mechanism for precise semantic segmentation. DeepLabv3+ is a deep learning-based semantic segmentation method used mainly in applications such as autonomous vehicles and infrared image analysis. The conventional DeepLabv3+ decoder makes little use of the encoder's intermediate feature maps, which causes information loss during restoration, and this restoration loss reduces segmentation accuracy. The proposed method therefore first minimizes the restoration loss by using one additional intermediate feature map. To use this feature map effectively, the feature maps are fused hierarchically, starting from the smallest. Finally, an attention mechanism is applied to the decoder to maximize its ability to fuse the intermediate feature maps. We evaluated the proposed method on the Cityscapes dataset, which is commonly used in street scene image segmentation research. Experimental results showed that the proposed method improved segmentation results compared with the conventional DeepLabv3+. The proposed method can be used in applications that require high accuracy.
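
The decoder idea described in the abstract (one extra intermediate feature map, hierarchical fusion starting from the smallest map, and attention applied in the decoder) can be illustrated with a rough NumPy sketch. This is not the authors' implementation. The abstract does not state the attention type, the feature-map strides, or the channel counts, so the squeeze-and-excitation style channel gate, the three map sizes, and all function and parameter names below are assumptions made purely for illustration. The convolutions of a real decoder are omitted.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def channel_attention(x, w1, w2):
    """Channel gate in the squeeze-and-excitation style.

    ASSUMPTION: the abstract does not say which attention mechanism is used;
    this form is chosen only as a common, simple example.
    """
    squeeze = x.mean(axis=(1, 2))                # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)       # bottleneck + ReLU
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # sigmoid gate in (0, 1), shape (C,)
    return x * gate[:, None, None]               # reweight each channel

def fuse(small, large, w1, w2):
    """Upsample the smaller map, concatenate with the larger one, then gate."""
    factor = large.shape[1] // small.shape[1]
    merged = np.concatenate([upsample_nearest(small, factor), large], axis=0)
    return channel_attention(merged, w1, w2)

def decode(encoder_out, mid_feat, low_feat, params):
    """Hierarchical fusion: smallest map first, then the next larger one."""
    x = fuse(encoder_out, mid_feat, *params["mid"])
    return fuse(x, low_feat, *params["low"])

# Hypothetical shapes: encoder output, one extra intermediate map, low-level map.
rng = np.random.default_rng(0)
encoder_out = rng.standard_normal((8, 4, 4))
mid_feat = rng.standard_normal((4, 8, 8))
low_feat = rng.standard_normal((2, 16, 16))
params = {
    "mid": (rng.standard_normal((3, 12)), rng.standard_normal((12, 3))),
    "low": (rng.standard_normal((7, 14)), rng.standard_normal((14, 7))),
}
out = decode(encoder_out, mid_feat, low_feat, params)
print(out.shape)  # (14, 16, 16)
```

Fusing from the smallest map upward means each upsampling step only doubles the resolution in this sketch, and the gate lets the decoder weight channels from the different encoder stages before the next fusion step.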

Keywords

Attention mechanism; DeepLab; Deep learning; Convergence image processing; Encoder-decoder; Semantic segmentation

