[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.22156/CS4SMB.2021.11.10.045

Atrous Residual U-Net for Semantic Segmentation in Street Scenes based on Deep Learning

Shin, SeokYong (Department of Plasma Bio Display, Kwangwoon University)
Lee, SangHun (Ingenium College of Liberal Arts, Kwangwoon University)
Han, HyunHo (College of General Education, University of Ulsan)

Publication Information

Journal of Convergence for Information Technology, v.11, no.10, 2021, pp. 45-52

Abstract

In this paper, we propose an Atrous Residual U-Net (AR-UNet) to improve the accuracy of U-Net-based semantic segmentation. U-Net is mainly used in fields such as medical image analysis, autonomous vehicles, and remote sensing. The conventional U-Net extracts too few features because its encoder contains only a small number of convolution layers. Extracted features are essential for classifying object categories, and when they are insufficient, segmentation accuracy drops. To address this problem, AR-UNet applies residual learning and atrous spatial pyramid pooling (ASPP) in the encoder. Residual learning improves feature extraction and helps prevent the feature loss and vanishing gradients caused by consecutive convolutions. ASPP, in turn, enables additional feature extraction without reducing the resolution of the feature map. We verified the effectiveness of AR-UNet on the Cityscapes dataset. The experimental results showed that AR-UNet produced better segmentation results than the conventional U-Net. AR-UNet can therefore contribute to the many applications in which accuracy is important.
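The two encoder ideas named in the abstract can be illustrated with a toy 1-D sketch. This is not the authors' implementation: it assumes a single channel, a fixed 3-tap kernel, zero padding, and illustrative dilation rates (1, 2, 3), and the function names `atrous_conv1d`, `residual_block`, and `aspp` are hypothetical. It only shows that an atrous convolution widens the receptive field while keeping the output the same length as the input, and that a residual block adds the input back to the convolution output.

```python
from typing import List, Sequence


def atrous_conv1d(x: Sequence[float], kernel: Sequence[float], rate: int) -> List[float]:
    """Zero-padded 1-D atrous (dilated) convolution.

    Kernel taps are spaced `rate` samples apart, so the receptive field grows
    with `rate` while the output keeps the input length. As in most deep
    learning libraries, the kernel is applied without flipping.
    """
    if len(kernel) % 2 == 0:
        raise ValueError("this sketch assumes an odd kernel length")
    half = len(kernel) // 2
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - half) * rate
            if 0 <= idx < len(x):  # positions outside the signal count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


def residual_block(x: Sequence[float], kernel: Sequence[float], rate: int = 1) -> List[float]:
    """Residual learning: y = x + F(x), with F a single convolution here."""
    fx = atrous_conv1d(x, kernel, rate)
    return [a + b for a, b in zip(x, fx)]


def aspp(x: Sequence[float], kernel: Sequence[float], rates: Sequence[int] = (1, 2, 3)) -> List[List[float]]:
    """ASPP-style parallel branches: one atrous convolution per rate.

    Branches are returned as stacked "channels"; the rates are assumptions
    chosen to suit the short toy signal.
    """
    return [atrous_conv1d(x, kernel, r) for r in rates]


signal = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # unit impulse at index 3
kernel = [1.0, 1.0, 1.0]

branches = aspp(signal, kernel)
skip_out = residual_block(signal, kernel)

for rate, branch in zip((1, 2, 3), branches):
    print(f"rate={rate}: {branch}")
print(f"residual: {skip_out}")
```

Every branch has the same length as the input, which is the 1-D analogue of extracting multi-scale features without reducing feature-map resolution. The residual output keeps the original impulse on top of the convolved signal, so the input information is not lost even if the convolution contributes little.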

Keywords

ASPP; Deep learning; Encoder-Decoder; Residual learning; Semantic segmentation; U-Net
