http://dx.doi.org/10.22156/CS4SMB.2021.11.10.045

Atrous Residual U-Net for Semantic Segmentation in Street Scenes based on Deep Learning  

Shin, SeokYong (Department of Plasma Bio Display, Kwangwoon University)
Lee, SangHun (Ingenium College of Liberal Arts, Kwangwoon University)
Han, HyunHo (College of General Education, University of Ulsan)
Publication Information
Journal of Convergence for Information Technology / v.11, no.10, 2021, pp. 45-52
Abstract
In this paper, we propose the Atrous Residual U-Net (AR-UNet) to improve the segmentation accuracy of U-Net-based semantic segmentation. U-Net is widely used in fields such as medical image analysis, autonomous vehicles, and remote sensing. The conventional U-Net extracts too few features because its encoder has only a small number of convolution layers. These features are essential for classifying object categories, and when they are insufficient, segmentation accuracy drops. To address this problem, we propose AR-UNet, which applies residual learning and atrous spatial pyramid pooling (ASPP) in the encoder. Residual learning improves feature extraction and is effective in preventing the feature loss and vanishing gradients caused by successive convolutions. In addition, ASPP enables additional feature extraction without reducing the resolution of the feature map. We verified the effectiveness of AR-UNet in experiments on the Cityscapes dataset. The results showed that AR-UNet produced better segmentation results than the conventional U-Net. AR-UNet can thus contribute to the many applications where accuracy is important.
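The two encoder ideas named in the abstract can be illustrated with a toy sketch. The code below is not the authors' implementation: it works on a 1-D signal in plain Python, and the kernel weights, dilation rates, and function names (`atrous_conv1d`, `aspp1d`, `residual`) are hypothetical choices made only for illustration. It shows that atrous (dilated) convolution widens the receptive field while keeping the output the same length as the input, and that a residual connection adds the input back to the transformed features.

```python
# Toy 1-D illustration only; NOT the paper's AR-UNet implementation.
# Kernel weights, dilation rates, and function names are hypothetical.

def atrous_conv1d(signal, kernel, rate):
    """Dilated ("atrous") 1-D convolution, stride 1, zero padding.

    Kernel taps are spaced `rate` samples apart, so the receptive field
    grows with `rate` while the output keeps the input length.
    """
    half = len(kernel) // 2  # assumes an odd-length kernel
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate
            if 0 <= j < len(signal):  # taps outside the signal read as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def aspp1d(signal, kernel, rates):
    """ASPP-style block: one parallel atrous branch per dilation rate.

    Returns one feature list per branch. A real ASPP module would
    concatenate the branches along the channel axis and fuse them.
    """
    return [atrous_conv1d(signal, kernel, r) for r in rates]


def residual(signal, transform):
    """Residual learning: output = input + transform(input)."""
    fx = transform(signal)
    return [x + f for x, f in zip(signal, fx)]


x = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # unit impulse
kernel = [1.0, 1.0, 1.0]                  # hypothetical weights
rates = (1, 2, 3)                         # hypothetical dilation rates

branches = aspp1d(x, kernel, rates)
for rate, branch in zip(rates, branches):
    print(rate, branch)

y = residual(x, lambda s: atrous_conv1d(s, kernel, 2))
print(y)
```

With the unit impulse as input, each branch spreads the impulse to positions `rate` samples apart, and every branch has the same length as `x`, which mirrors the claim that ASPP extracts additional features without reducing resolution. In the residual output the original impulse is still present on top of the convolved features, which is the sense in which the shortcut helps prevent feature loss.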
Keywords
ASPP; Deep learning; Encoder-Decoder; Residual learning; Semantic segmentation; U-Net;