[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.15207/JKCS.2021.12.11.045

Modified Pyramid Scene Parsing Network with Deep Learning based Multi Scale Attention

Kim, Jun-Hyeok (Dept. of Plasma Bio Display, KwangWoon University)
Lee, Sang-Hun (Ingenium College of Liberal Arts, KwangWoon University)
Han, Hyun-Ho (College of General Education, University of Ulsan)

Publication Information

Journal of the Korea Convergence Society / v.12, no.11, 2021, pp. 45-51

Abstract

With the development of deep learning, semantic segmentation methods are being studied in various fields. However, segmentation accuracy drops in fields that demand high accuracy, such as medical image analysis. In this paper, we improve PSPNet, a deep learning based semantic segmentation model, to minimize the loss of features during semantic segmentation. Conventional deep learning based segmentation methods suffer from reduced resolution and loss of object features during feature extraction and compression. These losses discard the edge and internal information of objects, which lowers segmentation accuracy. To address this, we add the proposed multi-scale attention to the conventional PSPNet to prevent the loss of object features. Features are refined by applying the attention method to the conventional pyramid pooling module (PPM). Suppressing unnecessary feature information improves edge and texture information. The proposed method was trained on the Cityscapes dataset and evaluated quantitatively using the mean intersection over union (MIoU) segmentation metric. In the experiments, segmentation accuracy improved by about 1.5% compared to the conventional PSPNet.
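
For readers unfamiliar with the MIoU metric named in the abstract, the following is a minimal sketch of how mean intersection over union is commonly computed from flattened per-pixel label lists. It is not the authors' evaluation code. The function name `mean_iou`, the `ignore_index` value of 255 (the unlabeled value used in the Cityscapes training label convention), and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration.

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             num_classes: int, ignore_index: int = 255) -> float:
    """Mean IoU over classes (hypothetical helper, not from the paper).

    pred and target are flattened per-pixel class labels of equal length.
    Pixels whose ground-truth label equals ignore_index are skipped.
    """
    intersection: List[int] = [0] * num_classes
    union: List[int] = [0] * num_classes

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabeled pixel: excluded from the metric
        if p == t:
            # Correct pixel: counts once toward intersection and union.
            intersection[t] += 1
            union[t] += 1
        else:
            # Wrong pixel: enlarges the union of the true class and,
            # if it is a valid class, of the predicted class as well.
            union[t] += 1
            if 0 <= p < num_classes:
                union[p] += 1

    # Assumption: classes that never occur (union == 0) are left out
    # of the average rather than counted as IoU 0.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    pred = [0, 0, 1, 1, 2, 2]
    target = [0, 1, 1, 1, 2, 0]
    print(mean_iou(pred, target, num_classes=3))
```

In this toy example the per-class IoU values are 1/3, 2/3, and 1/2. In practice the intersection and union counts are accumulated over the whole validation set before dividing, rather than averaged per image.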

Keywords

Deep learning; Image processing; Multi scale; Semantic segmentation; ResNeXt

