Modified Pyramid Scene Parsing Network with Deep Learning based Multi Scale Attention

  • Kim, Jun-Hyeok (Dept. of Plasma Bio Display, KwangWoon University) ;
  • Lee, Sang-Hun (Ingenium College of Liberal Arts, KwangWoon University) ;
  • Han, Hyun-Ho (College of General Education, University of Ulsan)
  • Received : 2021.08.17
  • Accepted : 2021.11.20
  • Published : 2021.11.28

Abstract

With the development of deep learning, semantic segmentation methods are being studied in various fields. However, segmentation accuracy remains insufficient in fields that demand high accuracy, such as medical image analysis. In this paper, we improve PSPNet, a deep learning based segmentation model, to minimize the loss of features during semantic segmentation. Conventional deep learning based segmentation methods reduce resolution during feature extraction and compression, which causes object features to be lost. As a result, the edge and internal information of objects is lost and segmentation accuracy drops. To prevent this feature loss, we add the proposed multi-scale attention to the conventional PSPNet. We also apply an attention method to the conventional pyramid pooling module (PPM) to refine its features. By suppressing unnecessary feature information, edge and texture information is improved. The proposed method was trained on the Cityscapes dataset and evaluated quantitatively with the mean intersection over union (MIoU) segmentation metric. In the experiments, segmentation accuracy improved by about 1.5% compared with the conventional PSPNet.
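For readers unfamiliar with the evaluation metric, the following is a minimal sketch of how MIoU is commonly computed from flattened per-pixel labels. It is not the authors' evaluation code: the function names `confusion_matrix` and `mean_iou`, the toy labels, and the use of 255 as an ignored label (a common Cityscapes convention) are assumptions made for illustration.

```python
from typing import List, Sequence


def confusion_matrix(pred: Sequence[int], target: Sequence[int],
                     num_classes: int, ignore_index: int = 255) -> List[List[int]]:
    """Count pixels per (true class, predicted class) pair.

    Pixels whose ground-truth label equals ignore_index are skipped
    (assumed convention, not taken from the paper).
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average IoU = TP / (TP + FP + FN) over classes that appear at all."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # true class c, predicted otherwise
        fp = sum(cm[r][c] for r in range(n)) - tp  # predicted c, true class differs
        denom = tp + fp + fn
        if denom > 0:                              # skip classes absent from both maps
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six pixels, three classes, last pixel ignored.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
cm = confusion_matrix(pred, target, num_classes=3)
print(f"MIoU = {mean_iou(cm):.4f}")  # per-class IoU: 1/2, 2/3, 1 -> 0.7222
```

In practice the confusion matrix is accumulated over every image in the validation set before the per-class IoU values are averaged, so that each class is weighted by its total pixel count rather than per image.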
