A Deep Neural Network Architecture for Real-Time Semantic Segmentation on Embedded Board

  • 이준엽 (Information and Communication Engineering, Inha University)
  • 이영완 (Information and Communication Engineering, Inha University)
  • Received : 2017.05.22
  • Accepted : 2017.10.18
  • Published : 2018.01.15

Abstract

We propose Wide Inception ResNet (WIR Net), a neural network architecture optimized for real-time semantic segmentation in autonomous driving. The architecture consists of an encoder, which extracts features using residual connections and inception modules, and a decoder, which restores resolution using transposed convolutions and feature maps from lower layers. We further improve performance by applying the ELU activation function, and we optimize the network by reducing the number of layers while increasing the number of filters. We evaluate class and category IoU on the Cityscapes dataset of driving scenes, using an NVIDIA GeForce GTX 1080 and a TX1 board. The experiments show a class IoU of 53.4 and a category IoU of 81.8, with execution speeds of 17.8 fps and 13.0 fps on the TX1 board for $640{\times}360$ and $720{\times}480$ images, respectively.
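The class and category IoU figures quoted above are intersection-over-union scores, IoU = TP / (TP + FP + FN), averaged over labels. The following is a minimal sketch of that computation on flattened label maps, not the authors' evaluation code. The function names `per_class_iou` and `mean_iou`, the `ignore_label` value of 255, and the toy data are assumptions made for illustration; a benchmark evaluation would accumulate the counts over the whole dataset rather than a single image.

```python
from collections import Counter


def per_class_iou(pred, truth, num_classes, ignore_label=255):
    """Return IoU per class, or None for classes absent from both maps.

    pred and truth are flat sequences of integer labels of equal length.
    Pixels whose ground-truth label equals ignore_label (an assumed
    convention here) are excluded from all counts.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for p, t in zip(pred, truth):
        if t == ignore_label:
            continue
        if p == t:
            tp[t] += 1          # correctly labelled pixel of class t
        else:
            fp[p] += 1          # predicted class p where it is not present
            fn[t] += 1          # missed a pixel of class t
    ious = []
    for c in range(num_classes):
        denom = tp[c] + fp[c] + fn[c]
        ious.append(tp[c] / denom if denom else None)
    return ious


def mean_iou(ious):
    """Average the IoU values, skipping classes that never occurred."""
    valid = [v for v in ious if v is not None]
    return sum(valid) / len(valid) if valid else 0.0


if __name__ == "__main__":
    # Toy example with three classes; the last pixel is ignored.
    pred = [0, 0, 1, 1, 2, 2]
    truth = [0, 1, 1, 1, 2, 255]
    ious = per_class_iou(pred, truth, num_classes=3)
    print(ious, mean_iou(ious))
```

In the toy example, class 0 has one true positive and one false positive (IoU 0.5), class 1 has two true positives and one false negative (IoU 2/3), and class 2 has one true positive (IoU 1.0). A category-level score follows the same formula after mapping each class label to its category.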
