DNN Based Multi-spectrum Pedestrian Detection Method Using Color and Thermal Image

  • Lee, Yongwoo (이용우), School of Electronic and Electrical Engineering, Sungkyunkwan University
  • Shin, Jitae (신지태), School of Electronic and Electrical Engineering, Sungkyunkwan University
  • Received : 2018.03.30
  • Accepted : 2018.04.30
  • Published : 2018.05.30

Abstract

As autonomous driving research advances rapidly, pedestrian detection has also been studied with considerable success. However, most studies rely on color image datasets, many of which contain pedestrians that are relatively easy to detect. A color camera captures a pedestrian well only when the scene is sufficiently lit, so conventional methods struggle to detect pedestrians under poor illumination. In this paper, we therefore propose a deep neural network (DNN)-based multi-spectrum pedestrian detection method that uses both color and thermal images. Building on the single-shot multibox detector (SSD), we propose fusion network structures that take color and thermal images as simultaneous inputs. The experiments use the KAIST dataset. The proposed SSD-H (SSD-Halfway fusion) technique achieves a miss rate 18.18% lower than the KAIST pedestrian detection baseline. It also achieves a miss rate at least 2.1% lower than the conventional halfway fusion method.
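
To illustrate the general idea behind halfway fusion, the following toy sketch fuses a color feature map and a thermal feature map partway through a network. The abstract does not specify the fusion operation, so this sketch assumes channel-wise concatenation followed by a 1x1 convolution; the names `concat_channels`, `conv1x1`, and `halfway_fuse`, the tiny feature maps, and the weights are all hypothetical and are not taken from the paper's implementation.

```python
from typing import List

# A feature map is stored as nested lists indexed [channel][row][col].
FeatureMap = List[List[List[float]]]


def concat_channels(color: FeatureMap, thermal: FeatureMap) -> FeatureMap:
    """Stack two feature maps along the channel axis."""
    same_height = len(color[0]) == len(thermal[0])
    same_width = len(color[0][0]) == len(thermal[0][0])
    if not (same_height and same_width):
        raise ValueError("feature maps must share height and width")
    return color + thermal


def conv1x1(fmap: FeatureMap, weights: List[List[float]]) -> FeatureMap:
    """Apply a 1x1 convolution without bias.

    Each row of `weights` produces one output channel as a weighted sum
    of the input channels at every spatial position.
    """
    height, width = len(fmap[0]), len(fmap[0][0])
    fused = []
    for channel_weights in weights:
        if len(channel_weights) != len(fmap):
            raise ValueError("one weight per input channel is required")
        fused.append([
            [
                sum(w * fmap[c][y][x] for c, w in enumerate(channel_weights))
                for x in range(width)
            ]
            for y in range(height)
        ])
    return fused


def halfway_fuse(color: FeatureMap, thermal: FeatureMap,
                 weights: List[List[float]]) -> FeatureMap:
    """Assumed fusion step: concatenate mid-level features, then mix with 1x1 conv."""
    return conv1x1(concat_channels(color, thermal), weights)


# Hypothetical mid-level features from the color and thermal branches (1 channel, 2x2).
color_features = [[[1.0, 2.0], [3.0, 4.0]]]
thermal_features = [[[10.0, 20.0], [30.0, 40.0]]]

# One output channel that averages the two input channels (illustrative weights).
fusion_weights = [[0.5, 0.5]]

print(halfway_fuse(color_features, thermal_features, fusion_weights))
# prints [[[5.5, 11.0], [16.5, 22.0]]]
```

In a real detector the branch features would come from convolutional layers applied to the color and thermal inputs, and the fusion weights would be learned during training; the fused map would then feed the remaining SSD layers.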
