Trends in Object Detection Performance Using Deep Convolutional Neural Networks

Jo, Seon-Yeong; Sin, Yeong-Suk

Broadcasting and Media Magazine (방송과미디어)

Volume 22, Issue 1, Pages 19-33, 2017, pISSN 2383-9708

The Korean Institute of Broadcast and Media Engineers (한국방송∙미디어공학회)

Jo, Seon-Yeong (조선영), Agency for Defense Development (국방과학연구소)
Sin, Yeong-Suk (신영숙), Agency for Defense Development (국방과학연구소)

Published: 2017.01.30

Abstract

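Advances in new video media service technologies are driving demand for a wide range of image recognition techniques. In particular, detecting specific objects in images is a core technology that opens up diverse applications, such as object-related advertising and services. For object detection to be actively adopted in broadcast media technology, algorithms that are both fast and accurate are essential. This paper analyzes object detection methods based on Deep Convolutional Neural Networks, which outperform traditional object detection methods. We present the research background and development trends of the major recently introduced object detection methods, and analyze the core algorithm, strengths, and weaknesses of each. We also introduce the representative datasets used to evaluate object detection performance, and compare the methods' performance in terms of network architecture and size, training data, and other factors. Finally, based on this analysis of existing methods, we attempt to predict the future direction and potential applications of object detection methods.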

