Multi-scale face detector using anchor free method

  • Lee, Dong-Ryeol (School of Computer Science and Engineering, Kangwon National University)
  • Kim, Yoon (Dept. of Computer Science and Engineering, Kangwon National University)
  • Received : 2020.06.11
  • Accepted : 2020.07.03
  • Published : 2020.07.31

Abstract

In this paper, we propose a one-stage, multi-scale face detector based on a Fully Convolutional Network (FCN) that uses an anchor-free method. Almost all recent state-of-the-art face detectors predict face locations with anchor-based methods that rely on pre-defined anchor boxes. However, these detectors require anchor-related hyper-parameters and additional computation during training. The key idea of the proposed method is to eliminate these hyper-parameters and this additional computation by using an anchor-free method. To do this, we apply two ideas. First, by eliminating the pre-defined set of anchor boxes, we avoid the additional computation and hyper-parameters related to anchor boxes. Second, our detector predicts face locations from multiple feature maps to reduce the foreground/background imbalance issue. The performance of the proposed method is evaluated and analyzed through quantitative evaluation. Experimental results on the FDDB dataset demonstrate the effectiveness of the proposed method.

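This paper proposes a one-stage, multi-scale face detector based on an FCN (Fully Convolutional Network) that uses an anchor-free method. Most recent studies predict likely face locations using pre-defined anchors. However, using pre-defined anchors requires hyper-parameter settings and additional computation during training. The key idea of the proposed method is to eliminate the hyper-parameters by using an anchor-free method and to alleviate the intra-class imbalance problem by using multiple feature maps. These methods have the following effects. First, removing the pre-defined anchors avoids the anchor-related hyper-parameters and additional computation. Second, faces are predicted from multiple feature maps to alleviate the intra-class imbalance. The detection performance of the proposed method is evaluated and analyzed through quantitative evaluation. Experimental results on the FDDB (Face Detection Dataset & Benchmark) dataset show that the proposed method is effective.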
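
The abstract does not give the regression targets or the rule that assigns faces to feature maps, so the following is only a minimal sketch of how anchor-free prediction over several feature maps can work, assuming an FCOS-style formulation. Each feature-map location regresses its distances to the four sides of the face box that contains it, and a per-level size range decides which feature map handles a face of a given scale. The names `anchor_free_targets`, `decode`, and `size_range`, as well as the stride and range values, are hypothetical and not taken from the paper.

```python
import numpy as np


def anchor_free_targets(box, feat_h, feat_w, stride, size_range):
    """Per-location targets for one ground-truth face box (FCOS-style assumption).

    box        : (x0, y0, x1, y1) in image pixels.
    stride     : down-sampling factor of this feature map.
    size_range : (lo, hi) bounds on the largest regressed distance that this
                 feature map is responsible for (hypothetical values).
    Returns (mask, ltrb): mask[i, j] is True for positive locations, and
    ltrb[i, j] holds the distances to the left, top, right, and bottom edges.
    """
    x0, y0, x1, y1 = box
    # Map every feature-map cell back to a point in the input image.
    xs = np.arange(feat_w) * stride + stride // 2
    ys = np.arange(feat_h) * stride + stride // 2
    cx, cy = np.meshgrid(xs, ys)  # both have shape (feat_h, feat_w)

    # Distances from each point to the four box sides; no anchor boxes needed.
    ltrb = np.stack([cx - x0, cy - y0, x1 - cx, y1 - cy], axis=-1)

    inside = ltrb.min(axis=-1) > 0   # point lies strictly inside the box
    longest = ltrb.max(axis=-1)      # scale cue used to pick the feature level
    lo, hi = size_range
    mask = inside & (longest >= lo) & (longest <= hi)
    return mask, ltrb


def decode(cx, cy, ltrb):
    """Invert the encoding: turn a point plus its distances back into a box."""
    l, t, r, b = ltrb
    return (cx - l, cy - t, cx + r, cy + b)


if __name__ == "__main__":
    face = (16, 16, 48, 48)
    # A fine map (stride 8) takes small faces; a coarse map (stride 16) takes large ones.
    fine_mask, fine_ltrb = anchor_free_targets(face, 8, 8, 8, (0, 64))
    coarse_mask, _ = anchor_free_targets(face, 4, 4, 16, (64, 128))
    print("positives on fine map:", int(fine_mask.sum()))
    print("positives on coarse map:", int(coarse_mask.sum()))
    print("decoded box:", decode(20, 20, fine_ltrb[2, 2]))
```

In this sketch the only scale-related settings are the per-level size ranges; there are no anchor scales, aspect ratios, or IoU matching thresholds to tune, which is the kind of saving the abstract attributes to the anchor-free design.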
