Multi-scale face detector using anchor free method

Lee, Dong-Ryeol; Kim, Yoon

doi:10.9708/jksci.2020.25.07.047

Journal of the Korea Society of Computer and Information (한국컴퓨터정보학회논문지)

Volume 25 Issue 7 / Pages.47-55 / 2020 / 1598-849X(pISSN) / 2383-9945(eISSN)

Korean Society of Computer Information (한국컴퓨터정보학회)

Lee, Dong-Ryeol (School of Computer Science and Engineering, Kangwon National University) ;
Kim, Yoon (Dept. of Computer Science and Engineering, Kangwon National University)

Received : 2020.06.11
Accepted : 2020.07.03
Published : 2020.07.31

https://doi.org/10.9708/jksci.2020.25.07.047

Abstract

In this paper, we propose a one-stage multi-scale face detector based on a Fully Convolutional Network (FCN) that uses an anchor-free method. Almost all recent state-of-the-art face detectors predict face locations with anchor-based methods that rely on pre-defined anchor boxes. However, these detectors require anchor-related hyper-parameters and additional computation during training. The key idea of the proposed method is to eliminate these hyper-parameters and this additional computation by using an anchor-free method. To do this, we apply two ideas. First, by eliminating the pre-defined set of anchor boxes, we avoid the additional computation and hyper-parameters related to anchor boxes. Second, our detector predicts face locations from multiple feature maps to reduce the foreground/background imbalance issue. The performance of the proposed method is evaluated and analyzed through quantitative evaluation. Experimental results on the FDDB dataset demonstrate the effectiveness of the proposed method.

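This paper proposes a one-stage multi-scale face detector based on an FCN (Fully Convolutional Network) that uses an anchor-free method. Most recent studies predict likely face locations using pre-defined anchors. However, using pre-defined anchors requires hyper-parameter settings and additional computation during training. The key idea of the proposed method is to eliminate the hyper-parameters by using an anchor-free method and to mitigate the intra-class imbalance problem by using multiple feature maps. These approaches have the following effects. First, eliminating the pre-defined anchors avoids the anchor-related hyper-parameters and additional computation. Second, faces are predicted from multiple feature maps in order to mitigate the intra-class imbalance. The detection performance of the proposed method is evaluated and analyzed through quantitative evaluation. Experimental results on the FDDB (Face Detection Dataset & Benchmark) dataset show that the proposed method is effective.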
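
The abstract does not spell out how locations are encoded, so the following is only an illustrative sketch of one common anchor-free formulation (per-location distances to the four box sides, in the style of FCOS), not the authors' implementation. The strides, the per-level size ranges, and the function names `encode_location`, `decode_location`, and `assign_targets` are all assumptions introduced for this example. It shows how a face can be regressed directly from a feature-map location without anchor boxes, and how faces of different sizes can be routed to different feature maps.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]   # (x1, y1, x2, y2) in image pixels
Dist = Tuple[float, float, float, float]  # (left, top, right, bottom) distances

# Hypothetical values: the abstract does not state strides or size ranges.
STRIDES = [8, 16, 32]
SIZE_RANGES = [(0.0, 64.0), (64.0, 128.0), (128.0, float("inf"))]


def encode_location(x: float, y: float, box: Box) -> Optional[Dist]:
    """Distances from point (x, y) to the four box sides, or None if outside."""
    x1, y1, x2, y2 = box
    dist = (x - x1, y - y1, x2 - x, y2 - y)
    return dist if min(dist) > 0 else None


def decode_location(x: float, y: float, dist: Dist) -> Box:
    """Inverse of encode_location: recover the box from a point and distances."""
    left, top, right, bottom = dist
    return (x - left, y - top, x + right, y + bottom)


def assign_targets(height: int, width: int,
                   boxes: List[Box]) -> List[Dict[Tuple[int, int], Dist]]:
    """For each feature level, map foreground cells (row, col) to their targets.

    A cell is foreground on a level if its centre lies inside a face box and
    the largest of its four distances falls in that level's size range.
    """
    targets = []
    for stride, (low, high) in zip(STRIDES, SIZE_RANGES):
        level: Dict[Tuple[int, int], Dist] = {}
        for row in range(height // stride):
            for col in range(width // stride):
                # Centre of this feature-map cell in image coordinates.
                x = col * stride + stride / 2
                y = row * stride + stride / 2
                for box in boxes:
                    dist = encode_location(x, y, box)
                    if dist is not None and low < max(dist) <= high:
                        level[(row, col)] = dist
                        break  # keep the first matching face for this cell
        targets.append(level)
    return targets


if __name__ == "__main__":
    face = (40.0, 30.0, 80.0, 90.0)
    per_level = assign_targets(128, 128, [face])
    for stride, level in zip(STRIDES, per_level):
        print(f"stride {stride}: {len(level)} foreground locations")
```

In this sketch no anchor shapes, aspect ratios, or IoU matching thresholds are needed; the only level-related settings are the assumed size ranges. A detector built this way would typically add a classification branch and a regression loss, which are outside what the abstract describes.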

Cited by

Automatic Detection System of Underground Pipe Using 3D GPR Exploration Data and Deep Convolutional Neural Networks, vol.26, no.2, 2021, https://doi.org/10.9708/jksci.2021.26.02.027
