http://dx.doi.org/10.3837/tiis.2019.11.015

Facial Action Unit Detection with Multilayer Fused Multi-Task and Multi-Label Deep Learning Network

He, Jun (College of Information Science and Technology, Beijing Normal University)
Li, Dongliang (College of Information Science and Technology, Beijing Normal University)
Sun, Bo (College of Information Science and Technology, Beijing Normal University)
Yu, Lejun (College of Information Science and Technology, Beijing Normal University)

Publication Information

KSII Transactions on Internet and Information Systems (TIIS) / v.13, no.11, 2019, pp. 5546-5559

Abstract

Facial action units (AUs) have recently drawn increased attention because they can be used to recognize facial expressions. A variety of methods have been designed for frontal-view AU detection, but few can handle multi-view face images. In this paper, we propose a method for multi-view facial AU detection using a multilayer fused, multi-task, multi-label deep learning network. The network performs two tasks: AU detection, which is a multi-label problem, and facial view detection, which is a single-label problem. A residual network and multilayer fusion are applied to obtain more representative features. On FERA 2017, the F1 score is 13.1% higher than the baseline, and the facial view recognition accuracy is 0.991. These results show that our multi-task, multi-label model achieves good performance on both tasks.
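
The distinction between the multi-label AU task and the single-label view task can be illustrated with a small sketch. This is not the authors' implementation: it assumes, as is common practice, that the AU head is trained with an independent sigmoid cross-entropy per AU, that the view head is trained with a softmax cross-entropy, and that the two losses are summed. The function names, the `view_weight` parameter, and the sizes in the example (10 AUs, 9 views) are hypothetical choices for illustration.

```python
import math


def multilabel_bce(logits, labels):
    """Mean sigmoid cross-entropy over AU logits (multi-label task).

    Each AU is an independent yes/no decision, so several AUs can be
    active at once. Uses the numerically stable form
    max(z, 0) - z*y + log(1 + exp(-|z|)).
    """
    total = 0.0
    for z, y in zip(logits, labels):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def softmax_ce(logits, target):
    """Softmax cross-entropy over view logits (single-label task).

    Exactly one view class is correct, so the classes compete.
    """
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


def multitask_loss(au_logits, au_labels, view_logits, view_label,
                   view_weight=1.0):
    """Weighted sum of the two task losses (view_weight is an assumption)."""
    return (multilabel_bce(au_logits, au_labels)
            + view_weight * softmax_ce(view_logits, view_label))


if __name__ == "__main__":
    # Hypothetical head outputs for one image: 10 AU logits, 9 view logits.
    au_logits = [2.0, -1.5, 0.3, -3.0, 1.2, -0.7, 0.0, -2.2, 0.9, -1.1]
    au_labels = [1, 0, 1, 0, 1, 0, 0, 0, 1, 0]  # several AUs active at once
    view_logits = [0.1, 0.2, 3.0, 0.0, -1.0, 0.4, 0.3, -0.5, 0.2]
    view_label = 2                               # exactly one view is correct
    print(multitask_loss(au_logits, au_labels, view_logits, view_label))
```

In this sketch both heads would share one feature extractor, so the summed loss lets gradients from the view task shape the features used for AU detection.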

Keywords

facial action unit; multi-task learning; multi-label learning; multilayer fusion; deep learning

References

1	M. Pantic, L. J. M. Rothkrantz, "Expert System for Automatic Analysis of Facial Expressions," Image and Vision Computing, 18(11), 881-905, 2000.
2	Ekman P., "An argument for basic emotions," Cognition & Emotion, 6(3-4), 169-200, 1992.
3	Ekman P, Friesen W, "Facial Action Coding System," Facial Action Coding System (FACS), 1978.
4	He, K., Zhang, X., Ren, S., Sun, J., "Deep residual learning for image recognition," in Proc. of CVPR, 2016.
5	Michel F. Valstar, Enrique Sanchez-Lozano, Jeff F. Cohn, Laszlo A. Jeni, Jeff M. Girard, Lijun Yin, Zheng Zhang, Maja Pantic, "FERA 2017 - Addressing Head Pose in the Third Facial Expression Recognition and Analysis Challenge," in Proc. of 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), 2017.
6	Michel F. Valstar, T. Almaev, Jeff M. Girard, G. McKeown, M. Mehu, Lijun Yin, & Jeff F. Cohn, "FERA 2015 - Second Facial Expression Recognition and Analysis Challenge," in Proc. of 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), 2015.
7	Y. Wu and Q. Ji, "Constrained joint cascade regression framework for simultaneous facial action unit recognition and facial landmark detection," in Proc. of Computer Vision and Pattern Recognition (CVPR), 2016.
8	Dapogny, A., Bailly, K., & Dubuisson, S., "Multi-Output Random Forests for Facial Action Unit Detection," in Proc. of 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pp. 135-140, May 2017.
9	Hao, L., Wang, S., Peng, G., & Ji, Q., "Facial action unit recognition augmented by their dependencies," in Proc. of 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), pp. 187-194, May 2018.
10	Wang, S., Peng, G., Chen, S., & Ji, Q., "Weakly Supervised Facial Action Unit Recognition with Domain Knowledge," IEEE Transactions on Cybernetics, 48(11), 3265-3276, 2018.
11	W. Zhang, S. Shan, W. Gao, X. Chen, and H. Zhang, "Local Gabor binary pattern histogram sequence (LGBPHS): A novel non-statistical model for face representation and recognition," in Proc. of IEEE International Conference on Computer Vision, vol. 1, pp. 786-791, 2005.
12	N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," Computer Vision and Pattern Recognition, vol. 1, pp. 886-893, 2005.
13	Thibaud Senechal, Vincent Rapp, Hanan Salam, Renaud Seguier, Kevin Bailly, and Lionel Prevost, "Facial Action Recognition Combining Heterogeneous Features via Multikernel Learning," Systems Man and Cybernetics Part B: Cybernetics, 42(4), 993-1005, 2012.
14	Z. Ming, A. Bugeau, J.-L. Rouas, T. Shochi, "Facial action units intensity estimation by the fusion of features with multi-kernel support vector machine," in Proc. of 2015 11th IEEE International Conference and Workshops on Automatic Face and Gesture Recognition (FG), vol. 6, pp. 1-6, 2015.
15	A. Gudi, H. E. Tasli, T. M. den Uyl, and A. Maroulis, "Deep learning based facs action unit occurrence and intensity estimation," in Proc. of Facial Expression Recognition and Analysis Challenge, conjunction with IEEE Int'l Conf. on Face and Gesture Recognition, 2015.
16	Shashank Jaiswal, Michel Valstar, "Deep Learning the Dynamic Appearance and Shape of Facial Action Units," in Proc. of WACV, 2016.
17	Li, W., Abtahi, F., & Zhu, Z., "Action unit detection with region adaptation, multi-labeling learning and optimal temporal fusing," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, pp. 1841-1850, 2017.
18	K. Zhao, W.-S. Chu, and H. Zhang, "Deep region and multi-label learning for facial action unit detection," in Proc. of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3391-3399, 2016.
19	S. Ghosh, E. Laksana, S. Scherer, and L.-P. Morency, "A multi-label convolutional neural network approach to cross-domain action unit detection," Affective Computing and Intelligent Interaction (ACII), 2015.
20	Walecki, R., Pavlovic, V., Schuller, B., & Pantic, M, "Deep structured learning for facial action unit intensity estimation," in Proc. of IEEE Conference on Computer Vision and Pattern Recognition, pp. 3405-3414, 2017.
21	Z. Toser, L. A. Jeni, A. Lorincz, J. F. Cohn, "Deep Learning for Facial Action Unit Detection Under Large Head Poses," in Proc. of Computer Vision - ECCV 2016 Workshops, pp. 359-371, 2016.
22	Li, X., Chen, S., & Jin, Q., "Facial action units detection with multi-features and -AUs fusion," in Proc. of 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pp. 860-865, May 2017.
23	Batista, J. C., Albiero, V., Bellon, O. R., & Silva, L., "Aumpnet: simultaneous action units detection and intensity estimation on multipose facial images using a single convolutional neural network," in Proc. of 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pp. 866-871, May 2017.
24	He, J., Li, D., Yang, B., Cao, S., Sun, B., & Yu, L, "Multi view facial action unit detection based on cnn and blstm-rnn," in Proc. of 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pp. 848-853, May 2017.
25	Cha Zhang and Zhengyou Zhang, "Improving Multiview Face Detection with Multi-Task Deep Convolutional Neural Networks," in Proc. of WACV, 2014.
26	Tang, C., Zheng, W., Yan, J., Li, Q., Li, Y., Zhang, T., & Cui, Z, "View-independent facial action unit detection," in Proc. of 2017 12th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2017), pp. 878-882, May 2017.
27	Zhanpeng Zhang, Ping Luo, Chen Change Loy and Xiaoou Tang, "Facial Landmark Detection by Deep Multi-task Learning," in Proc. of ECCV, pp. 94-108, 2014.
28	Junho Yim, Heechul Jung, ByungIn Yoo, Changkyu Choi, Dusik Park and Junmo Kim, "Rotating Your Face Using Multi-task Deep Neural Network," in Proc. of CVPR, 2015.
29	Nitish Srivastava, "Improving neural networks with dropout," Ph.D. thesis, University of Toronto, 2013.
30	Zhu, Yi, and S. Newsam, "Efficient Action Detection in Untrimmed Videos via Multi-task Learning," in Proc. of Applications of Computer Vision IEEE, 2017.
31	Xing Zhang, Lijun Yin, Jeff Cohn, Shaun Canavan, Michael Reale, Andy Horowitz, Peng Liu, and Jeff Girard, "BP4D-Spontaneous: A high resolution spontaneous 3D dynamic facial expression database," Image and Vision Computing, 32(10), pp. 692-706, 2014. (special issue of the Best of FG13).
32	Zheng Zhang, Jeff Girard, Yue Wu, Xing Zhang, Peng Liu, Umur Ciftci, Shaun Canavan, Michael Reale, Andy Horowitz, Huiyuan Yang, Jeff Cohn, Qiang Ji, and Lijun Yin, "Multimodal Spontaneous Emotion Corpus for Human Behavior Analysis," in Proc. of CVPR, 2016.