http://dx.doi.org/10.3745/JIPS.01.0067

Video Expression Recognition Method Based on Spatiotemporal Recurrent Neural Network and Feature Fusion  

Zhou, Xuan (Dept. of Information Technology Center, Hangzhou Normal University Qianjiang College)
Publication Information
Journal of Information Processing Systems / v.17, no.2, 2021, pp. 337-351
Abstract
Automatically recognizing facial expressions in video sequences is a challenging task because there is little direct correlation between facial features in video and subjective emotions. To address this problem, a video facial expression recognition method based on a spatiotemporal recurrent neural network and feature fusion is proposed. First, the video is preprocessed, and a double-layer cascade structure is used to detect the face in each video image. Next, two deep convolutional neural networks extract the temporal-domain and spatial-domain facial features from the video. The spatial convolutional neural network extracts spatial features from each static expression frame, while the temporal convolutional neural network extracts dynamic features from the optical flow computed across multiple expression frames. The spatiotemporal features learned by the two networks are then combined by multiplicative fusion. Finally, the fused features are fed to a support vector machine to perform facial expression classification. Experimental results on the eNTERFACE, RML, and AFEW6.0 datasets show that the proposed method achieves recognition rates of 88.67%, 70.32%, and 63.84%, respectively. Comparative experiments show that the proposed method achieves higher recognition accuracy than other recently reported methods.
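The fusion step described in the abstract can be illustrated with a minimal sketch. It assumes that "multiplication fusion" means an element-wise product of two equal-length feature vectors, one from the spatial network and one from the temporal network. The abstract does not give the exact formulation, so the function names, the toy feature values, and the L2 normalization step are all assumptions made for illustration, not the author's implementation.

```python
import math
from typing import List, Sequence


def l2_normalize(vec: Sequence[float], eps: float = 1e-12) -> List[float]:
    """Scale a vector to (approximately) unit Euclidean length.

    Normalization is an assumption of this sketch; the abstract does not
    say whether the fused features are normalized before classification.
    """
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / (norm + eps) for x in vec]


def multiplicative_fusion(
    spatial: Sequence[float], temporal: Sequence[float]
) -> List[float]:
    """Fuse two equal-length feature vectors by element-wise product."""
    if len(spatial) != len(temporal):
        raise ValueError("spatial and temporal features must have the same length")
    return [s * t for s, t in zip(spatial, temporal)]


# Hypothetical 4-dimensional features for one video clip (illustrative values only).
spatial_feat = [0.5, 1.0, 0.0, 2.0]
temporal_feat = [2.0, 0.5, 3.0, 1.0]

fused = multiplicative_fusion(spatial_feat, temporal_feat)
fused_unit = l2_normalize(fused)
```

In a full pipeline, one fused vector per video clip would then be passed to a support vector machine classifier (for example, scikit-learn's `SVC`) together with its expression label. That step is omitted here to keep the sketch dependency-free.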
Keywords
Double Layer Cascade Structure; Facial Expression Recognition; Feature Fusion; Image Detection; Spatiotemporal Recursive Neural Network