Video Expression Recognition Method Based on Spatiotemporal Recurrent Neural Network and Feature Fusion

  • Zhou, Xuan (Dept. of Information Technology Center, Hangzhou Normal University Qianjiang College)
  • Received : 2020.04.09
  • Accepted : 2020.06.17
  • Published : 2021.04.30

Abstract

Automatically recognizing facial expressions in video sequences is a challenging task because there is little direct correlation between facial features and subjective emotions in video. To address this problem, a video facial expression recognition method using a spatiotemporal recurrent neural network and feature fusion is proposed. First, the video is preprocessed. Then, a double-layer cascade structure is used to detect the face in each video image. Next, two deep convolutional neural networks are used to extract the temporal and spatial facial features in the video. The spatial convolutional neural network extracts spatial features from each static expression frame in the video. The temporal convolutional neural network extracts dynamic features from the optical flow computed over multiple expression frames in the video. The spatial and temporal features learned by the two deep convolutional neural networks are then combined by multiplicative fusion. Finally, the fused features are input to a support vector machine to perform the facial expression classification task. Experimental results on the eNTERFACE, RML, and AFEW 6.0 datasets show that the proposed method reaches recognition rates of 88.67%, 70.32%, and 63.84%, respectively. Comparative experiments show that the proposed method obtains higher recognition accuracy than other recently reported methods.
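The fusion step described in the abstract can be illustrated with a minimal sketch. The abstract states only that the spatial and temporal features are multiplied. The function names, the L2 normalization applied before fusion, and the 4-dimensional toy vectors below are assumptions made for illustration, not details from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit Euclidean length (assumed preprocessing step).

    A zero vector is returned unchanged to avoid division by zero.
    """
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def multiplicative_fusion(spatial, temporal):
    """Fuse two equal-length feature vectors by element-wise multiplication."""
    if len(spatial) != len(temporal):
        raise ValueError("spatial and temporal features must have the same length")
    s = l2_normalize(spatial)
    t = l2_normalize(temporal)
    return [a * b for a, b in zip(s, t)]


# Hypothetical toy features standing in for the outputs of the two CNNs.
spatial_feat = [3.0, 4.0, 0.0, 0.0]   # from the spatial (single-frame) network
temporal_feat = [0.0, 2.0, 0.0, 0.0]  # from the temporal (optical-flow) network

fused = multiplicative_fusion(spatial_feat, temporal_feat)
print(fused)
```

Under these assumptions, a dimension contributes to the fused vector only when both streams respond to it, which is one plausible motivation for multiplying rather than concatenating. The fused vectors would then be passed to a support vector machine for classification, for example one trained with scikit-learn's `sklearn.svm.SVC`.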
