Analysis of facial expression recognition

A study on facial expression classification

  • Son, Nayeong (Department of Statistics, Ewha Womans University)
  • Cho, Hyunsun (Department of Statistics, Ewha Womans University)
  • Lee, Sohyun (Department of Statistics, Ewha Womans University)
  • Song, Jongwoo (Department of Statistics, Ewha Womans University)
  • Received : 2017.11.02
  • Accepted : 2018.09.04
  • Published : 2018.10.31

Abstract

Effective interaction between user and device is considered an important capability of IoT devices. For some applications, it is necessary to recognize human facial expressions in real time and make accurate judgments in order to respond to situations correctly. Therefore, much research on facial image analysis has been conducted to build more accurate and faster recognition systems. In this study, we constructed an automatic facial expression recognition system with two steps: a face recognition step and a classification step. We compared various models on different feature sets built from pixel information, landmark coordinates, Euclidean distances among landmark points, and arctangent angles. We found a fast and efficient prediction model that uses only 30 principal components of the face landmark information. We applied several prediction models, including linear discriminant analysis (LDA), random forest, support vector machine (SVM), and bagging; the SVM model gave the best result. The LDA model gave the second-best prediction accuracy, but it can fit and predict faster than SVM and the other methods. Finally, we compared our method to the Microsoft Azure Emotion API and a convolutional neural network (CNN); our method gives a very competitive result.
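As an illustration of the landmark-derived features mentioned above, the following R sketch builds Euclidean-distance and arctangent-angle features from one face's landmark coordinates. It is a minimal sketch, not the authors' code: lm_x and lm_y are hypothetical vectors of the 68 landmark x and y coordinates, and measuring distances and angles relative to the landmark centroid is one common convention that may differ in detail from the features used in the paper.

    # Minimal illustrative sketch (not the authors' code).
    # lm_x, lm_y: hypothetical numeric vectors of the x and y coordinates
    # of the 68 face landmark points detected for a single image.
    landmark_features <- function(lm_x, lm_y) {
      cx <- mean(lm_x)                              # centroid x
      cy <- mean(lm_y)                              # centroid y
      dist  <- sqrt((lm_x - cx)^2 + (lm_y - cy)^2)  # Euclidean distance to centroid
      angle <- atan2(lm_y - cy, lm_x - cx)          # arctangent angle to centroid
      c(lm_x, lm_y, dist, angle)                    # one feature row per image
    }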

Recently emerging IoT devices and context-aware artificial intelligence place great importance on the interaction between users and devices. In particular, to respond appropriately to situations involving people, a system needs to recognize human facial expressions in real time and make fast, accurate judgments. Accordingly, many studies on facial image analysis have been carried out to build faster and more accurate expression recognition systems. In this study, we used the 48×48 8-bit grayscale image dataset provided on the Kaggle website to build an automatic facial expression recognition system that works in two steps, face recognition followed by expression classification, and we compared the data and methodology with those of previous studies. The analysis shows that applying principal component analysis to the face landmark information yields a fast and efficient prediction model with only 30 principal components. Among LDA, random forest, SVM, and bagging, the SVM method gives the highest accuracy, and LDA gives the second-highest accuracy while fitting and predicting very quickly.
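The modeling step described above, principal component analysis on the landmark features followed by a classifier, could look roughly like the following R sketch. It is a hedged illustration rather than the published analysis: feat is a hypothetical data frame of landmark-derived features, expr is a hypothetical factor of expression labels, and the radial SVM kernel and the 80/20 train/test split are assumptions; e1071 and MASS are standard R packages used here for the SVM and LDA fits.

    # Minimal illustrative sketch (not the published analysis).
    # feat: hypothetical data frame of landmark-derived features (one row per image)
    # expr: hypothetical factor of expression labels for the same images
    library(e1071)   # svm()
    library(MASS)    # lda()

    set.seed(1)
    train <- sample(nrow(feat), floor(0.8 * nrow(feat)))    # assumed 80/20 split

    # PCA on the training features; keep only the first 30 principal components
    pc    <- prcomp(feat[train, ], center = TRUE, scale. = TRUE)
    tr_pc <- as.data.frame(pc$x[, 1:30])
    te_pc <- as.data.frame(predict(pc, feat[-train, ])[, 1:30])

    # SVM (best accuracy reported) and LDA (second best, but much faster to fit)
    svm_fit <- svm(x = tr_pc, y = expr[train], kernel = "radial")
    lda_fit <- lda(tr_pc, grouping = expr[train])

    mean(predict(svm_fit, te_pc) == expr[-train])            # SVM test accuracy
    mean(predict(lda_fit, te_pc)$class == expr[-train])      # LDA test accuracy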

