Head Pose Estimation with Accumulated Histogram and Random Forest

누적 히스토그램과 랜덤 포레스트를 이용한 머리방향 추정

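  • 문성희 (Department of Electronics and Computer Engineering, Chonnam National University) ;
  • 이칠우 (Department of Electronics and Computer Engineering, Chonnam National University)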
  • Received : 2015.11.25
  • Accepted : 2016.03.29
  • Published : 2016.03.31

Abstract

As smart environments spread into everyday living spaces, the need for Human-Computer Interaction (HCI) techniques is increasing. One such technique is head pose estimation. It is closely related to gaze direction estimation, because the head and the eyes are linked by the structure of the body. Head pose is a key factor in identifying a person's intention or target of interest, which makes it an essential research topic in HCI. In this paper, we propose an approach that classifies head pose into several predefined directions using a random forest classifier. To extract the rotation information of an input image, we compute the difference image between the input image and an averaged frontal face image, and apply the Canny edge detector to it to extract features. From the resulting binary edge image, we build two accumulated histograms by counting the number of non-zero pixels along each of the two axes. These two accumulated histograms serve as the feature of the face image. We trained and tested the random forest classifier on the CAS-PEAL-R1 dataset and obtained an accuracy of 80.6%.

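As smart environments become commonplace, research on human-computer interaction (HCI) is being actively pursued. In HCI, knowing the direction of a person's face and gaze provides important information for identifying that person's intention or object of interest, and can also serve as a reference for understanding body structure, so it is an important research theme. In this paper, we propose a method that uses a random forest to classify face direction into predefined angles. The image is first preprocessed, and rotation information is then extracted from the difference image between it and an average frontal face. Facial features are detected with the Canny edge detector to obtain an edge image, and histograms are built by accumulating the pixel counts along the horizontal and vertical axes of this image. A random forest was built with the accumulated histograms as features; training and testing on the CAS-PEAL-R1 data yielded a recognition rate of 80.6%.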
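
The accumulated-histogram feature described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the binary edge image is already available as a list of rows, and the names `axis_histograms` and `feature_vector` are hypothetical. The order in which the two histograms are concatenated is also an assumption, since the abstract does not specify it. The Canny edge detection and random forest training steps are left out and would normally come from an image-processing and a machine-learning library.

```python
from typing import List, Sequence, Tuple

# A binary edge image: a sequence of rows, each a sequence of pixel values.
Image = Sequence[Sequence[int]]


def axis_histograms(edge: Image) -> Tuple[List[int], List[int]]:
    """Count non-zero pixels in every row and every column of an edge image."""
    if not edge:
        return [], []
    width = len(edge[0])
    # One count per row (projection onto the vertical axis).
    row_counts = [sum(1 for v in row if v != 0) for row in edge]
    # One count per column (projection onto the horizontal axis).
    col_counts = [sum(1 for row in edge if row[x] != 0) for x in range(width)]
    return row_counts, col_counts


def feature_vector(edge: Image) -> List[int]:
    """Concatenate column and row counts (ordering is an assumption)."""
    row_counts, col_counts = axis_histograms(edge)
    return col_counts + row_counts


# Toy 4x5 edge map; 255 marks an edge pixel.
edge = [
    [0, 255, 0, 0, 0],
    [0, 255, 255, 0, 0],
    [0, 0, 255, 0, 0],
    [0, 0, 255, 255, 0],
]

rows, cols = axis_histograms(edge)
print(rows)  # [1, 2, 1, 2]
print(cols)  # [0, 2, 3, 1, 0]
```

A vector like the one returned by `feature_vector` would then be the input to the classifier, one vector per face image, with the predefined head direction as the class label.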
