Real-Time Hand Pose Tracking and Finger Action Recognition Based on 3D Hand Modeling

3차원 손 모델링 기반의 실시간 손 포즈 추적 및 손가락 동작 인식

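  • Heung-Il Suk (Department of Computer Science, Korea University) ;
  • Ji-Hong Lee (Department of Computer Science, Korea University) ;
  • Seong-Whan Lee (Division of Computer and Communication Engineering, Korea University)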
  • Published : 2008.12.15

Abstract

Modeling hand poses and tracking their movement is one of the challenging problems in computer vision. There are two typical approaches to reconstructing hand poses in 3D, distinguished by the number of cameras used to capture the images: one captures images from multiple cameras or a stereo camera, and the other captures images from a single camera. The former approach is relatively limited because of the environmental constraints involved in setting up multiple cameras. In this paper, we propose a method that reconstructs 3D hand poses from a 2D input image sequence captured by a single camera, using Belief Propagation in a graphical model, and that recognizes a finger clicking motion using a hidden Markov model. We define a graphical model whose hidden nodes represent the joints of a hand and whose observable nodes represent the features extracted from the 2D input image sequence. To track hand poses in 3D, we use a Belief Propagation algorithm, which provides a robust and unified framework for inference in a graphical model. From the estimated 3D hand pose, we extract information about each finger's motion, which is then fed into a hidden Markov model. So that finger actions can remain natural, we consider the movements of all the fingers when recognizing a single finger's action. We applied the proposed method to a virtual keypad system, and it achieved a high recognition rate of 94.66% on 300 test samples.
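
The inference step described above can be illustrated with a minimal sketch of sum-product Belief Propagation on a chain of hidden nodes. This is not the paper's implementation: it assumes each joint takes one of a few discrete states (the paper's state representation is not given here), a single finger modeled as a chain, and one compatibility table shared by all edges. The names `normalize`, `chain_belief_propagation`, `unary`, and `pairwise`, and all numeric values, are hypothetical.

```python
def normalize(values):
    """Scale non-negative numbers so they sum to 1."""
    total = sum(values)
    return [v / total for v in values]


def chain_belief_propagation(unary, pairwise):
    """Sum-product belief propagation on a chain of discrete hidden nodes.

    unary[i][s]    : how well state s of hidden node i explains its observation
    pairwise[a][b] : compatibility of neighboring nodes being in states a and b
                     (assumed identical for every edge in this sketch)
    Returns one normalized belief (marginal distribution) per hidden node.
    """
    n = len(unary)
    k = len(unary[0])

    # fwd[i] is the message arriving at node i from node i - 1.
    fwd = [[1.0] * k for _ in range(n)]
    for i in range(1, n):
        msg = [
            sum(unary[i - 1][a] * fwd[i - 1][a] * pairwise[a][b] for a in range(k))
            for b in range(k)
        ]
        fwd[i] = normalize(msg)

    # bwd[i] is the message arriving at node i from node i + 1.
    bwd = [[1.0] * k for _ in range(n)]
    for i in range(n - 2, -1, -1):
        msg = [
            sum(unary[i + 1][b] * bwd[i + 1][b] * pairwise[a][b] for b in range(k))
            for a in range(k)
        ]
        bwd[i] = normalize(msg)

    # Belief at a node = local evidence times both incoming messages.
    return [
        normalize([unary[i][s] * fwd[i][s] * bwd[i][s] for s in range(k)])
        for i in range(n)
    ]


# Hypothetical example: three joints of one finger, states 0 = extended, 1 = bent.
unary = [[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]]   # made-up image evidence per joint
pairwise = [[0.8, 0.2], [0.2, 0.8]]            # neighboring joints tend to agree
beliefs = chain_belief_propagation(unary, pairwise)
for joint_index, belief in enumerate(beliefs):
    print(joint_index, belief)
```

On a chain (or any tree) these beliefs equal the exact marginals. A full hand model with continuous joint parameters would need a different message representation, which this sketch does not attempt.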

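Hand pose modeling and tracking is known to be a difficult problem in computer vision. Methods for 3D reconstruction of hand poses fall into two groups according to the number of cameras used: those based on multiple cameras or a stereo camera, and those based on a single camera. The multiple-camera approach is subject to constraints such as installing and synchronizing several cameras. This paper proposes a method that estimates 3D hand poses from 2D input images acquired by a monocular camera, using the Belief Propagation algorithm in a probabilistic graphical model. In addition, finger click actions are recognized with a hidden Markov model (HMM) as the recognizer. We define a probabilistic graphical model in which hidden nodes represent the joint information of the fingers and observation nodes represent the features extracted from the 2D input images. For 3D hand pose tracking, we use the Belief Propagation algorithm on this graphical model. The 3D hand pose is estimated and reconstructed through Belief Propagation, and features describing finger movement are extracted from the reconstructed pose. The extracted information becomes the input to the hidden Markov model. To allow natural finger motion, the movements of several fingers are considered together when recognizing the click action of one finger. When the proposed method was applied to a virtual keypad system, it showed a high recognition rate of 94.66% on 300 video test samples.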
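
The recognition step can be sketched as likelihood comparison between two hidden Markov models using the scaled forward algorithm. This is only an illustration under stated assumptions: the finger-motion features are quantized into three hypothetical symbols (0 = still, 1 = moving down, 2 = moving up), only one finger is considered rather than all fingers together, and every model parameter below is invented for the example rather than taken from the paper. The names `forward_log_likelihood`, `CLICK_MODEL`, `IDLE_MODEL`, and `classify` are hypothetical.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) using the scaled forward algorithm.

    obs   : sequence of observation symbol indices
    start : start[s] is the initial probability of state s
    trans : trans[r][s] is the probability of moving from state r to state s
    emit  : emit[s][o] is the probability that state s emits symbol o
    """
    n = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n)]
    log_likelihood = 0.0
    for t in range(len(obs)):
        if t > 0:
            alpha = [
                sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][obs[t]]
                for s in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        # Rescale to avoid underflow; the log of each scale adds up to log P.
        alpha = [a / scale for a in alpha]
        log_likelihood += math.log(scale)
    return log_likelihood


# Hypothetical left-to-right "click" model: rest -> press -> release.
CLICK_MODEL = {
    "start": [1.0, 0.0, 0.0],
    "trans": [[0.6, 0.4, 0.0], [0.0, 0.6, 0.4], [0.0, 0.0, 1.0]],
    "emit": [[0.8, 0.1, 0.1], [0.1, 0.8, 0.1], [0.1, 0.1, 0.8]],
}
# Hypothetical single-state "idle" model: the finger mostly stays still.
IDLE_MODEL = {
    "start": [1.0],
    "trans": [[1.0]],
    "emit": [[0.8, 0.1, 0.1]],
}


def classify(obs):
    """Label a symbol sequence with the model that gives the higher likelihood."""
    click = forward_log_likelihood(obs, **CLICK_MODEL)
    idle = forward_log_likelihood(obs, **IDLE_MODEL)
    return "click" if click > idle else "idle"


print(classify([0, 1, 1, 2, 2]))
print(classify([0, 0, 0, 0, 0]))
```

In practice the model parameters would be trained from labeled sequences, and the observation vector would combine the motion of several fingers, as the abstract describes.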
