Dynamic Bayesian Network-Based Two-Hand Gesture Recognition

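  • Heung-Il Suk (Department of Computer Engineering, Pukyong National University);
  • Bong-Kee Sin (Division of Computer and Multimedia Engineering, Pukyong National University)
  • Published: 2008.04.15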


Hand gestures have been studied intensively as a means of human-computer interaction over the last decade. Despite considerable progress, the results still fall short of expectations. This paper describes a dynamic Bayesian network (DBN) based approach to recognizing both two-hand and one-hand gestures. Unlike wired-glove approaches, the success of camera-based methods depends greatly on the results of image processing and feature extraction. Inference in the proposed gesture model is therefore preceded by preprocessing steps: skin color modeling and detection, followed by motion tracking. We then propose a new recognition model for a set of both one-hand and two-hand gestures, built on the DBN framework, which makes it easy to represent the relationships among features and to incorporate new information into a model. In a cross-validation experiment with ten isolated gestures, the recognition rate reached a maximum of 99.59%. The proposed model and the related methods are expected to apply to other related problems such as sign language recognition.
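As a rough illustration of isolated-gesture recognition by model likelihood, the sketch below scores an observation sequence against one model per gesture and picks the best-scoring one. It is not the model proposed in the paper: it uses a discrete hidden Markov model, which is the simplest special case of a DBN, and every name, gesture label, symbol coding, and probability in it is a hypothetical toy value chosen for the example.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm for a discrete HMM.

    Returns log P(obs | model). `start[i]` is the initial state probability,
    `trans[i][j]` the transition probability i -> j, and `emit[i][k]` the
    probability that state i emits symbol k.
    """
    if not obs:
        return 0.0  # an empty sequence has probability 1
    n_states = len(start)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate the normalized forward variable one time step.
            alpha = [
                emit[j][obs[t]] * sum(alpha[i] * trans[i][j] for i in range(n_states))
                for j in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_lik += math.log(scale)
    return log_lik

def classify(obs, models):
    """Return the name of the gesture model with the highest log-likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))

# Hypothetical toy models. Symbols are an assumed quantization of hand motion:
# 0 = moving left, 1 = nearly still, 2 = moving right.
STICKY = [[0.9, 0.1], [0.1, 0.9]]
MODELS = {
    "swipe_left": ([0.5, 0.5], STICKY, [[0.8, 0.15, 0.05], [0.6, 0.3, 0.1]]),
    "swipe_right": ([0.5, 0.5], STICKY, [[0.05, 0.15, 0.8], [0.1, 0.3, 0.6]]),
}

if __name__ == "__main__":
    print(classify([2, 2, 1, 2], MODELS))
    print(classify([0, 0, 1, 0], MODELS))
```

In a camera-based system such as the one described above, the observation symbols would come from the skin detection and motion tracking stages rather than being typed in by hand. A two-hand model would also need to represent the relationship between the features of both hands, which a single-chain HMM like this one does not do.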


