Sign Language Spotting Based on Semi-Markov Conditional Random Field

Cho, Seong-Sik; Lee, Seong-Whan

Journal of KIISE:Software and Applications (한국정보과학회논문지:소프트웨어및응용)

Volume 36, Issue 12 / Pages 1034-1037 / 2009 / 1229-6848 (pISSN)

Korean Institute of Information Scientists and Engineers (한국정보과학회)

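Seong-Sik Cho (Department of Computer Science, Korea University);
Seong-Whan Lee (Division of Computer and Communication Engineering, Korea University)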

Published: 2009.12.15

Abstract

Sign language spotting is the task of detecting the start and end points of signs in continuous data and recognizing each detected sign as a word in a predefined vocabulary. The task is difficult because instances of a sign vary in both hand motion and hand shape, and the motion itself varies in both trajectory and length. Variable sign length is especially problematic when spotting signs in a video sequence: short signs carry less information and fewer changes than long signs, which makes it hard to extract the information needed to recognize them. In this paper, we propose a method for spotting variable-length signs based on the semi-CRF (semi-Markov Conditional Random Field), which can reflect the characteristics of input data of varying length and is therefore robust to variation in sign length. We evaluated spotting performance on ASL (American Sign Language) and KSL (Korean Sign Language) datasets of continuous sign sentences. Experimental results show that the proposed method outperforms both the HMM (Hidden Markov Model) and the CRF (Conditional Random Field).
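The abstract does not describe the model in detail, so the following is only a minimal sketch of how a generic semi-Markov decoder handles variable-length segments, not the authors' implementation. It scores whole segments instead of single frames and searches over every segment length up to a maximum. The function `semi_markov_viterbi`, the scoring callbacks `segment_score` and `transition_score`, the labels `sign` and `none`, and the toy one-dimensional frames are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Segment = Tuple[int, int, str]  # (start, end_exclusive, label)


def semi_markov_viterbi(
    n_frames: int,
    labels: List[str],
    max_len: int,
    segment_score: Callable[[int, int, str], float],
    transition_score: Callable[[str, str], float] = lambda prev, cur: 0.0,
) -> Tuple[float, List[Segment]]:
    """Return the best-scoring segmentation of frames [0, n_frames)."""
    if n_frames == 0:
        return 0.0, []
    neg_inf = float("-inf")
    # best[t][y]: best score of a segmentation of frames [0, t)
    # whose last segment carries label y.
    best: List[Dict[str, float]] = [
        {y: neg_inf for y in labels} for _ in range(n_frames + 1)
    ]
    # back[t][y]: (start of last segment, label of the segment before it).
    back: List[Dict[str, Optional[Tuple[int, Optional[str]]]]] = [
        {y: None for y in labels} for _ in range(n_frames + 1)
    ]
    for t in range(1, n_frames + 1):
        for y in labels:
            # Try every allowed length d for a segment ending at frame t.
            for d in range(1, min(max_len, t) + 1):
                s = t - d
                seg = segment_score(s, t, y)
                if s == 0:
                    # First segment: no previous label to transition from.
                    if seg > best[t][y]:
                        best[t][y] = seg
                        back[t][y] = (0, None)
                    continue
                for prev in labels:
                    if best[s][prev] == neg_inf:
                        continue
                    cand = best[s][prev] + transition_score(prev, y) + seg
                    if cand > best[t][y]:
                        best[t][y] = cand
                        back[t][y] = (s, prev)
    # Follow back-pointers from the best final label to recover segments.
    label: Optional[str] = max(labels, key=lambda y: best[n_frames][y])
    total = best[n_frames][label]
    segments: List[Segment] = []
    t = n_frames
    while t > 0 and label is not None:
        pointer = back[t][label]
        assert pointer is not None
        start, prev_label = pointer
        segments.append((start, t, label))
        t, label = start, prev_label
    segments.reverse()
    return total, segments


# Toy data (hypothetical): values near 1.0 mean "hand is signing".
frames = [0.0, 0.0, 1.0, 1.0, 1.0, 0.0]


def toy_segment_score(start: int, end: int, label: str) -> float:
    # Reward frames that agree with the segment label.
    total = sum(frames[start:end])
    length = end - start
    return total - 0.5 * length if label == "sign" else 0.5 * length - total


def toy_transition_score(prev: str, cur: str) -> float:
    # Penalize splitting one label into two adjacent segments.
    return -1.0 if prev == cur else 0.0


score, segments = semi_markov_viterbi(
    len(frames), ["sign", "none"], 4, toy_segment_score, toy_transition_score
)
print(score, segments)
```

Because the search ranges over segment lengths, a short sign and a long sign are each scored as one unit, which is the property the abstract relies on. In a trained semi-CRF the two callbacks would be weighted feature functions learned from data rather than the hand-written rules used here.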
