Spatial-Temporal Scale-Invariant Human Action Recognition using Motion Gradient Histogram

Kim, Kwang-Soo; Kim, Tae-Hyoung; Kwak, Soo-Yeong; Byun, Hye-Ran

Journal of KIISE: Software and Applications (한국정보과학회논문지:소프트웨어및응용)

Volume 34 Issue 12 / Pages 1075-1082 / 2007 / 1229-6848 (pISSN)

Korean Institute of Information Scientists and Engineers (한국정보과학회)

모션 그래디언트 히스토그램 기반의 시공간 크기 변화에 강인한 동작 인식

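Kim, Kwang-Soo (Hyundai Motor Company, CL Division);
Kim, Tae-Hyoung (LG Electronics, MC Division);
Kwak, Soo-Yeong (Department of Computer Science, Yonsei University);
Byun, Hye-Ran (Department of Computer Science, Yonsei University)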

Published: 2007.12.15

Abstract

In this paper, we propose a method for detecting and individually recognizing the actions of multiple people in a video clip. To be invariant to changes in the speed or size of an action, a Spatial-Temporal Pyramid method is applied. The method keeps the complexity of the recognition procedure low by choosing the Motion Gradient Histogram (MGH), a feature based on a statistical approach, to represent actions. To detect multiple actions, a Motion Energy Image (MEI), formed by accumulating binary frame differences, is adopted to obtain the region of each action, and each region is then represented by an MGH. The MGH of each action is compared with pre-learned template MGHs built with the pyramid method, and recognition is performed by analyzing the similarity between them. Ten video clips are used to evaluate the proposed method. The experiments cover single actions, multiple actions, changes in speed and size scale, and a comparison with a previous method. The results show that the proposed method is simple and efficient at recognizing multiple human actions under scale variations.

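This paper proposes a method that detects the actions of multiple people appearing in a video and recognizes each detected action individually. To make recognition robust to changes in the speed or size at which an action is performed, a Spatial-Temporal Pyramid scheme is applied. The Motion Gradient Histogram (MGH), which is based on statistical properties, is chosen as the action representation in order to minimize the complexity of the recognition process. To detect multiple actions, the Motion Energy Image (MEI) method, which accumulates binary difference images, is applied to obtain the individual action regions efficiently. Each region is represented by an MGH, and the final recognition result is obtained by applying the pyramid scheme for robustness to scale changes and comparing similarity against the learned template MGHs. To evaluate recognition performance, ten videos were used in experiments covering single subjects, multiple subjects, changes in speed and size, comparison with an existing method, and other additional tests, and good recognition results were confirmed on videos under a variety of conditions.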
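
The pipeline summarized in the abstract (binary frame differences accumulated into an MEI, an MGH computed over the detected region, and comparison against pre-learned templates at several temporal scales) can be illustrated with a small pure-Python sketch. This is not the authors' implementation, and several details are assumptions: the abstract does not define the MGH, so here it is taken to be a normalized histogram of gradient orientations over a timestamp image (`mhi`) inside the MEI mask; histogram intersection stands in for the unspecified similarity measure; only the temporal half of the pyramid is shown, by frame skipping; and all function names, the threshold, and the bin count are hypothetical. Separating several people would additionally need something like connected-component labeling on the MEI, which is omitted.

```python
import math

def frame_diff(prev, curr, thresh=30):
    """Binary frame difference: 1 where intensity changes by more than thresh."""
    return [[1 if abs(c - p) > thresh else 0 for p, c in zip(rp, rc)]
            for rp, rc in zip(prev, curr)]

def motion_images(frames, thresh=30):
    """Return (mei, mhi). mei is the union of all binary differences;
    mhi holds the index of the latest difference at each pixel (assumed)."""
    h, w = len(frames[0]), len(frames[0][0])
    mei = [[0] * w for _ in range(h)]
    mhi = [[0] * w for _ in range(h)]
    for t in range(1, len(frames)):
        diff = frame_diff(frames[t - 1], frames[t], thresh)
        for y in range(h):
            for x in range(w):
                if diff[y][x]:
                    mei[y][x] = 1
                    mhi[y][x] = t
    return mei, mhi

def gradient_histogram(img, mask, bins=8):
    """Normalized histogram of gradient orientations of img inside mask."""
    h, w = len(img), len(img[0])
    hist = [0.0] * bins
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            if not mask[y][x]:
                continue
            gx = img[y][x + 1] - img[y][x - 1]  # central differences
            gy = img[y + 1][x] - img[y - 1][x]
            if gx == 0 and gy == 0:
                continue  # no orientation where the image is flat
            angle = math.atan2(gy, gx) % (2 * math.pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += 1.0
    total = sum(hist)
    return [v / total for v in hist] if total else hist

def mgh(frames, thresh=30, bins=8):
    """Assumed MGH: orientation histogram of the mhi within the MEI region."""
    mei, mhi = motion_images(frames, thresh)
    return gradient_histogram(mhi, mei, bins)

def pyramid_mgh(frames, steps=(1, 2), bins=8):
    """One template histogram per temporal level; level s keeps every s-th frame."""
    return [mgh(frames[::s], bins=bins) for s in steps]

def histogram_intersection(a, b):
    """Similarity in [0, 1] for two normalized histograms."""
    return sum(min(x, y) for x, y in zip(a, b))

def best_similarity(query_hist, template_pyramid):
    """Best match of a query histogram over all pyramid levels."""
    return max(histogram_intersection(query_hist, t) for t in template_pyramid)

def make_clip(n_frames=7, h=8, w=14, block_w=3):
    """Synthetic clip: a bright block sliding one pixel to the right per frame."""
    frames = []
    for t in range(n_frames):
        frame = [[0] * w for _ in range(h)]
        for y in range(1, h - 1):
            for x in range(t + 1, t + 1 + block_w):
                frame[y][x] = 255
        frames.append(frame)
    return frames

if __name__ == "__main__":
    clip = make_clip()
    templates = pyramid_mgh(clip, steps=(1, 2))
    fast = mgh(clip[::2])  # the same action performed at twice the speed
    print(best_similarity(fast, templates))
```

In this toy setup a query performed at double speed matches the template's second temporal level, which is the intuition behind comparing against a pyramid of pre-learned histograms rather than a single one. A spatial pyramid would follow the same pattern with resized frames in place of skipped ones.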
