http://dx.doi.org/10.3837/tiis.2017.03.019

Spatial-temporal texture features for 3D human activity recognition using laser-based RGB-D videos  

Ming, Yue (School of Electronic Engineering, Beijing University of Posts and Telecommunications)
Wang, Guangchao (School of Electronic Engineering, Beijing University of Posts and Telecommunications)
Hong, Xiaopeng (Department of Computer Science and Engineering, University of Oulu)
Publication Information
KSII Transactions on Internet and Information Systems (TIIS) / v.11, no.3, 2017, pp. 1595-1613
Abstract
An IR camera paired with a laser-based IR projector provides an effective solution for real-time capture of moving targets in RGB-D videos. Unlike traditional RGB videos, the captured depth videos are not affected by illumination variation. In this paper, we propose a novel feature extraction framework for describing human activities based on this optical video capturing method, namely spatial-temporal texture features for 3D human activity recognition. Spatial-temporal texture features with depth information are insensitive to illumination and occlusions, and are efficient for describing fine motion. The proposed framework begins with video acquisition based on laser projection, followed by video preprocessing with visual background extraction to obtain spatial-temporal key images. Texture features encoded from the key images are then used to generate discriminative features of human activity. Experimental results on different databases and in practical scenarios demonstrate the effectiveness of the proposed algorithm on large-scale data sets.
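The pipeline outlined in the abstract can be pictured with a minimal Python sketch. It is not the authors' implementation, and every specific in it is an assumption made for illustration: simple depth differencing against a fixed background stands in for visual background extraction, a per-pixel union of foreground masks stands in for the spatial-temporal key image, and a basic 3x3 local binary pattern (LBP) histogram stands in for the texture encoding. The function names, the threshold value, and the toy frames are all hypothetical.

```python
def foreground_mask(frame, background, threshold):
    """Mark pixels whose depth differs from the background by more than threshold.
    Assumption: a crude stand-in for a real background-extraction step."""
    return [[1 if abs(p - b) > threshold else 0 for p, b in zip(frow, brow)]
            for frow, brow in zip(frame, background)]


def key_image(frames, background, threshold=10):
    """Collapse a clip into one binary image: 1 where a pixel was foreground in any frame."""
    height, width = len(background), len(background[0])
    key = [[0] * width for _ in range(height)]
    for frame in frames:
        mask = foreground_mask(frame, background, threshold)
        for y in range(height):
            for x in range(width):
                key[y][x] = max(key[y][x], mask[y][x])
    return key


# Offsets (dy, dx) of the 8 neighbours, clockwise starting from the top-left pixel.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_histogram(image):
    """Normalised 256-bin histogram of basic 3x3 LBP codes over interior pixels."""
    height, width = len(image), len(image[0])
    hist = [0] * 256
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            centre = image[y][x]
            code = 0
            for bit, (dy, dx) in enumerate(NEIGHBOURS):
                # Set the bit when the neighbour is at least as large as the centre.
                if image[y + dy][x + dx] >= centre:
                    code |= 1 << bit
            hist[code] += 1
    total = sum(hist)
    return [count / total for count in hist] if total else hist


# Toy clip: a flat 4x4 background at depth 100 and one near pixel that moves diagonally.
background = [[100] * 4 for _ in range(4)]
frame_a = [row[:] for row in background]
frame_a[1][1] = 50
frame_b = [row[:] for row in background]
frame_b[2][2] = 50

key = key_image([frame_a, frame_b], background)
hist = lbp_histogram(key)  # feature vector that a classifier could consume
```

In this sketch the histogram is the feature vector for one clip; comparing such vectors with a nearest-neighbour rule is one plausible way to classify activities, though the abstract does not state which classifier is used.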
Keywords
Spatial-temporal texture features; 3D human activity recognition; RGB-D videos; depth information; Maximum Outline of the History Behavior Binary Image (MOHBBI)