http://dx.doi.org/10.5370/KIEE.2016.65.10.1731

Depth Image-Based Human Action Recognition Using Convolution Neural Network and Spatio-Temporal Templates  

Eum, Hyukmin (Dept. of Electrical and Electronic Engineering, Yonsei University)
Yoon, Changyong (Dept. of Electrical Engineering, Suwon Science College)
Publication Information
The Transactions of The Korean Institute of Electrical Engineers, vol. 65, no. 10, 2016, pp. 1731-1737
Abstract
This paper proposes a method for recognizing human actions as a form of nonverbal expression. The method consists of two steps: action representation and action recognition. First, the action representation step uses a motion history image (MHI); it performs segmentation based on depth information and generates spatio-temporal templates that describe the actions. Second, the action recognition step employs a convolution neural network (CNN) that performs both feature extraction and classification; it extracts convolution feature vectors and then applies a classifier to recognize the actions. Experimental results demonstrate the recognition performance of the proposed method through comparison with other action recognition methods.
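The action representation step can be illustrated with a minimal sketch of the commonly used MHI update rule, in which a pixel is set to a maximum value when motion is detected and otherwise decays toward zero. This is not the authors' implementation: the depth range (`near`, `far`), the frame-differencing threshold (`diff_thresh`), the decay step (`delta`), and the history length (`tau`) are all assumptions introduced here for illustration, and the abstract does not specify how the paper performs segmentation or motion detection.

```python
import numpy as np


def segment_by_depth(depth, near, far):
    """Keep pixels whose depth lies in [near, far]; set the rest to 0.

    A simple range threshold stands in for depth-based segmentation
    (assumption: the paper's actual segmentation is not given here).
    """
    mask = (depth >= near) & (depth <= far)
    return np.where(mask, depth, 0).astype(np.float32)


def motion_history_image(depth_frames, near, far, tau=None,
                         delta=1.0, diff_thresh=10.0):
    """Build one spatio-temporal template from a sequence of depth frames.

    Update rule: H = tau where motion is detected,
    otherwise H = max(H - delta, 0). The result is scaled to [0, 1].
    """
    frames = [segment_by_depth(np.asarray(f, dtype=np.float32), near, far)
              for f in depth_frames]
    if tau is None:
        tau = float(len(frames) - 1)  # default: one step per frame pair

    mhi = np.zeros_like(frames[0])
    for prev, curr in zip(frames, frames[1:]):
        # Motion mask from differencing consecutive segmented frames.
        moving = np.abs(curr - prev) > diff_thresh
        mhi = np.where(moving, tau, np.maximum(mhi - delta, 0.0))

    return mhi / tau if tau > 0 else mhi
```

In the resulting template, recently moving pixels are bright and older motion is darker, so a single image encodes both where and when motion occurred. An image of this kind is what the recognition step would take as input to the CNN.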
Keywords
Human action recognition; Convolution neural network; Spatio-temporal templates; Motion history image; Depth information