http://dx.doi.org/10.5370/JEET.2015.10.3.1264

Human Action Recognition Bases on Local Action Attributes  

Zhang, Jing (School of Electronic Information Engineering, Tianjin University)
Lin, Hong (School of Electronic Information Engineering, Tianjin University)
Nie, Weizhi (School of Electronic Information Engineering, Tianjin University)
Chaisorn, Lekha (SeSaMe Center, Interactive and Digital Media Institute, National University of Singapore)
Wong, Yongkang (SeSaMe Center, Interactive and Digital Media Institute, National University of Singapore)
Kankanhalli, Mohan S (SeSaMe Center, Interactive and Digital Media Institute, National University of Singapore)
Publication Information
Journal of Electrical Engineering and Technology / v.10, no.3, 2015, pp. 1264-1274
Abstract
Human action recognition has received much interest in the computer vision community. Most existing methods focus either on constructing robust descriptors from the temporal domain or on computational methods that exploit the discriminative power of the descriptor. In this paper, we explore the idea of using local action attributes to form an action descriptor, where an action is no longer characterized by motion changes in the temporal domain but by a local semantic description of the action. We propose a novel framework that introduces local action attributes to represent an action for the final human action categorization. The local action attributes are defined for each body part and are independent of the global action. The resulting attribute descriptor is used to jointly model the human action and achieve robust performance. In addition, we study the impact of using local (body-part) and global low-level features for the aforementioned attributes. Experiments on the KTH dataset and the MV-TJU dataset show that our descriptor based on local action attributes improves action recognition performance.
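The pipeline outlined in the abstract (attributes scored per body part, then concatenated into one descriptor for classification) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the body parts, the attribute names, the linear attribute scorers, the toy weights, and the action labels. The final classifier is a nearest-centroid rule standing in for the support vector machine listed in the keywords.

```python
import math

# Hypothetical body parts and local attributes (not the paper's actual lists).
PART_ATTRIBUTES = {
    "arm": ["arm_swing", "arm_raise"],
    "leg": ["leg_stride", "leg_still"],
}

# Toy linear attribute models: attribute name -> (weights, bias). Assumed values.
ATTRIBUTE_MODELS = {
    "arm_swing": ([4.0, 0.0], -2.0),
    "arm_raise": ([0.0, 4.0], -2.0),
    "leg_stride": ([4.0, 0.0], -2.0),
    "leg_still": ([-4.0, 0.0], 2.0),
}

# Toy per-action centroids in attribute space. Assumed values.
CENTROIDS = {
    "walking": [0.9, 0.1, 0.9, 0.1],
    "waving": [0.1, 0.9, 0.1, 0.9],
}


def sigmoid(x):
    """Map a real-valued score to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def attribute_score(weights, bias, feature):
    """Score one attribute from one body part's low-level feature vector."""
    return sigmoid(sum(w * f for w, f in zip(weights, feature)) + bias)


def attribute_descriptor(part_features, attribute_models):
    """Concatenate per-part attribute scores in a fixed (sorted-part) order."""
    descriptor = []
    for part in sorted(PART_ATTRIBUTES):
        for attr in PART_ATTRIBUTES[part]:
            weights, bias = attribute_models[attr]
            descriptor.append(attribute_score(weights, bias, part_features[part]))
    return descriptor


def nearest_centroid(descriptor, centroids):
    """Stand-in for the final classifier: pick the closest action centroid."""
    return min(centroids, key=lambda label: math.dist(descriptor, centroids[label]))


# Usage: one clip described by a toy 2-D feature vector per body part.
clip = {"arm": [1.0, 0.0], "leg": [1.0, 0.0]}
descriptor = attribute_descriptor(clip, ATTRIBUTE_MODELS)
print(nearest_centroid(descriptor, CENTROIDS))
```

Because each attribute is scored from its own body part's features, the descriptor does not depend on which global action is being performed. Only the last step maps the attribute vector to an action label.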
Keywords
Human action recognition; Action attributes; Support vector machine;