http://dx.doi.org/10.3837/tiis.2019.02.024

Optimised ML-based System Model for Adult-Child Actions Recognition

Alhammami, Muhammad (Faculty of Engineering, Multimedia University)
Hammami, Samir Marwan (Department of Management Information, Dhofar University)
Ooi, Chee-Pun (Faculty of Engineering, Multimedia University)
Tan, Wooi-Haw (Faculty of Engineering, Multimedia University)

Publication Information

KSII Transactions on Internet and Information Systems (TIIS) / v.13, no.2, 2019, pp. 929-944

Abstract

Many critical applications require accurate, real-time human action recognition. However, capturing and pre-processing image data, calculating features, and classifying actions all consume significant storage and computation resources. To circumvent these hurdles, this paper presents a machine learning (ML) based recognition system model that uses features with a reduced data structure, obtained by projecting the real 3D skeleton modality onto a virtual 2D space. The proposed ML model is tested on the MMU VAAC dataset. The results show a high accuracy rate of 97.88%, only slightly lower than the accuracy obtained with the original 3D modality-based features, while achieving a 75% reduction ratio relative to using the RGB modality. These results motivate implementing the proposed recognition model on an embedded system platform in the future.
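The idea of projecting 3D skeleton joints onto a virtual 2D space can be illustrated with a minimal sketch. The abstract does not specify the projection the authors use, so the pinhole (perspective) model below, the function name `project_skeleton`, the `focal_length` parameter, and the sample joint coordinates are all assumptions made for illustration only, not the paper's method.

```python
from typing import List, Tuple

Joint3D = Tuple[float, float, float]
Joint2D = Tuple[float, float]


def project_skeleton(joints: List[Joint3D], focal_length: float = 1.0) -> List[Joint2D]:
    """Project 3D joints onto a virtual image plane at z = focal_length.

    Assumption: a simple pinhole model with the virtual camera at the origin
    looking along +z. The paper's actual projection may differ.
    """
    projected: List[Joint2D] = []
    for x, y, z in joints:
        if z <= 0:
            raise ValueError("joint must lie in front of the virtual camera (z > 0)")
        # Perspective divide: each joint keeps two coordinates instead of three.
        projected.append((focal_length * x / z, focal_length * y / z))
    return projected


if __name__ == "__main__":
    # Hypothetical joints in metres: head, left shoulder, right shoulder.
    skeleton_3d = [(0.0, 1.6, 2.0), (0.2, 1.4, 2.0), (-0.2, 1.4, 2.0)]
    for joint in project_skeleton(skeleton_3d):
        print(joint)
```

Each projected joint carries two values rather than three, which is one way a skeleton-based feature vector can shrink. This per-joint saving is not the same quantity as the 75% reduction ratio reported in the abstract, which is stated relative to the RGB modality.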

Keywords

Human action recognition; 2D Skeleton features; 3D Projection; Reduced data structure; Compound features selection method
