Search | Korea Science

Deep Learning-based Action Recognition using Skeleton Joints Mapping (스켈레톤 조인트 매핑을 이용한 딥 러닝 기반 행동 인식)

Tasnim, Nusrat;Baek, Joong-Hwan
- Journal of Advanced Navigation Technology / v.24 no.2 / pp.155-162 / 2020
Recently, with the development of computer vision and deep learning technology, research on human action recognition has been actively conducted for video analysis, video surveillance, interactive multimedia, and human-machine interaction applications. Researchers have introduced diverse techniques for human action understanding and classification using RGB images, depth images, skeleton data, and inertial data. However, skeleton-based action discrimination is still a challenging research topic for human-machine interaction. In this paper, we propose an end-to-end skeleton joints mapping of actions that generates a spatio-temporal image, the so-called dynamic image. Then, an efficient deep convolutional neural network is devised to classify the action classes. We use the publicly accessible UTD-MHAD skeleton dataset to evaluate the performance of the proposed method. In the experiments, the proposed system outperforms existing methods, reaching an accuracy of 97.45%.
https://doi.org/10.12673/jant.2020.24.2.155
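
The abstract above does not spell out how joints are mapped to an image, so the following is only a minimal sketch of one common encoding, not the authors' method. It assumes each frame becomes a column, each joint a row, and the (x, y, z) coordinates are min-max normalized into RGB values. The function name `joints_to_image` and the global normalization are hypothetical choices made for illustration.

```python
def joints_to_image(sequence):
    """Encode a skeleton sequence as an RGB image (rows = joints, columns = frames).

    sequence: list of frames, each frame a list of (x, y, z) joint coordinates.
    Assumption: one global min-max range is used for all three axes.
    """
    flat = [c for frame in sequence for joint in frame for c in joint]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero for constant input

    def to_pixel(joint):
        # Scale each coordinate into the 0..255 range of an 8-bit channel.
        return tuple(round(255 * (c - lo) / span) for c in joint)

    num_joints = len(sequence[0])
    return [[to_pixel(frame[j]) for frame in sequence] for j in range(num_joints)]


# Two frames, two joints per frame.
sequence = [
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
    [(0.2, 0.2, 0.2), (1.0, 0.0, 1.0)],
]
image = joints_to_image(sequence)
```

The resulting nested list can be handed to any image library or fed to a convolutional network after conversion to an array.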

Dual-Stream Fusion and Graph Convolutional Network for Skeleton-Based Action Recognition

Hu, Zeyuan;Feng, Yiran;Lee, Eung-Joo
- Journal of Korea Multimedia Society / v.24 no.3 / pp.423-430 / 2021
Graph convolutional networks (GCNs) have achieved outstanding performance on skeleton-based action recognition. However, several problems remain in existing GCN-based methods, and the low recognition rate caused by relying on a single source of input data has not been effectively solved. In this article, we propose a dual-stream fusion method that combines video data and skeleton data. The two networks respectively classify skeleton data and video data, and the probabilities of the two outputs are fused to achieve information fusion. Experiments on two large datasets, Kinetics and the NTU-RGB+D Human Action Dataset, show that our proposed method achieves state-of-the-art performance. Compared with the traditional method, recognition accuracy is improved.
https://doi.org/10.9717/kmms.2020.24.3.423
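
Fusing the output probabilities of two streams, as the abstract describes, can be sketched as a weighted average of the two class distributions. The abstract does not state the fusion rule, so the weighted average, the `skeleton_weight` parameter, and the example logits below are assumptions for illustration only.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_probabilities(skeleton_probs, video_probs, skeleton_weight=0.5):
    """Late fusion: weighted average of two per-class probability vectors."""
    if len(skeleton_probs) != len(video_probs):
        raise ValueError("both streams must score the same set of classes")
    w = skeleton_weight
    return [w * s + (1.0 - w) * v for s, v in zip(skeleton_probs, video_probs)]


def predict(probs):
    """Return the index of the most probable class."""
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical logits from the two streams for three action classes.
skeleton_probs = softmax([2.0, 1.0, 0.1])
video_probs = softmax([0.5, 2.5, 0.3])
fused = fuse_probabilities(skeleton_probs, video_probs)
```

With equal weights the more confident video stream decides the class here; setting `skeleton_weight=1.0` reduces the result to the skeleton stream alone.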

Optimised ML-based System Model for Adult-Child Actions Recognition

Alhammami, Muhammad;Hammami, Samir Marwan;Ooi, Chee-Pun;Tan, Wooi-Haw
- KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.929-944 / 2019
Many critical applications require accurate real-time human action recognition. However, capturing and pre-processing image data, calculating features, and classification pose many hurdles because they consume significant storage and computation resources. To circumvent these hurdles, this paper presents a machine learning (ML) based recognition system model that uses reduced data-structure features obtained by projecting the real 3D skeleton modality onto a virtual 2D space. The MMU VAAC dataset is used to test the proposed ML model. The results show a high accuracy rate of 97.88%, which is only slightly lower than the accuracy obtained with the original 3D modality-based features, but with a 75% reduction ratio compared with using the RGB modality. These results motivate implementing the proposed recognition model on an embedded system platform in the future.
https://doi.org/10.3837/tiis.2019.02.024
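
Projecting a 3D skeleton onto a 2D space can be illustrated with the simplest possible case, an orthographic projection that drops one coordinate axis. The abstract does not say which projection the authors use, so this sketch and its `drop_axis` parameter are assumptions and not the paper's model.

```python
def project_skeleton(joints, drop_axis=2):
    """Orthographically project 3D joints onto a coordinate plane.

    joints: list of (x, y, z) tuples.
    drop_axis: index of the axis to discard (0 = x, 1 = y, 2 = z).
    """
    if drop_axis not in (0, 1, 2):
        raise ValueError("drop_axis must be 0, 1, or 2")
    return [
        tuple(c for i, c in enumerate(joint) if i != drop_axis)
        for joint in joints
    ]


# Two hypothetical joints of one frame.
skeleton = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
front_view = project_skeleton(skeleton)              # keeps (x, y)
side_view = project_skeleton(skeleton, drop_axis=0)  # keeps (y, z)
```

Each projected joint stores two values instead of three, which is the kind of reduced data structure the abstract refers to.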

Two person Interaction Recognition Based on Effective Hybrid Learning

Ahmed, Minhaz Uddin;Kim, Yeong Hyeon;Kim, Jin Woo;Bashar, Md Rezaul;Rhee, Phill Kyu
- KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.751-770 / 2019
Action recognition is an essential task in computer vision due to the variety of prospective applications, such as security surveillance, machine learning, and human-computer interaction. The availability of more video data than ever before and the high performance of deep convolutional neural networks also make action recognition in video essential. Unfortunately, limited crafted video features and the scarcity of benchmark datasets make it challenging to address the multi-person action recognition task in video data. In this work, we propose a deep convolutional neural network-based Effective Hybrid Learning (EHL) framework for two-person interaction classification in video data. Our approach exploits a pre-trained network model (VGG16 from the University of Oxford Visual Geometry Group) and extends Faster R-CNN (a region-based convolutional neural network, a state-of-the-art detector for image classification). We extend a semi-supervised learning method and combine it with an active learning method to improve overall performance. Numerous types of two-person interactions exist in the real world, which makes this a challenging task. In our experiment, we consider a limited number of actions, such as hugging, fighting, linking arms, talking, and kidnapping, in two environments, simple and complex. We show that our trained model with an active semi-supervised learning architecture gradually improves performance. In a simple environment using an Intelligent Technology Laboratory (ITLab) dataset from Inha University, performance increased to 95.6% accuracy, and in a complex environment, performance reached 81% accuracy. Our method reduces data-labeling time, compared to supervised learning methods, for the ITLab dataset. We also conduct extensive experiments on human action recognition benchmarks such as the UT-Interaction and HMDB51 datasets and obtain better performance than state-of-the-art approaches.
https://doi.org/10.3837/tiis.2019.02.015

Effective Pose-based Approach with Pose Estimation for Emotional Action Recognition (자세 예측을 이용한 효과적인 자세 기반 감정 동작 인식)

Kim, Jin Ok
- KIPS Transactions on Software and Data Engineering / v.2 no.3 / pp.209-218 / 2013
Early research in human action recognition focused on tracking and classifying articulated body motions. Such methods required accurate segmentation of body parts, which is a difficult task, particularly under realistic imaging conditions. Recent work has trended toward the use of more low-level appearance features such as spatio-temporal interest points. Given the great progress in pose estimation over the past few years, a renewed look at the pose-based approach is needed. This paper addresses the question of whether it is sufficient to train a classifier only on low-level appearance features, as the appearance-based approach does, and proposes an effective pose-based approach with pose estimation for emotional action recognition. To answer this question, we compare the performance of pose-based features, appearance-based features, and their combination across various emotional action recognition scenarios. The experimental results show that pose-based features outperform low-level appearance-based features, even when heavily corrupted by noise, suggesting that a pose-based approach with pose estimation is beneficial for emotional action recognition.
https://doi.org/10.3745/KTSDE.2013.2.3.209

Multi-Region based Radial GCN algorithm for Human action Recognition (행동인식을 위한 다중 영역 기반 방사형 GCN 알고리즘)

Jang, Han Byul;Lee, Chil Woo
- Smart Media Journal / v.11 no.1 / pp.46-57 / 2022
This paper describes a multi-region based Radial Graph Convolutional Network (MRGCN) algorithm that performs end-to-end action recognition using the optical flow and gradient of the input image. Because this method does not use skeleton information, which is difficult to acquire and complicated to estimate, it can be used in a general CCTV environment in which only a video camera is available. The novelty of MRGCN is that it expresses the optical flow and gradient of the input image as directional histograms, converts them into six feature vectors to reduce the computational load, and uses a newly developed radial network model to hierarchically propagate the deformation and shape change of the human body in spatio-temporal space. Another important feature is that the data input areas are arranged to overlap each other, so that information is not spatially disconnected among input nodes. In a performance evaluation experiment on 30 actions, MRGCN obtained a Top-1 accuracy of 84.78%, which is superior to the existing GCN-based action recognition method that uses skeleton data as input.
https://doi.org/10.30693/SMJ.2022.11.1.46
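
A directional histogram of optical-flow or gradient vectors, as mentioned in the abstract, can be sketched as a magnitude-weighted histogram over orientation bins. The bin count, the magnitude weighting, and the function name `directional_histogram` are assumptions for illustration; the abstract does not give the paper's exact construction or how the six feature vectors are derived.

```python
import math


def directional_histogram(vectors, num_bins=8):
    """Accumulate vector magnitudes into equally wide orientation bins.

    vectors: iterable of (dx, dy) pairs, e.g. flow or gradient samples.
    Returns a list of num_bins floats; bin 0 starts at angle 0 (the +x axis).
    """
    hist = [0.0] * num_bins
    bin_width = 2.0 * math.pi / num_bins
    for dx, dy in vectors:
        magnitude = math.hypot(dx, dy)
        if magnitude == 0.0:
            continue  # a zero vector has no direction
        angle = math.atan2(dy, dx) % (2.0 * math.pi)  # map to [0, 2*pi)
        index = int(angle / bin_width) % num_bins     # guard the upper edge
        hist[index] += magnitude
    return hist


# Hypothetical flow vectors pointing at 45, 135, and 225 degrees.
flow = [(1.0, 1.0), (-1.0, 1.0), (-2.0, -2.0), (0.0, 0.0)]
hist = directional_histogram(flow, num_bins=4)
```

Summarizing a region by a few histogram bins rather than by every pixel is what keeps the input to the graph network small.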

Computational Model of a Mirror Neuron System for Intent Recognition through Imitative Learning of Objective-directed Action (목적성 행동 모방학습을 통한 의도 인식을 위한 거울뉴런 시스템 계산 모델)

Ko, Kwang-Eun;Sim, Kwee-Bo
- Journal of Institute of Control, Robotics and Systems / v.20 no.6 / pp.606-611 / 2014
Understanding another's behavior is a fundamental cognitive ability for primates, including humans. Recent neurophysiological studies suggest that this ability relies on a direct matching from visual observation onto an individual's own motor repertoire. Mirror neurons are regarded as the core regions for this matching and are treated as providing intent recognition through imitative learning of an observed action, acquired from visual information about a goal-directed action. In this paper, we review previous work on modeling the function and mechanisms of mirror neurons and propose a computational model of a mirror neuron system that can be used in human-robot interaction environments. The major focus of the computational model is the reproduction of an individual's motor repertoire with different embodiments. The model aims at a continuous process that combines sensory evidence, prior task knowledge, and a goal-directed matching of action observation and execution. We also propose a biologically inspired, plausible equation model.
https://doi.org/10.5302/J.ICROS.2014.14.9033

Decomposed "Spatial and Temporal" Convolution for Human Action Recognition in Videos

Sediqi, Khwaja Monib;Lee, Hyo Jong
- Proceedings of the Korea Information Processing Society Conference / 2019.05a / pp.455-457 / 2019
In this paper, we study the effect of decomposed spatiotemporal convolutions for action recognition in videos. Our motivation emerges from the empirical observation that spatial convolution applied to individual frames of a video provides good performance in action recognition. In this research, we empirically show the accuracy of factorized convolution on individual frames of video for action classification. We take 3D ResNet-18 as the baseline model for our experiment and factorize its 3D convolution into a 2D (spatial) and a 1D (temporal) convolution. We train the model from scratch using the Kinetics video dataset. We then fine-tune the model on the UCF-101 dataset and evaluate the performance. Our results show good accuracy, similar to that of state-of-the-art algorithms on the Kinetics and UCF-101 datasets.
https://doi.org/10.3745/PKIPS.y2019m05a.455
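
The effect of factorizing a 3D convolution into a 2D spatial and a 1D temporal convolution can be made concrete by counting weights. The sketch below compares a full t x d x d kernel with a 1 x d x d spatial kernel followed by a t x 1 x 1 temporal kernel. The intermediate channel count `c_mid` and the example layer sizes are assumptions; the abstract does not report them. Bias terms are ignored.

```python
def conv3d_params(c_in, c_out, t, d):
    """Weights in a full 3D convolution with a t x d x d kernel."""
    return c_in * c_out * t * d * d


def factorized_params(c_in, c_out, t, d, c_mid):
    """Weights in a spatial (1 x d x d) plus temporal (t x 1 x 1) pair."""
    spatial = c_in * c_mid * d * d   # 2D convolution applied per frame
    temporal = c_mid * c_out * t     # 1D convolution applied across frames
    return spatial + temporal


# Hypothetical layer: 64 input and output channels, 3 x 3 x 3 kernel.
full = conv3d_params(c_in=64, c_out=64, t=3, d=3)
split = factorized_params(c_in=64, c_out=64, t=3, d=3, c_mid=64)
```

For this hypothetical layer the full kernel has 110,592 weights and the factorized pair has 49,152. Implementations that want to keep the parameter budget equal to the 3D baseline instead pick a larger `c_mid`.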

Vision-based garbage dumping action detection for real-world surveillance platform

Yun, Kimin;Kwon, Yongjin;Oh, Sungchan;Moon, Jinyoung;Park, Jongyoul
- ETRI Journal / v.41 no.4 / pp.494-505 / 2019
In this paper, we propose a new framework for detecting the unauthorized dumping of garbage with real-world surveillance cameras. Although several action/behavior recognition methods have been investigated, these studies are hardly applicable to real-world scenarios because they mainly focus on well-refined datasets. Because dumping actions in the real world take a variety of forms, building a new method to detect the actions is a better strategy than exploiting previous approaches. We detect the dumping action from the change in the relation between a person and the object they are holding. To find the person-held object of indefinite form, we use a background subtraction algorithm and human joint estimation. The person-held object is then tracked, and a relation model between the joints and objects is built. Finally, the dumping action is detected through a voting-based decision module. In the experiments, we show the effectiveness of the proposed method by testing on real-world videos containing various dumping actions. In addition, the proposed framework is implemented in a real-time monitoring system through a fast online algorithm.
https://doi.org/10.4218/etrij.2018-0520
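
A voting-based decision over noisy per-frame cues can be sketched as a sliding-window vote: an event is reported only when enough recent frames agree. The abstract gives no details of the paper's decision module, so the window size, the vote threshold, and the function name `vote_detections` are hypothetical.

```python
from collections import deque


def vote_detections(frame_flags, window=5, min_votes=3):
    """Smooth per-frame boolean cues with a sliding-window vote.

    frame_flags: iterable of truthy/falsy per-frame detections.
    Returns one boolean per frame: True when at least min_votes of the
    last `window` frames (including the current one) were positive.
    """
    recent = deque(maxlen=window)  # old frames drop out automatically
    decisions = []
    for flag in frame_flags:
        recent.append(bool(flag))
        decisions.append(sum(recent) >= min_votes)
    return decisions


# Hypothetical per-frame cues with one isolated miss at frame 1.
cues = [1, 0, 1, 1, 0, 0, 0, 0]
decisions = vote_detections(cues, window=3, min_votes=2)
```

Because the function only looks at past frames, it can run online, frame by frame, which suits a real-time monitoring setting.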

Two-Stream Convolutional Neural Network for Video Action Recognition

Qiao, Han;Liu, Shuang;Xu, Qingzhen;Liu, Shouqiang;Yang, Wanggan
- KSII Transactions on Internet and Information Systems (TIIS) / v.15 no.10 / pp.3668-3684 / 2021
Video action recognition is widely used in video surveillance, behavior detection, human-computer interaction, medically assisted diagnosis, and motion analysis. However, video action recognition can be disturbed by many factors, such as background and illumination. A two-stream convolutional neural network trains spatial and temporal models of the video separately and fuses them at the output. The multi-segment two-stream convolutional neural network model trains on temporal and spatial information from the video to extract and fuse their features, and then determines the category of the video action. This paper adopts Google's Xception model and transfer learning, using the Xception model trained on ImageNet as the initial weights. This largely overcomes the problem of model underfitting caused by an insufficient video behavior dataset, and it effectively reduces the influence of various disturbing factors in the video. It also greatly improves accuracy and reduces training time. In addition, to make up for the shortage of data, the Kinetics-400 dataset was used for pre-training, which greatly improved the accuracy of the model. In this applied research, the expected goal is largely achieved, and the design of the original two-stream model is improved.
https://doi.org/10.3837/tiis.2021.10.011

