• Title/Summary/Keyword: Action recognition


Extensible Hierarchical Method of Detecting Interactive Actions for Video Understanding

  • Moon, Jinyoung;Jin, Junho;Kwon, Yongjin;Kang, Kyuchang;Park, Jongyoul;Park, Kyoung
    • ETRI Journal / v.39 no.4 / pp.502-513 / 2017
  • For video understanding, namely analyzing who did what in a video, actions along with objects are primary elements. Most studies on actions have handled recognition problems for well-trimmed videos and focused on enhancing classification performance. However, action detection, including localization as well as recognition, is required because, in general, actions intersect in time and space. In addition, most studies have not considered extensibility to a newly added action that has not been previously trained. Therefore, this paper proposes an extensible hierarchical method for detecting generic actions, which combine object movements and spatial relations between two objects, and inherited actions, which are determined from the related objects through an ontology- and rule-based methodology. The hierarchical design of the method enables it to detect any interactive action based on the spatial relations between two objects. The method, using object information, achieves an F-measure of 90.27%. Moreover, this paper describes the extensibility of the method to a new action contained in a video from a domain different from that of the dataset used.
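
The abstract describes generic actions defined by object movements and spatial relations between two tracked objects, which an ontology then refines into inherited actions. The paper gives no code; the sketch below is a minimal, hypothetical illustration of that idea over tracked bounding boxes. The rule ("approach" when the center distance shrinks), the threshold, and the toy ontology entries are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical sketch: rule-based detection of an interactive action
# from the spatial relation between two tracked objects.

def center(box):
    """Return the (x, y) center of a bounding box given as (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def distance(box_a, box_b):
    ax, ay = center(box_a)
    bx, by = center(box_b)
    return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5

def detect_generic_action(track_a, track_b, shrink_ratio=0.5):
    """Label the pair 'approach' if the center distance shrinks substantially."""
    d_start = distance(track_a[0], track_b[0])
    d_end = distance(track_a[-1], track_b[-1])
    if d_end < d_start * shrink_ratio:
        return "approach"
    return None

# Toy ontology: (generic action, subject class, object class) -> inherited action
ONTOLOGY = {
    ("approach", "person", "door"): "enter",
    ("approach", "person", "chair"): "sit_down",
}

def detect_inherited_action(generic, subject_cls, object_cls):
    return ONTOLOGY.get((generic, subject_cls, object_cls))

if __name__ == "__main__":
    person = [(0, 0, 10, 20), (30, 0, 40, 20), (60, 0, 70, 20)]
    chair = [(80, 0, 90, 20)] * 3
    generic = detect_generic_action(person, chair)
    print(generic, detect_inherited_action(generic, "person", "chair"))
```

Because the generic action depends only on the spatial relation, a new inherited action can be added by inserting another ontology entry, which mirrors the extensibility claim in the abstract.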

Deep Learning-based Action Recognition using Skeleton Joints Mapping (스켈레톤 조인트 매핑을 이용한 딥 러닝 기반 행동 인식)

  • Tasnim, Nusrat;Baek, Joong-Hwan
    • Journal of Advanced Navigation Technology / v.24 no.2 / pp.155-162 / 2020
  • Recently, with the development of computer vision and deep learning technology, research on human action recognition has been actively conducted for video analysis, video surveillance, interactive multimedia, and human-machine interaction applications. Diverse techniques have been introduced for human action understanding and classification by many researchers using RGB images, depth images, skeletons, and inertial data. However, skeleton-based action discrimination is still a challenging research topic for human-machine interaction. In this paper, we propose an end-to-end mapping of the skeleton joints of an action to generate a spatio-temporal image, a so-called dynamic image. Then, an efficient deep convolutional neural network is devised to perform classification among the action classes. We use the publicly accessible UTD-MHAD skeleton dataset to evaluate the performance of the proposed method. The experimental results show that the proposed system outperforms existing methods, with a high accuracy of 97.45%.
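
To make the general idea of a joint-to-image mapping concrete, here is a minimal sketch that packs a skeleton sequence into a single pseudo-image a 2D CNN could classify. The layout (joints along one axis, frames along the other, coordinates as channels) and the min-max normalization are illustrative assumptions, not the authors' exact encoding.

```python
import numpy as np

def skeleton_to_dynamic_image(joints):
    """Map a skeleton sequence of shape (T frames, J joints, 3 coords)
    into a J x T x 3 pseudo-image with values scaled to [0, 255]."""
    joints = np.asarray(joints, dtype=np.float32)
    mins = joints.min(axis=(0, 1), keepdims=True)
    maxs = joints.max(axis=(0, 1), keepdims=True)
    scaled = (joints - mins) / (maxs - mins + 1e-6)      # normalize per coordinate
    image = np.transpose(scaled, (1, 0, 2)) * 255.0       # joints x frames x 3
    return image.astype(np.uint8)

if __name__ == "__main__":
    sequence = np.random.rand(64, 20, 3)   # 64 frames, 20 joints, (x, y, z)
    img = skeleton_to_dynamic_image(sequence)
    print(img.shape)   # (20, 64, 3) -> can be fed to any 2D CNN classifier
```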

A Study on the Effect of Returned Clothes Via On-line Sales on Their Brands (온라인상(上)에서 의류제품(衣類製品)의 반품(返品) 경험(經驗)이 브랜드에 대(對)한 태도(態度)에 미치는 영향(影響) 연구(硏究))

  • Kim, Yeon-Hee;Kim, Il
    • Journal of Fashion Business / v.7 no.4 / pp.26-42 / 2003
  • On-line clothes sales are on the increase, and returns (for replacements or refunds) of the clothes are also increasing. Many studies on off-line consumers' complaints have been conducted, but few studies exist on the returns of clothes sold on-line. From this viewpoint, this study was conducted to determine what effect returns of clothes sold on-line have on their brands. The study first focused on the factors affecting complaint acts (return intention or return acts), such as lack of information and recognition of the product, delivery errors, and product defects in on-line sales; second, it investigated changes in buyers' attitudes toward the brand following their acts of returning products; and third, it examined changes in on-line buyers' attitudes toward the brand. The study was carried out by subdividing its subjects into purchasers who experienced return actions (replacement, refund) and purchasers who experienced return intention, and their experience was empirically analyzed to find how it affected attitudes toward the brand. The study reached the following conclusions. First, the factor causing complaint actions (return action, return intention) on-line was shown to be a lack of information and recognition of the product. Second, it was revealed that the factor causing complaints (return action, return intention) did not lie in delivery errors or product defects. Third, a brand's positive response to a return action did not raise the repurchase intention or positive attitude of purchasers who had returned a product, but it lowered their private complaint action intention. Fourth, the repurchase intention of purchasers who experienced return intention was lowered, but their negative attitude and private complaint action intention were not raised.

Teacher-Student Architecture Based CNN for Action Recognition (동작 인식을 위한 교사-학생 구조 기반 CNN)

  • Zhao, Yulan;Lee, Hyo Jong
    • KIPS Transactions on Computer and Communication Systems / v.11 no.3 / pp.99-104 / 2022
  • Convolutional neural networks (CNNs) for action recognition generally use a two-stream architecture consisting of an RGB stream and an optical flow stream. The RGB stream captures appearance, while the optical flow stream interprets the action. However, the standard use of optical flow is costly in computation time and increases the latency of action recognition. The purpose of this study was to evaluate a novel way of creating two sub-networks within the neural network. The optical flow sub-network was assigned as a teacher and the RGB sub-network as a student. In the training stage, the optical flow sub-network extracts features through the teacher sub-network and transmits the information to the student sub-network for baseline training. In the test stage, only the student sub-network is operational, with decreased latency because optical flow need not be computed. Experimental results show that our network, fed only by the RGB stream, achieves a competitive accuracy of 54.5% on HMDB51, which is 1.5 times better than that of R3D-18.
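
A minimal sketch of the training idea described above, distilling features from an optical-flow "teacher" sub-network into an RGB "student" sub-network, assuming PyTorch. The placeholder backbone, the feature-matching loss, and the loss weighting are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class TinyBackbone(nn.Module):
    """Placeholder 3D-CNN backbone; stands in for the real sub-networks."""
    def __init__(self, in_channels, feat_dim=128, num_classes=51):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(in_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),
            nn.Flatten(),
            nn.Linear(16, feat_dim),
        )
        self.classifier = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        feat = self.features(x)
        return feat, self.classifier(feat)

def distillation_step(teacher, student, flow_clip, rgb_clip, labels, alpha=0.5):
    """One training step: classification loss on the student plus a feature
    matching loss that pulls student features toward the (frozen) teacher."""
    with torch.no_grad():
        t_feat, _ = teacher(flow_clip)
    s_feat, logits = student(rgb_clip)
    loss_cls = nn.functional.cross_entropy(logits, labels)
    loss_kd = nn.functional.mse_loss(s_feat, t_feat)
    return loss_cls + alpha * loss_kd

if __name__ == "__main__":
    teacher = TinyBackbone(in_channels=2)   # optical flow: 2 channels
    student = TinyBackbone(in_channels=3)   # RGB: 3 channels
    flow = torch.randn(4, 2, 8, 32, 32)     # batch, channels, frames, H, W
    rgb = torch.randn(4, 3, 8, 32, 32)
    labels = torch.randint(0, 51, (4,))
    print(distillation_step(teacher, student, flow, rgb, labels).item())
```

At test time only the student is run on RGB clips, which is why the optical-flow computation and its latency disappear from inference.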

Human Action Recognition by Inference of Stochastic Regular Grammars (확률적 정규 문법 추론법에 의한 사람 몸동작 인식)

  • Cho, Kyung-Eun;Cho, Hyung-Je
    • Journal of KIISE:Computing Practices and Letters / v.7 no.3 / pp.248-259 / 2001
  • This paper proposes a human action recognition scheme to recognize nonverbal human communication automatically. Based on the principle that a human body action can be defined as a combination of multiple articulation movements, we use the method of inferring stochastic grammars to understand each human action. We measure and quantize each human action in 3D world coordinates and build two sets of 4-direction chain codes for the xy and zy projection planes. Based on the fact that the neighboring information among articulations is an essential element for distinguishing actions, we designed a new stochastic inference procedure that applies the neighboring information of the hands. Our proposed scheme shows a better recognition rate than other general stochastic inference procedures.
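
To illustrate the 4-direction chain-code step mentioned in the abstract, the sketch below converts a 3D joint trajectory into chain codes on the xy and zy projection planes. The quantization rule used here (dominant axis of each displacement) is an assumption for illustration, not necessarily the paper's exact coding.

```python
def chain_code_2d(points):
    """Convert a 2D trajectory into a 4-direction chain code:
    0 = +x, 1 = +y, 2 = -x, 3 = -y (dominant displacement axis per step)."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        if abs(dx) >= abs(dy):
            codes.append(0 if dx >= 0 else 2)
        else:
            codes.append(1 if dy >= 0 else 3)
    return codes

def project_chain_codes(trajectory_3d):
    """Build chain codes for the xy and zy projections of a 3D trajectory."""
    xy = [(x, y) for x, y, z in trajectory_3d]
    zy = [(z, y) for x, y, z in trajectory_3d]
    return chain_code_2d(xy), chain_code_2d(zy)

if __name__ == "__main__":
    hand = [(0, 0, 0), (1, 0, 0), (1, 2, 0), (1, 2, -1)]
    print(project_chain_codes(hand))
```

The resulting symbol sequences are what a stochastic regular grammar would then be inferred over.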


Computational Model of a Mirror Neuron System for Intent Recognition through Imitative Learning of Objective-directed Action (목적성 행동 모방학습을 통한 의도 인식을 위한 거울뉴런 시스템 계산 모델)

  • Ko, Kwang-Eun;Sim, Kwee-Bo
    • Journal of Institute of Control, Robotics and Systems / v.20 no.6 / pp.606-611 / 2014
  • Understanding another's behavior is a fundamental cognitive ability for primates, including humans. Recent neuro-physiological studies have suggested that a direct matching algorithm from visual observation onto an individual's own motor repertoire underlies this cognitive ability. Mirror neurons are known as the core regions and are treated as providing the functionality of intent recognition, on the basis of imitative learning of an observed action acquired from the visual information of a goal-directed action. In this paper, we review previous works that modeled the function and mechanisms of mirror neurons, and we propose a computational model of a mirror neuron system that can be used in human-robot interaction environments. The major focus of the computational model is the reproduction of an individual's motor repertoire with different embodiments. The model's aim is the design of a continuous process that combines sensory evidence, prior task knowledge, and a goal-directed matching of action observation and execution. We also propose a biologically plausible equation model.
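
As a rough illustration of combining prior task knowledge with incoming sensory evidence to infer intent, in the spirit of the matching process described above, here is a hypothetical Bayesian-update sketch. The intents, priors, and likelihood values are invented for illustration and are not the authors' model.

```python
# Hypothetical sketch: sequential Bayesian update of a posterior over intents.

PRIORS = {"grasp_to_drink": 0.5, "grasp_to_move": 0.5}

# P(observed motor feature | intent), e.g. drawn from the agent's own repertoire
LIKELIHOODS = {
    ("hand_to_mouth", "grasp_to_drink"): 0.8,
    ("hand_to_mouth", "grasp_to_move"): 0.1,
    ("hand_to_table", "grasp_to_drink"): 0.2,
    ("hand_to_table", "grasp_to_move"): 0.9,
}

def infer_intent(observations):
    """Update the posterior over intents as each observation arrives."""
    posterior = dict(PRIORS)
    for obs in observations:
        for intent in posterior:
            posterior[intent] *= LIKELIHOODS.get((obs, intent), 1e-3)
        total = sum(posterior.values())
        posterior = {k: v / total for k, v in posterior.items()}
    return posterior

if __name__ == "__main__":
    print(infer_intent(["hand_to_mouth"]))
```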

Recognizing the Direction of Action using Generalized 4D Features (일반화된 4차원 특징을 이용한 행동 방향 인식)

  • Kim, Sun-Jung;Kim, Soo-Wan;Choi, Jin-Young
    • Journal of the Korean Institute of Intelligent Systems / v.24 no.5 / pp.518-528 / 2014
  • In this paper, we propose a method to recognize the action direction of a human by developing 4D space-time (4D-ST, [x,y,z,t]) features. For this, we propose 4D space-time interest points (4D-STIPs, [x,y,z,t]), which are extracted using 3D space (3D-S, [x,y,z]) volumes reconstructed from images of a finite number of different views. Since the proposed features are constructed using volumetric information, the features for an arbitrary 2D space (2D-S, [x,y]) viewpoint can be generated by projecting the 3D-S volumes and 4D-STIPs onto the corresponding image planes in the training step. We can recognize the directions of actors in the test video because our training sets, which are projections of 3D-S volumes and 4D-STIPs onto various image planes, contain the direction information. The process for recognizing action direction is divided into two steps: first we recognize the action class, and then we recognize the action direction using the direction information. For both action and action-direction recognition, we use the projected 3D-S volumes and 4D-STIPs to construct motion history images (MHIs) and non-motion history images (NMHIs), which encode the moving and non-moving parts of an action, respectively. For action recognition, features are trained by support vector data description (SVDD) according to the action class and recognized by support vector domain density description (SVDDD). For action direction recognition, after recognizing the action, each action is trained using SVDD according to the direction class and then recognized by SVDDD. In the experiments, we train the models using 3D-S volumes from the INRIA Xmas Motion Acquisition Sequences (IXMAS) dataset and recognize action direction on a new SNU dataset constructed for evaluating action direction recognition.
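
The motion history image (MHI) and non-motion history image (NMHI) representations mentioned above can be illustrated with a short sketch. The decay parameters and the particular NMHI construction (applying the MHI recurrence to inverted motion masks) are simplifying assumptions, not the paper's exact formulation.

```python
import numpy as np

def motion_history_image(motion_masks, tau=255, decay=32):
    """Build an MHI from a sequence of binary motion masks:
    moving pixels are set to tau, others decay toward zero over time."""
    mhi = np.zeros_like(motion_masks[0], dtype=np.float32)
    for mask in motion_masks:
        mhi = np.where(mask > 0, float(tau), np.maximum(mhi - decay, 0.0))
    return mhi.astype(np.uint8)

def non_motion_history_image(motion_masks, tau=255, decay=32):
    """Complementary NMHI: emphasizes parts that stay still during the action."""
    inverted = [1 - (m > 0).astype(np.uint8) for m in motion_masks]
    return motion_history_image(inverted, tau=tau, decay=decay)

if __name__ == "__main__":
    masks = [np.random.randint(0, 2, (64, 64), dtype=np.uint8) for _ in range(10)]
    print(motion_history_image(masks).shape, non_motion_history_image(masks).dtype)
```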

Spatial-temporal Ensemble Method for Action Recognition (행동 인식을 위한 시공간 앙상블 기법)

  • Seo, Minseok;Lee, Sangwoo;Choi, Dong-Geol
    • The Journal of Korea Robotics Society / v.15 no.4 / pp.385-391 / 2020
  • As deep learning technology has developed and been applied to various fields, applications are gradually shifting from single-image-based tasks to video-based tasks with a temporal axis in order to recognize human behavior. However, unlike a 2D CNN on a single image, a 3D CNN on video incurs a very large increase in computation and parameters due to the added time axis, so improving accuracy in action recognition is more difficult than for a single image. To solve this problem, we investigate and analyze various techniques to improve the performance of 3D CNN-based action recognition without additional training time or parameter increases. We propose a temporal ensemble that uses the time axis, which exists only in videos, and an ensemble over the input frames. With a combination of these techniques, we achieved an accuracy improvement of up to 7.1% over the existing performance. The results also reveal the trade-off between computational cost and accuracy.
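
Here is a minimal sketch of one form of test-time temporal ensembling, assuming PyTorch: a trained 3D CNN's softmax outputs are averaged over several temporal crops of the same video, adding no parameters or training time. The crop length, stride, and averaging scheme are illustrative assumptions, not necessarily the authors' exact combination of techniques.

```python
import torch

def temporal_ensemble(model, video, clip_len=16, stride=8):
    """Average a trained 3D-CNN's softmax predictions over several temporal
    crops of one video (shape: channels x frames x H x W). Only inference
    is repeated; no extra training or parameters are required."""
    c, t, h, w = video.shape
    probs = []
    with torch.no_grad():
        for start in range(0, max(t - clip_len, 0) + 1, stride):
            clip = video[:, start:start + clip_len].unsqueeze(0)  # add batch dim
            probs.append(torch.softmax(model(clip), dim=1))
    return torch.stack(probs).mean(dim=0)

if __name__ == "__main__":
    # Stand-in model: any 3D CNN returning class logits works here.
    model = torch.nn.Sequential(
        torch.nn.AdaptiveAvgPool3d(1), torch.nn.Flatten(), torch.nn.Linear(3, 51)
    )
    video = torch.randn(3, 64, 32, 32)
    print(temporal_ensemble(model, video).shape)  # torch.Size([1, 51])
```

The trade-off noted in the abstract shows up directly here: each additional temporal crop improves the averaged prediction but multiplies inference cost.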

Improved Two-Phase Framework for Facial Emotion Recognition

  • Yoon, Hyunjin;Park, Sangwook;Lee, Yongkwi;Han, Mikyong;Jang, Jong-Hyun
    • ETRI Journal / v.37 no.6 / pp.1199-1210 / 2015
  • Automatic emotion recognition based on facial cues, such as facial action units (AUs), has received huge attention in the last decade due to its wide variety of applications. Current computer-based automated two-phase facial emotion recognition procedures first detect AUs from input images and then infer target emotions from the detected AUs. However, more robust AU detection and AU-to-emotion mapping methods are required to deal with the error accumulation problem inherent in the multiphase scheme. Motivated by our key observation that a single AU detector does not perform equally well for all AUs, we propose a novel two-phase facial emotion recognition framework, where the presence of AUs is detected by group decisions of multiple AU detectors and a target emotion is inferred from the combined AU detection decisions. Our emotion recognition framework consists of three major components - multiple AU detection, AU detection fusion, and AU-to-emotion mapping. The experimental results on two real-world face databases demonstrate an improved performance over the previous two-phase method using a single AU detector in terms of both AU detection accuracy and correct emotion recognition rate.
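
The two-phase pipeline described above (group decisions of multiple AU detectors, then AU-to-emotion mapping) can be sketched as follows. Majority voting is only one possible group decision, and the toy AU-to-emotion rules are illustrative stand-ins for the paper's learned mapping.

```python
def fuse_au_detections(detector_outputs):
    """Majority vote over several AU detectors.
    detector_outputs: list of dicts mapping AU id -> bool (presence)."""
    fused = {}
    for au in detector_outputs[0]:
        votes = sum(1 for out in detector_outputs if out[au])
        fused[au] = votes * 2 > len(detector_outputs)
    return fused

# Toy AU-to-emotion rules (illustrative only; real mappings are learned or FACS-based)
EMOTION_RULES = {
    "happiness": {6, 12},      # cheek raiser + lip corner puller
    "surprise": {1, 2, 26},    # brow raisers + jaw drop
}

def map_to_emotion(fused_aus):
    """Pick the emotion whose required AUs are all present (largest rule wins)."""
    present = {au for au, on in fused_aus.items() if on}
    best, best_overlap = "neutral", 0
    for emotion, required in EMOTION_RULES.items():
        overlap = len(required & present)
        if overlap == len(required) and overlap > best_overlap:
            best, best_overlap = emotion, overlap
    return best

if __name__ == "__main__":
    outputs = [{1: False, 2: False, 6: True, 12: True, 26: False} for _ in range(3)]
    print(map_to_emotion(fuse_au_detections(outputs)))  # happiness
```

Fusing several detectors before the mapping step is what limits the error accumulation that the abstract attributes to a single-detector two-phase scheme.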

Motion Recognition of Worker Based on Frame Difference (프레임간 차를 기반으로 한 작업자의 동작인식)

  • 김형균;정기봉;오무송
    • Journal of the Korea Institute of Information and Communication Engineering / v.5 no.7 / pp.1280-1286 / 2001
  • In this study, we suggest a system that recognizes a worker's regular motions more effectively. First, frame differencing separates the moving object from the still background in video recordings of the worker's motions. Next, edge detection and estimation of the center of motion allow continuous motion to be recognized. The action recognition system designed in this research films the worker's actions with a fixed CCTV camera to overcome a problem of action recognition systems applied in existing industrial sites, minimizing the various mountings otherwise needed to obtain action information. In addition, by performing action recognition through inter-frame image subtraction and edge detection, the time required to extract the characteristics of the worker's body parts is shortened, so the system is designed as an efficient and inexpensive action recognition system.
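
A minimal sketch of the frame-difference and edge-detection pipeline described above, assuming OpenCV; the thresholds and the use of image moments to estimate the motion center are illustrative choices, not the paper's exact parameters.

```python
import cv2
import numpy as np

def motion_center_from_frames(prev_frame, curr_frame, diff_thresh=25):
    """Estimate the center of a worker's motion from two consecutive frames
    using inter-frame differencing followed by edge detection."""
    prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
    curr_gray = cv2.cvtColor(curr_frame, cv2.COLOR_BGR2GRAY)
    diff = cv2.absdiff(curr_gray, prev_gray)                   # frame difference
    _, mask = cv2.threshold(diff, diff_thresh, 255, cv2.THRESH_BINARY)
    edges = cv2.Canny(mask, 50, 150)                           # edges of moving region
    moments = cv2.moments(edges)
    if moments["m00"] == 0:
        return None                                            # no motion detected
    return (moments["m10"] / moments["m00"], moments["m01"] / moments["m00"])

if __name__ == "__main__":
    prev = np.zeros((120, 160, 3), dtype=np.uint8)
    curr = prev.copy()
    cv2.rectangle(curr, (40, 40), (80, 100), (255, 255, 255), -1)  # simulated movement
    print(motion_center_from_frames(prev, curr))
```

Tracking this center across consecutive frame pairs is what lets the continuous motion be followed without any body-mounted sensors.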
