• Title/Abstract/Keywords: video recognition

679 search results (processing time: 0.023 s)

Probabilistic Background Subtraction in a Video-based Recognition System

  • Lee, Hee-Sung;Hong, Sung-Jun;Kim, Eun-Tai
    • KSII Transactions on Internet and Information Systems (TIIS)
    • Vol. 5, No. 4 / pp.782-804 / 2011
  • In video-based recognition systems, stationary cameras are used to monitor an area of interest. These systems focus on segmenting the foreground in the video stream and recognizing the events occurring in that area. The usual approach to discriminating the foreground from the video sequence is background subtraction. This paper presents a novel background subtraction method based on a probabilistic approach. We represent the posterior probability of the foreground based on the current image and all past images and derive an update rule for it. Furthermore, we present an efficient method for fusing color and edge information in order to overcome the difficulties of existing background subtraction methods that use only color information. The suggested method is applied to synthetic data and real video streams, and its robust performance is demonstrated through experimentation.
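
As a rough illustration only of the probabilistic idea described in this abstract, the sketch below computes a per-pixel foreground posterior under a Gaussian background model and recursively updates that model. The uniform foreground likelihood, the function names, and all parameters are assumptions for illustration, not the authors' formulation (which also fuses edge information).

```python
import numpy as np

def foreground_posterior(frame, bg_mean, bg_var, prior_fg=0.1):
    """Per-pixel posterior P(foreground | pixel) for a grayscale frame.
    Assumes a Gaussian background model and a uniform foreground
    likelihood over 256 intensity levels (illustrative assumptions)."""
    p_bg = (np.exp(-0.5 * (frame - bg_mean) ** 2 / bg_var)
            / np.sqrt(2.0 * np.pi * bg_var))
    p_fg = 1.0 / 256.0                          # uniform foreground likelihood
    num = prior_fg * p_fg
    return num / (num + (1.0 - prior_fg) * p_bg)

def update_background(bg_mean, bg_var, frame, post_fg, lr=0.05):
    """Recursive background update, down-weighted where the foreground
    posterior is high so moving objects do not corrupt the model."""
    w = lr * (1.0 - post_fg)                    # learn faster on background
    bg_mean = (1.0 - w) * bg_mean + w * frame
    bg_var = (1.0 - w) * bg_var + w * (frame - bg_mean) ** 2
    return bg_mean, np.maximum(bg_var, 1e-4)    # keep variance positive
```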

360° 가상현실 동영상과 일반 동영상 교육 콘텐츠의 경험인식 비교 분석 (Comparison of experience recognition in 360° virtual reality videos and common videos)

  • 정은경;정지연
    • 한국응급구조학회지
    • Vol. 23, No. 3 / pp.145-154 / 2019
  • Purpose: This study simulated cardiac arrest situations in 360° virtual reality video clips and general video clips, and compared the correlations between the educational media and experience recognition. Methods: Experimental research was carried out with a randomly assigned control group (n=32) and experimental group (n=32) on March 20, 2019. Results: The group trained with the 360° virtual reality video clips had a higher experience-recognition score (p=.047) than the group trained with the general video clips. Moreover, the subfactors of experience recognition, including the sense of presence and vividness (p=.05), immersion (p<.05), experience (p<.01), fantasy factor (p<.05), and content satisfaction (p<.05), were positively correlated. Conclusion: Enhancing vividness and the sense of presence when developing virtual reality videos recorded with a 360° camera is thought to enable experience recognition without any direct interaction.

A New Residual Attention Network based on Attention Models for Human Action Recognition in Video

  • Kim, Jee-Hyun;Cho, Young-Im
    • 한국컴퓨터정보학회논문지
    • Vol. 25, No. 1 / pp.55-61 / 2020
  • Thanks to advances in deep learning and improvements in computing power, video-based research has recently attracted considerable attention. The biggest difference between video data and image data is that video contains a large amount of temporal and spatial information. Because of this, action recognition is an important task in computer vision research, but recognizing human actions in a moving environment such as video is a very complex and challenging problem. Based on studies of human perception, artificial intelligence research has found that a human-like attention mechanism makes an efficient recognition model, and such a model is well suited to processing both image information and complex sequential video information. Against this background, this paper introduces an attention mechanism into video action recognition, first attending to the human in order to recognize human actions in video efficiently. The main contribution is a new 3D residual attention network based on two attention mechanisms and convolutional neural networks for identifying human actions in video. In evaluation, the proposed model achieved an accuracy of up to about 90.7%.
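
The abstract does not give architecture details; the following is a minimal sketch of a generic 3D residual attention block in the spirit of residual attention networks, assuming the common (1 + mask) * trunk residual form. Layer choices and names are illustrative, not the authors' network.

```python
import torch.nn as nn

class ResidualAttention3D(nn.Module):
    """Generic 3D residual attention block: a trunk branch extracts
    spatiotemporal features and a mask branch produces attention weights
    in [0, 1]; the residual form (1 + mask) * trunk preserves the
    identity signal."""
    def __init__(self, channels):
        super().__init__()
        self.trunk = nn.Sequential(
            nn.Conv3d(channels, channels, 3, padding=1),
            nn.BatchNorm3d(channels), nn.ReLU(inplace=True),
            nn.Conv3d(channels, channels, 3, padding=1),
            nn.BatchNorm3d(channels), nn.ReLU(inplace=True),
        )
        self.mask = nn.Sequential(
            nn.Conv3d(channels, channels, 1),
            nn.Sigmoid(),                     # attention weights in [0, 1]
        )

    def forward(self, x):                     # x: (N, C, T, H, W)
        t = self.trunk(x)
        return (1.0 + self.mask(t)) * t       # residual attention
```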

Improved Inference for Human Attribute Recognition using Historical Video Frames

  • Ha, Hoang Van;Lee, Jong Weon;Park, Chun-Su
    • 반도체디스플레이기술학회지
    • Vol. 20, No. 3 / pp.120-124 / 2021
  • Recently, human attribute recognition (HAR) has attracted a lot of attention due to its wide application in video surveillance systems. Recent deep-learning-based solutions for HAR require time-consuming training processes. In this paper, we propose a post-processing technique that utilizes historical video frames to improve prediction results without re-training or modifying existing deep-learning-based classifiers. Experimental results on a large-scale benchmark dataset show the effectiveness of our proposed method.
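
The abstract leaves the post-processing rule unspecified; as one plausible way to use historical frames without retraining the classifier, the sketch below smooths per-frame attribute probabilities with an exponential moving average before thresholding. The EMA choice and all parameters are assumptions, not the paper's method.

```python
import numpy as np

def smooth_attribute_scores(frame_scores, alpha=0.7, threshold=0.5):
    """Exponential-moving-average smoothing of per-frame attribute
    probabilities over the frame history, then thresholding.
    frame_scores: (num_frames, num_attributes) values in [0, 1]."""
    frame_scores = np.asarray(frame_scores, dtype=float)
    smoothed = np.empty_like(frame_scores)
    state = frame_scores[0]
    for t, scores in enumerate(frame_scores):
        state = alpha * state + (1.0 - alpha) * scores  # blend in new frame
        smoothed[t] = state
    return smoothed, smoothed >= threshold              # probs, binary labels
```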

A Video Expression Recognition Method Based on Multi-mode Convolution Neural Network and Multiplicative Feature Fusion

  • Ren, Qun
    • Journal of Information Processing Systems
    • Vol. 17, No. 3 / pp.556-570 / 2021
  • Existing video expression recognition methods mainly focus on the spatial feature extraction of video expression images but tend to ignore the dynamic features of video sequences. To solve this problem, a multi-mode convolutional neural network method is proposed to effectively improve the performance of facial expression recognition in video. First, OpenFace 2.0 is used to detect face images in the video, and two deep convolutional neural networks are used to extract spatiotemporal expression features: a spatial convolutional neural network extracts the spatial information features of each static expression image, and the dynamic information features are extracted from the optical flow of multiple expression images by a temporal convolutional neural network. Then, the spatiotemporal features learned by the two networks are fused by multiplication. Finally, the fused features are input into a support vector machine for facial expression classification. Experimental results show that the recognition accuracy of the proposed method reaches 64.57% and 60.89% on the RML and BAUM-1s datasets, respectively, outperforming the compared methods.
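
As a minimal sketch of the fusion-and-classification stage described above, the code below fuses two equal-length feature vectors by element-wise multiplication and trains a support vector machine on the result; the feature dimensions, random stand-in data, and six-class setup are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC

def multiplicative_fusion(spatial_feat, temporal_feat):
    """Element-wise (Hadamard) product of two equal-length feature
    arrays; one common form of multiplicative feature fusion."""
    return spatial_feat * temporal_feat

# Hypothetical stand-ins for CNN features: one row per video clip.
rng = np.random.default_rng(0)
spatial = rng.normal(size=(100, 128))    # spatial-CNN features (assumed dim)
temporal = rng.normal(size=(100, 128))   # temporal/optical-flow features
labels = rng.integers(0, 6, size=100)    # six expression classes (assumed)

fused = multiplicative_fusion(spatial, temporal)
clf = SVC(kernel="rbf").fit(fused, labels)
```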

Multimodal Biometrics Recognition from Facial Video with Missing Modalities Using Deep Learning

  • Maity, Sayan;Abdel-Mottaleb, Mohamed;Asfour, Shihab S.
    • Journal of Information Processing Systems
    • Vol. 16, No. 1 / pp.6-29 / 2020
  • Biometric identification using multiple modalities has attracted the attention of many researchers as it produces more robust and trustworthy results than single-modality biometrics. In this paper, we present a novel multimodal recognition system that trains a deep learning network to automatically learn features after extracting multiple biometric modalities from a single data source, i.e., facial video clips. Utilizing the different modalities present in the facial video clips, i.e., left ear, left profile face, frontal face, right profile face, and right ear, we train supervised denoising auto-encoders to automatically extract robust and non-redundant features. The automatically learned features are then used to train modality-specific sparse classifiers to perform the multimodal recognition. Moreover, the proposed technique proved robust when some of the above modalities were missing during testing. The proposed system has three main components: detection, which consists of modality-specific detectors to automatically detect images of the different modalities present in facial video clips; feature selection, which uses a supervised denoising sparse auto-encoder network to capture discriminative representations that are robust to illumination and pose variations; and classification, which consists of a set of modality-specific sparse representation classifiers for unimodal recognition, followed by score-level fusion of the recognition results of the available modalities. Experiments conducted on the constrained facial video dataset (WVU) and the unconstrained facial video dataset (HONDA/UCSD) resulted in Rank-1 recognition rates of 99.17% and 97.14%, respectively. The multimodal recognition accuracy demonstrates the superiority and robustness of the proposed approach irrespective of the illumination, non-planar movement, and pose variations present in the video clips, even when modalities are missing.
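
The paper defines its own fusion rule; as a hedged sketch of score-level fusion that tolerates missing modalities, the code below simply averages the per-class scores of whichever modality classifiers produced an output. The dictionary keys, the mean rule, and the three-class scores are illustrative assumptions.

```python
import numpy as np

def fuse_scores(modality_scores):
    """Score-level fusion that tolerates missing modalities: average the
    per-class score vectors of whichever modalities are present (None
    marks a modality missing from the test clip)."""
    available = [s for s in modality_scores.values() if s is not None]
    if not available:
        raise ValueError("no modality available for fusion")
    return np.mean(available, axis=0)

# Hypothetical per-class scores from three of the five modality classifiers.
scores = {
    "frontal_face": np.array([0.70, 0.20, 0.10]),
    "left_ear": None,                        # not detected in this clip
    "right_profile_face": np.array([0.60, 0.30, 0.10]),
}
identity = int(np.argmax(fuse_scores(scores)))   # predicted subject index
```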

A Robust Approach for Human Activity Recognition Using 3-D Body Joint Motion Features with Deep Belief Network

  • Uddin, Md. Zia;Kim, Jaehyoun
    • KSII Transactions on Internet and Information Systems (TIIS)
    • Vol. 11, No. 2 / pp.1118-1133 / 2017
  • Computer vision-based human activity recognition (HAR) has attracted considerable attention in recent years due to its applications in various fields such as smart home healthcare for elderly people. A video-based activity recognition system typically aims to react to people's behavior so that the system can proactively assist them with their tasks. A novel approach is proposed in this work for depth-video-based human activity recognition using joint-based motion features of depth body shapes and a Deep Belief Network (DBN). From the depth video, the different body parts in each activity are first segmented by means of a trained random forest. Motion features representing the magnitude and direction of each joint's movement to the next frame are then extracted. Finally, the features are used to train a DBN for later recognition. The proposed HAR approach showed superior performance over conventional approaches on private and public datasets, indicating a promising approach for practical applications in smartly controlled environments.
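
As a small illustration of the joint motion features described above, the sketch below computes the per-joint motion magnitude and direction between two consecutive frames of 3-D joint positions; the exact feature encoding in the paper may differ.

```python
import numpy as np

def joint_motion_features(joints_t, joints_t1):
    """Motion magnitude and direction of each body joint between two
    consecutive frames. joints_t, joints_t1: (num_joints, 3) arrays of
    3-D joint positions."""
    delta = joints_t1 - joints_t
    magnitude = np.linalg.norm(delta, axis=1)                 # per-joint speed
    direction = delta / np.maximum(magnitude[:, None], 1e-8)  # unit vectors
    return np.concatenate([magnitude, direction.ravel()])     # feature vector
```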

Robust Video-Based Barcode Recognition via Online Sequential Filtering

  • Kim, Minyoung
    • International Journal of Fuzzy Logic and Intelligent Systems
    • Vol. 14, No. 1 / pp.8-16 / 2014
  • We consider the visual barcode recognition problem in a noisy video setup. Unlike most existing single-frame recognizers, which require considerable user effort to acquire clean, motionless, and blur-free barcode signals, we eliminate such extra human effort by proposing a robust video-based barcode recognition algorithm. We deal with a sequence of noisy, blurred barcode image frames by posing recognition as an online filtering problem. In the proposed dynamic recognition model, at each frame we infer the blur level of the frame as well as the digit class label. In contrast to a frame-by-frame approach with a heuristic majority-voting scheme, the class labels and frame-wise noise levels are propagated along the frame sequence in our model, and hence we exploit all cues from noisy frames that are potentially useful for predicting the barcode label in a probabilistically reasonable sense. We also suggest a visual barcode tracking approach that efficiently localizes barcode areas in video frames. The effectiveness of the proposed approaches is demonstrated empirically on both synthetic and real data setups.
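
The paper's filter also infers a per-frame blur level; the sketch below shows only the label-propagation part, a recursive Bayesian update of the digit-class posterior under an assumed nearly-static transition model. All names and the transition model are illustrative assumptions, not the paper's exact filter.

```python
import numpy as np

def propagate_label_posterior(prior, frame_likelihood, stay_prob=0.95):
    """One online-filtering step for the digit-class posterior.
    prior: (K,) posterior from the frames so far;
    frame_likelihood: (K,) likelihood of the current frame per class."""
    K = prior.size
    # Nearly static transition model: the true label does not change,
    # so almost all probability mass stays on the same class.
    trans = np.full((K, K), (1.0 - stay_prob) / (K - 1))
    np.fill_diagonal(trans, stay_prob)
    predicted = trans.T @ prior                  # predict step
    posterior = predicted * frame_likelihood     # measurement update
    return posterior / posterior.sum()           # renormalize
```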

비디오 얼굴인식을 위한 다중 손실 함수 기반 어텐션 심층신경망 학습 제안 (Attention Deep Neural Networks Learning based on Multiple Loss functions for Video Face Recognition)

  • 김경태;유원상;최재영
    • 한국멀티미디어학회논문지
    • Vol. 24, No. 10 / pp.1380-1390 / 2021
  • Video face recognition (FR) is one of the most popular research topics in the field of computer vision due to its variety of applications. In particular, research using the attention mechanism is being actively conducted. In video face recognition, attention indicates where to focus within the whole input or a specific region, or which frames to focus on when many frames are available. In this paper, we propose a novel attention-based deep learning method. The main novelties of our method are (1) the combined use of two loss functions, namely a weighted Softmax loss and a Triplet loss, and (2) end-to-end learning that includes both the feature embedding network and the attention weight computation. The feature embedding network benefits from the attention weight computation through the combined loss function and end-to-end learning. To demonstrate the effectiveness of our proposed method, extensive comparative experiments have been carried out on the IJB-A dataset with its standard evaluation protocols. Our proposed method achieved a better or comparable recognition rate compared to other state-of-the-art video FR methods.
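
As a minimal sketch of combining the two stated losses for end-to-end training, the code below sums a (class-)weighted softmax cross-entropy and a triplet margin loss; how the paper weights its Softmax loss is not given in the abstract, so the class-weight form and the scalar triplet weight are assumptions.

```python
import torch.nn.functional as F

def combined_loss(logits, labels, anchor, positive, negative,
                  class_weights=None, triplet_weight=0.5, margin=0.2):
    """Sum of a (class-)weighted softmax cross-entropy loss and a
    triplet margin loss for joint end-to-end training.
    logits: (B, C); anchor/positive/negative: (B, D) embeddings."""
    ce = F.cross_entropy(logits, labels, weight=class_weights)
    trip = F.triplet_margin_loss(anchor, positive, negative, margin=margin)
    return ce + triplet_weight * trip
```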

비대면 수업에 대한 치기공과 학습자 인식에 관한 연구 (A study on the dental technology student's recognition for non-face-to-face classes)

  • 최주영;정효경
    • 대한치과기공학회지
    • Vol. 42, No. 4 / pp.402-408 / 2020
  • Purpose: To understand students' recognition of online classes in the Department of Dental Technology and to provide basic data for designing online classes for the dental technology course. Methods: A survey was conducted among students of the dental technology department. The collected data were analyzed with the SPSS ver. 25.0 program at the α=0.05 significance level; the t-test and analysis of variance were performed. Results: The students' recognition of online classes in the Department of Dental Technology was reflected in the recognition rates for video-based classes in both the theory and the experiment courses. Students were highly positive about video-based learning because it allows repeated learning free of time constraints, and they rated it highly in terms of convenience, satisfaction, and learning achievement. Conclusion: Based on the results, video-based learning is a learning type that students view very positively. It is therefore recommended that, after COVID-19, the Department of Dental Technology offer blended classes that combine face-to-face instruction with video-based learning.