• Title/Abstract/Keyword: video fusion

Search results: 116

Video Expression Recognition Method Based on Spatiotemporal Recurrent Neural Network and Feature Fusion

  • Zhou, Xuan
    • Journal of Information Processing Systems
    • /
    • Vol. 17, No. 2
    • /
    • pp.337-351
    • /
    • 2021
  • Automatically recognizing facial expressions in video sequences is challenging because there is little direct correlation between facial features and subjective emotions in video. To overcome this problem, a video facial expression recognition method using a spatiotemporal recurrent neural network and feature fusion is proposed. First, the video is preprocessed, and a double-layer cascade structure is used to detect the face in each video frame. Two deep convolutional neural networks then extract the temporal and spatial facial features: a spatial convolutional neural network extracts spatial information from each static expression frame, and a temporal convolutional neural network extracts dynamic information from the optical flow across multiple frames. Multiplicative fusion is applied to the spatiotemporal features learned by the two networks. Finally, the fused features are fed to a support vector machine to perform the expression classification. Experimental results on the eNTERFACE, RML, and AFEW6.0 datasets show recognition rates of 88.67%, 70.32%, and 63.84%, respectively, and comparative experiments show that the proposed method achieves higher recognition accuracy than other recently reported methods.
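
The fusion step above reduces to an element-wise product of the two networks' feature vectors, followed by an SVM. A minimal sketch of that idea, using random stand-in features and scikit-learn's SVC (the dimensions and classifier settings are illustrative assumptions, not the authors' implementation):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Stand-ins for features from the two CNNs: one spatial vector per clip
# (static frames) and one temporal vector (optical flow). Dimensions are
# illustrative; the paper does not fix them here.
n_clips, feat_dim, n_classes = 200, 128, 6
spatial_feats = rng.normal(size=(n_clips, feat_dim))
temporal_feats = rng.normal(size=(n_clips, feat_dim))
labels = rng.integers(0, n_classes, size=n_clips)

# Multiplicative fusion: element-wise product of the two feature vectors.
fused = spatial_feats * temporal_feats

# The fused features are classified with a support vector machine.
clf = SVC(kernel="rbf").fit(fused, labels)
print("train accuracy:", clf.score(fused, labels))
```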

Gradient Fusion Method for Night Video Enhancement

  • Rao, Yunbo;Zhang, Yuhong;Gou, Jianping
    • ETRI Journal
    • /
    • Vol. 35, No. 5
    • /
    • pp.923-926
    • /
    • 2013
  • To address the problem of enhancing nighttime video, a novel gradient-domain fusion method is proposed in which gradient-domain frames of the daytime background are fused with nighttime video frames. To verify the superiority of the proposed method, it is compared with conventional techniques; the output of our method is shown to offer enhanced visual quality.
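
Gradient-domain fusion of this kind mixes the gradient field of the daytime background with that of the nighttime frame before reconstructing an image. A rough sketch of the gradient-mixing step, using a per-pixel strongest-gradient rule as a stand-in for the paper's actual fusion rule:

```python
import numpy as np

def fuse_gradients(day_bg, night_frame):
    """Mix gradient fields of a daytime background and a nighttime frame.

    Both inputs are float grayscale arrays of the same shape. Keeping the
    stronger gradient at each pixel is only a stand-in for the paper's rule.
    """
    gy_d, gx_d = np.gradient(day_bg)      # np.gradient returns (d/dy, d/dx)
    gy_n, gx_n = np.gradient(night_frame)
    take_day = np.hypot(gx_d, gy_d) > np.hypot(gx_n, gy_n)
    gx = np.where(take_day, gx_d, gx_n)
    gy = np.where(take_day, gy_d, gy_n)
    return gx, gy  # a Poisson solver would reconstruct the frame from these

day = np.random.rand(480, 640).astype(np.float32)
night = (0.2 * np.random.rand(480, 640)).astype(np.float32)
gx, gy = fuse_gradients(day, night)
print(gx.shape, gy.shape)
```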

Open Standard Based 3D Urban Visualization and Video Fusion

  • Enkhbaatar, Lkhagva;Kim, Seong-Sam;Sohn, Hong-Gyoo
    • 한국측량학회지
    • /
    • Vol. 28, No. 4
    • /
    • pp.403-411
    • /
    • 2010
  • This research demonstrates 3D virtual visualization of an urban environment and video fusion for an effective damage-prevention and surveillance system using open standards. We present a visualization and interaction simulation method that increases situational awareness and improves environmental monitoring by combining CCTV video with a 3D virtual environment. A new camera prototype was designed, based on the camera frustum view model, to project recorded video in perspective onto the virtual 3D environment. The demonstration was developed in X3D, a royalty-free open standard and run-time architecture that offers the ability to represent, control, and share 3D spatial information via web browsers.
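
The frustum-based projection is standard pinhole geometry: the camera's intrinsics and pose define a frustum, and each video frame is texture-mapped onto a quad inside it. A small sketch of back-projecting the image corners to bound that quad (all numbers are illustrative; the paper's X3D scene graph itself is not reproduced):

```python
import numpy as np

# Illustrative pinhole intrinsics for a 640x480 CCTV camera.
fx = fy = 500.0
cx, cy = 320.0, 240.0
K = np.array([[fx, 0, cx], [0, fy, cy], [0, 0, 1.0]])

def frustum_corners(K, depth):
    """Back-project the four image corners to a plane `depth` metres away
    (camera coordinates). These corners bound the quad onto which a video
    frame would be texture-mapped in the virtual scene."""
    K_inv = np.linalg.inv(K)
    corners_px = np.array([[0, 0, 1], [640, 0, 1],
                           [640, 480, 1], [0, 480, 1]], float)
    rays = corners_px @ K_inv.T          # direction of each corner ray
    return rays / rays[:, 2:3] * depth   # scale rays to the given depth

print(frustum_corners(K, depth=20.0))
```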

Anterior Cervical Discectomy and Fusion YouTube Videos as a Source of Patient Education

  • Ovenden, Christopher Dillon;Brooks, Francis Michael
    • Asian Spine Journal
    • /
    • Vol. 12, No. 6
    • /
    • pp.987-991
    • /
    • 2018
  • Study Design: Cross-sectional study. Purpose: To assess the quality of anterior cervical discectomy and fusion (ACDF) videos available on YouTube and to identify factors associated with video quality. Overview of Literature: Patients commonly use the internet as a source of information regarding their surgeries; however, there is currently limited information regarding the quality of online videos about ACDF. Methods: A search was performed on YouTube using the phrase 'anterior cervical discectomy and fusion.' The Journal of the American Medical Association (JAMA), DISCERN, and Health on the Net (HON) systems were used to rate the first 50 videos obtained. Information about each video was collected, including number of views, time since the video was posted, percentage positivity (the number of likes divided by the total number of likes and dislikes), number of comments, and the author of the video. Relationships between video quality and these factors were investigated. Results: The average number of views per video was 96,239. The most common videos were those published by surgeons and those containing patient testimonies. Overall, video quality was poor, with mean scores of 1.78/5 on the DISCERN criteria, 1.63/4 on the JAMA criteria, and 1.96/8 on the HON criteria. Surgeons' videos scored higher than patient-testimony videos on the HON and JAMA systems; no other factors were found to be associated with video quality. Conclusions: The quality of ACDF videos on YouTube is low, with the majority produced by unreliable sources; these videos should therefore not be recommended as patient education tools for ACDF.

Dual-Stream Fusion and Graph Convolutional Network for Skeleton-Based Action Recognition

  • Hu, Zeyuan;Feng, Yiran;Lee, Eung-Joo
    • 한국멀티미디어학회논문지
    • /
    • Vol. 24, No. 3
    • /
    • pp.423-430
    • /
    • 2021
  • Graph convolutional networks (GCNs) have achieved outstanding performance on skeleton-based action recognition. However, several problems remain in existing GCN-based methods; in particular, the low recognition rate caused by relying on a single input modality has not been effectively addressed. In this article, we propose a dual-stream fusion method that combines video data and skeleton data. Two networks recognize actions from the skeleton data and the video data respectively, and the output probabilities of the two are fused to achieve information fusion. Experiments on two large datasets, Kinetics and NTU-RGB+D, show that the proposed method achieves state-of-the-art results, improving recognition accuracy over traditional methods.
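
The fusion here is at the score level: each stream outputs class probabilities and the two are combined. A toy sketch of that combination, assuming a weighted average (the abstract states only that the two output probabilities are fused, so the rule and weight below are assumptions):

```python
import numpy as np

def fuse_streams(p_skeleton, p_video, alpha=0.5):
    """Score-level fusion of the two streams' class probabilities.
    A weighted average is one common choice; `alpha` is an assumption."""
    fused = alpha * p_skeleton + (1.0 - alpha) * p_video
    return fused.argmax(axis=1)

# Toy softmax outputs for 4 clips over 5 action classes.
rng = np.random.default_rng(1)
p_skel = rng.dirichlet(np.ones(5), size=4)
p_vid = rng.dirichlet(np.ones(5), size=4)
print(fuse_streams(p_skel, p_vid))
```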

The Sensory-Motor Fusion System for Object Tracking

  • 이상희;위재우;이종호
    • 대한전기학회논문지:시스템및제어부문D
    • /
    • Vol. 52, No. 3
    • /
    • pp.181-187
    • /
    • 2003
  • For moving platforms with environmental sensors, such as an object-tracking mobile robot with audio and video sensors, the sensed information keeps changing as objects move. In such cases, conventional control schemes show limited performance due to lack of adaptability and system complexity, so sensory-motor systems that respond directly to various types of environmental information are desirable. To improve robustness, it is also desirable to fuse two or more types of sensory information simultaneously. In this paper, based on Braitenberg's model, we propose a sensory-motor fusion system that can track moving objects adaptively under environmental changes. With its direct connection structure, the sensory-motor fusion system controls each motor simultaneously, and neural networks are used to fuse information from the various sensors. Even if one sensor provides noisy information, the system still works robustly, because information from the other sensors compensates for the noise through sensor fusion. To examine its performance, the sensory-motor fusion model is applied to an object-tracking four-legged robot equipped with audio and video sensors. Experimental results show that the system tracks moving objects robustly with a simpler control mechanism than model-based approaches.
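
In a Braitenberg-style design, sensors drive motors through direct, possibly crossed, connections; here a small network plays that role while fusing the audio and video inputs. A toy sketch, with the weights and the two-wheel robot model assumed for illustration:

```python
import numpy as np

def motor_commands(audio, video, W):
    """Map fused sensor readings directly to wheel speeds.

    audio, video: left/right sensor intensities in [0, 1].
    W: 2x4 weight matrix standing in for the trained fusion network.
    Crossed excitatory connections make the robot turn toward the source.
    """
    x = np.concatenate([audio, video])   # sensor fusion by stacking inputs
    return np.tanh(W @ x)                # left/right wheel speeds

# Illustrative crossed weights: right-side sensors drive the left wheel.
W = np.array([[0.1, 0.9, 0.1, 0.9],
              [0.9, 0.1, 0.9, 0.1]])
print(motor_commands(audio=np.array([0.2, 0.8]),
                     video=np.array([0.1, 0.9]), W=W))
```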

A Programmable Multi-Format Video Decoder

  • 김재현;박구만
    • 방송공학회논문지
    • /
    • Vol. 20, No. 6
    • /
    • pp.963-966
    • /
    • 2015
  • This paper proposes a programmable multi-format video decoder (MFD) that can handle various video compression standards, including the latest standard, HEVC (High Efficiency Video Coding). The proposed MFD targets the high-performance FHD (Full High Definition) video decoder required for a DTV (digital television) SoC (System on Chip). To support diverse compression standards and their heavy computational demands, the proposed platform uses a hybrid architecture combining a reconfigurable processor with hardware accelerators. Experimental results confirm that HEVC-compressed FHD video at 30 frames per second can be decoded at 300 MHz.
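
A multi-format decoder along these lines probes the bitstream and dispatches it to a codec-specific path, with some stages running on the reconfigurable processor and others on fixed-function accelerators. A purely illustrative software view of the dispatch layer (the codec handlers are placeholders, not the chip's firmware):

```python
# Illustrative software view of a multi-format decoder front end: the real
# design runs these paths on a reconfigurable processor plus accelerators.
DECODERS = {}

def register(codec):
    def wrap(fn):
        DECODERS[codec] = fn
        return fn
    return wrap

@register("hevc")
def decode_hevc(bitstream: bytes) -> str:
    return f"HEVC path: {len(bitstream)} bytes"   # stand-in for real decoding

@register("h264")
def decode_h264(bitstream: bytes) -> str:
    return f"H.264 path: {len(bitstream)} bytes"

def decode(codec: str, bitstream: bytes) -> str:
    try:
        return DECODERS[codec](bitstream)
    except KeyError:
        raise ValueError(f"unsupported codec: {codec}") from None

print(decode("hevc", b"\x00\x00\x01" * 10))
```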

Using the Fusion of Spatial and Temporal Features for Malicious Video Classification

  • 전재현;김세민;한승완;노용만
    • 정보처리학회논문지B
    • /
    • Vol. 18B, No. 6
    • /
    • pp.365-374
    • /
    • 2011
  • With the recent diversification of content distribution channels such as the Internet, IPTV/smart TV, and social networks, the demand for research on classifying and blocking malicious videos is growing, yet studies that assess the harmfulness of video remain scarce. Existing studies on malicious image classification use spatial features such as the proportion of skin regions in an image or Bag of Visual Words (BoVW). In video, however, temporal features such as motion periodicity and temporal correlation can be exploited in addition to spatial features. Previous malicious-video classification studies use only one of these feature types, or simply combine the two at the decision level; in general, decision-level data fusion does not perform as well as feature-level data fusion. This paper proposes a method that classifies malicious video by fusing the spatial and temporal features used in previous work at the feature level. Experiments show how classification performance changes as more features are used and as the fusion method varies: using spatial features alone yields 92.25% classification accuracy, whereas adding the motion-periodicity feature with feature-level fusion improves accuracy to 96%.
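
The paper's distinction is where fusion happens: feature-level fusion concatenates the spatial and temporal descriptors before a single classifier, while decision-level fusion combines separate classifiers' scores. A schematic comparison on synthetic data (the features, labels, and SVM classifier are illustrative stand-ins):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
n = 300
spatial = rng.normal(size=(n, 32))    # e.g., skin-ratio / BoVW-style features
temporal = rng.normal(size=(n, 16))   # e.g., motion-periodicity features
y = rng.integers(0, 2, size=n)        # harmful vs. benign labels

# Feature-level fusion: concatenate, then train one classifier.
feat_clf = SVC(probability=True).fit(np.hstack([spatial, temporal]), y)

# Decision-level fusion: one classifier per feature type, average the scores.
s_clf = SVC(probability=True).fit(spatial, y)
t_clf = SVC(probability=True).fit(temporal, y)
dec_scores = 0.5 * (s_clf.predict_proba(spatial) + t_clf.predict_proba(temporal))

print("feature-level train acc:",
      feat_clf.score(np.hstack([spatial, temporal]), y))
print("decision-level train acc:", (dec_scores.argmax(1) == y).mean())
```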

CNN-based Visual/Auditory Feature Fusion Method with Frame Selection for Classifying Video Events

  • Choe, Giseok;Lee, Seungbin;Nang, Jongho
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • Vol. 13, No. 3
    • /
    • pp.1689-1701
    • /
    • 2019
  • In recent years, personal videos have been shared online owing to the popularity of portable devices such as smartphones and action cameras. A recent report predicted that 80% of Internet traffic will be video content by the year 2021. Several studies have been conducted on detecting the main events in videos in order to manage video at large scale, and they show fairly good performance in certain genres. However, the methods used in previous studies have difficulty detecting events in personal videos, because the characteristics and genres of personal videos vary widely. We found that adding a dataset with the right perspective improved performance, and that performance also depends on how keyframes are extracted from the video. We therefore selected frame segments that can represent a video, considering the characteristics of personal video. From each frame segment, object, location, food, and audio features were extracted, and representative vectors were generated through a CNN-based recurrent model and a fusion module. The proposed method achieved 78.4% mAP in experiments on the LSVC data.
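
The pipeline sketched in the abstract is: select representative frame segments, extract per-segment features, and fuse them into one clip-level vector. A minimal sketch using uniform segment sampling and mean pooling as stand-ins for the paper's selection rule and CNN-based recurrent fusion module:

```python
import numpy as np

def select_segments(n_frames, n_segments=5, seg_len=8):
    """Uniformly spaced frame segments; the paper's selection rule differs."""
    starts = np.linspace(0, n_frames - seg_len, n_segments).astype(int)
    return [(s, s + seg_len) for s in starts]

def clip_vector(frame_feats, segments):
    """Average features within each segment, then across segments, as a
    simple stand-in for the recurrent model and fusion module."""
    seg_vecs = [frame_feats[a:b].mean(axis=0) for a, b in segments]
    return np.mean(seg_vecs, axis=0)

rng = np.random.default_rng(3)
frame_feats = rng.normal(size=(240, 512))   # one 512-D feature per frame
segments = select_segments(len(frame_feats))
print(clip_vector(frame_feats, segments).shape)
```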

A Study for Improved Human Action Recognition using Multi-classifiers

  • 김세민;노용만
    • 방송공학회논문지
    • /
    • Vol. 19, No. 2
    • /
    • pp.166-173
    • /
    • 2014
  • Human action recognition has recently been studied extensively across broadcasting and video applications. Because video can take many forms, feature-point-based approaches have drawn more attention in real user environments than template methods, which are useful mainly in constrained settings. Feature-point-based methods detect points where motion occurs and form 3D patches around them; motion is then represented with histogram-based descriptors, and a learned classifier recognizes the actions present in the video. However, a single classifier struggles to recognize diverse actions, so recent work in image classification and object detection has employed multiple classifiers to address this. This paper therefore proposes a decision-level fusion method for action recognition that combines a support vector machine with sparse representation. The proposed method extracts feature-point-based descriptors from the video, obtains a decision from each classifier, and fuses the decisions using weights learned during training to produce the final result. In our experiments, the proposed method achieved higher action recognition accuracy than existing fusion methods.
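
Concretely, decision-level fusion here means each classifier emits per-class scores and learned weights blend them. A toy sketch with a fixed weight and random scores standing in for the SVM and sparse-representation outputs:

```python
import numpy as np

def fuse_decisions(svm_scores, src_scores, w=0.6):
    """Weighted decision-level fusion of two classifiers' per-class scores.
    `w` stands in for the weight learned during training."""
    return (w * svm_scores + (1.0 - w) * src_scores).argmax(axis=1)

rng = np.random.default_rng(4)
svm_scores = rng.random((10, 8))   # 10 clips, 8 action classes
src_scores = rng.random((10, 8))   # sparse-representation classifier scores
print(fuse_decisions(svm_scores, src_scores))
```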