• Title/Abstract/Keyword: 3D Convolution


2D/3D 변환을 위한 Convolution filter (Convolution filter for 2D to 3D conversion)

  • 송혁;배진우;최병호;유지상
    • 한국방송∙미디어공학회:학술대회논문집 / 한국방송공학회 2006년도 학술대회 / pp.37-40 / 2006
  • 3DTV has emerged as the next-generation issue following analog TV and HDTV. However, since most content is acquired and stored in 2D, conversion of 2D content into 3D is essential. Standardization is in progress in MPEG and JVT, and domestic and international research institutes, universities, and industry are participating with interest. Although 2D/3D conversion has been studied for a long time, it has fallen short of expectations in practical applications. In this paper, we apply a convolution filter for 2D/3D conversion, implemented on an FPGA and coded in VHDL. To generate the left and right views, the convolution filter warps the image into a left/right pair. By using the filter, the degree of image warping can be varied according to the user's position or preference, changing the resulting 3D effect.
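
As an illustration of the warping idea (not the paper's FPGA/VHDL implementation), the sketch below shifts a grayscale image with mirrored single-tap convolution kernels to produce a left/right pair; the kernel size and shift direction are assumptions.

```python
import numpy as np
from scipy.ndimage import convolve

def shift_kernel(disparity, size=9):
    # 1xN kernel whose single nonzero tap shifts the image
    # horizontally by roughly `disparity` pixels
    kernel = np.zeros((1, size))
    kernel[0, size // 2 - disparity] = 1.0
    return kernel

def stereo_pair(gray, disparity=2):
    # Warp one 2D image into a left/right pair; a larger disparity
    # gives a stronger (more distorted) 3D effect.
    left = convolve(gray, shift_kernel(+disparity), mode='nearest')
    right = convolve(gray, shift_kernel(-disparity), mode='nearest')
    return left, right
```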

CERTAIN COMBINATORIC CONVOLUTION SUMS AND THEIR RELATIONS TO BERNOULLI AND EULER POLYNOMIALS

  • Kim, Daeyeoul;Bayad, Abdelmejid;Ikikardes, Nazli Yildiz
    • 대한수학회지 / Vol. 52 No. 3 / pp.537-565 / 2015
  • In this paper, we give relationships between Bernoulli-Euler polynomials and convolution sums of divisor functions. First, we establish two explicit formulas for certain combinatoric convolution sums of divisor functions derived from Bernoulli and Euler polynomials. Second, as applications, we show five identities concerning the third- and fourth-order convolution sums of divisor functions, expressed in terms of divisor functions and linear combinations of Bernoulli or Euler polynomials.
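
For background (not one of the paper's new results), the classical Besge identity below illustrates the type of convolution sum of divisor functions involved; here σ_k(n) denotes the sum of the k-th powers of the positive divisors of n.

```latex
\sum_{m=1}^{n-1} \sigma_1(m)\,\sigma_1(n-m)
  = \frac{5}{12}\,\sigma_3(n) + \Bigl(\frac{1}{12} - \frac{n}{2}\Bigr)\sigma_1(n)
```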

Human Action Recognition Based on 3D Convolutional Neural Network from Hybrid Feature

  • Wu, Tingting;Lee, Eung-Joo
    • 한국멀티미디어학회논문지 / Vol. 22 No. 12 / pp.1457-1465 / 2019
  • 3D convolution stacks multiple consecutive frames into a cube and then applies a 3D convolution kernel within that cube. In this structure, each feature map of the convolutional layer is connected to multiple adjacent frames in the previous layer, thus capturing motion information. However, because pedestrian posture, motion, and position change over time, applying the same convolution at the same location is inappropriate, and when the 3D convolution kernel is convolved along the time domain, it extracts temporal features from only three consecutive frames, which is not sufficient to capture the full action. This paper proposes an action recognition method based on feature fusion with a 3D convolutional neural network. Pre-computed optical flow images are fed into a VGG16-based network to learn temporal features, which are then fused with the features extracted by the 3D convolutional neural network. Finally, behavior classification is performed by an SVM classifier.
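
A minimal PyTorch sketch of the frame-cube operation described above (not the paper's full fusion pipeline); the channel counts and clip size are assumptions:

```python
import torch
import torch.nn as nn

# The 3x3x3 kernel spans 3 consecutive frames and a 3x3 spatial window,
# so each output feature map mixes motion information across frames.
conv3d = nn.Conv3d(in_channels=1, out_channels=16,
                   kernel_size=(3, 3, 3), padding=(1, 1, 1))

clip = torch.randn(1, 1, 16, 112, 112)  # (batch, channel, frames, H, W)
features = conv3d(clip)                 # -> (1, 16, 16, 112, 112)
```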

딥러닝 기반 3차원 라이다의 반사율 세기 신호를 이용한 흑백 영상 생성 기법 (Deep Learning Based Gray Image Generation from 3D LiDAR Reflection Intensity)

  • 김현구;유국열;박주현;정호열
    • 대한임베디드공학회논문지 / Vol. 14 No. 1 / pp.1-9 / 2019
  • In this paper, we propose a method of generating a 2D gray image from LiDAR 3D reflection intensity. The proposed method uses a fully convolutional network (FCN) to generate the gray image from the 2D reflection intensity projected from the LiDAR 3D intensity. Both the encoder and the decoder of the FCN are configured with several convolution blocks in a symmetric fashion. Each convolution block consists of a convolution layer with a 3×3 filter, a batch normalization layer, and an activation function. The performance of the proposed architecture is empirically evaluated by varying the depth of the convolution blocks. The well-known KITTI data set, covering various scenarios, is used for training and performance evaluation. The simulation results show that the proposed method yields improvements of 8.56 dB in peak signal-to-noise ratio and 0.33 in structural similarity index measure over conventional interpolation methods such as inverse distance weighting and nearest neighbor. The proposed method could serve as an assistance tool in night-time driving systems for autonomous vehicles.
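
A minimal sketch of one such convolution block in PyTorch, assuming ReLU as the activation and illustrative channel widths and depths:

```python
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # 3x3 convolution -> batch normalization -> activation,
    # as described for each encoder/decoder block
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

# Symmetric encoder/decoder built from the blocks (depths are assumptions)
encoder = nn.Sequential(conv_block(1, 32), conv_block(32, 64))
decoder = nn.Sequential(conv_block(64, 32), conv_block(32, 1))
```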

Decomposed "Spatial and Temporal" Convolution for Human Action Recognition in Videos

  • Sediqi, Khwaja Monib;Lee, Hyo Jong
    • 한국정보처리학회:학술대회논문집 / 한국정보처리학회 2019년도 춘계학술발표대회 / pp.455-457 / 2019
  • In this paper we study the effect of decomposed spatiotemporal convolutions for action recognition in videos. Our motivation emerges from the empirical observation that spatial convolution applied to individual frames of a video already provides good performance in action recognition. We empirically show the accuracy of factorized convolution on individual video frames for action classification. We take 3D ResNet-18 as the baseline model for our experiments and factorize its 3D convolutions into 2D (spatial) and 1D (temporal) convolutions. We train the model from scratch on the Kinetics video dataset, then fine-tune it on the UCF-101 dataset and evaluate its performance. Our results show accuracy comparable to that of state-of-the-art algorithms on the Kinetics and UCF-101 datasets.
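
A sketch of the factorization in PyTorch: one 3x3x3 convolution decomposed into a 1x3x3 spatial convolution followed by a 3x1x1 temporal one (the intermediate nonlinearity and channel widths are assumptions, not taken from the paper):

```python
import torch.nn as nn

class SpatioTemporalConv(nn.Module):
    # Replace one 3D convolution with a 2D (spatial) convolution
    # followed by a 1D (temporal) convolution.
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.spatial = nn.Conv3d(in_ch, out_ch, kernel_size=(1, 3, 3),
                                 padding=(0, 1, 1))
        self.temporal = nn.Conv3d(out_ch, out_ch, kernel_size=(3, 1, 1),
                                  padding=(1, 0, 0))
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):  # x: (batch, channel, frames, H, W)
        return self.temporal(self.relu(self.spatial(x)))
```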

균형적인 신체활동을 위한 맞춤형 AI 운동 추천 서비스 (Customized AI Exercise Recommendation Service for the Balanced Physical Activity)

  • 김창민;이우범
    • 융합신호처리학회논문지 / Vol. 23 No. 4 / pp.234-240 / 2022
  • This paper proposes a customized AI exercise recommendation service that takes into account the relative amount of physical activity in different occupational working environments. Based on the WISDM database, in which data collected from accelerometer and gyroscope sensors are classified into 18 everyday physical activities, the activities are grouped into three categories (whole body, lower body, upper body), and appropriate exercises are recommended from the recognized activity profile. The 1D convolutional neural network (1D CNN) model used for physical activity classification employs a convolution block in which multiple 1D convolution layers with different kernel sizes are connected in parallel. By applying multi-kernel 1D convolutions to a single input, the convolution block can effectively extract the fine-grained local features of the input pattern, which would otherwise require a deep network, with fewer layers. In a comparative evaluation against a recurrent neural network (RNN) model, the proposed model achieved a notable accuracy of 98.4%.
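
A minimal PyTorch sketch of such a parallel multi-kernel block (the kernel sizes and per-branch channel counts are assumptions):

```python
import torch
import torch.nn as nn

class MultiKernel1DBlock(nn.Module):
    # Several 1D convolutions with different kernel sizes applied
    # in parallel to the same input, with their outputs concatenated.
    def __init__(self, in_ch, out_ch_per_branch, kernel_sizes=(3, 5, 7)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv1d(in_ch, out_ch_per_branch, k, padding=k // 2)
            for k in kernel_sizes
        ])

    def forward(self, x):  # x: (batch, channels, time)
        return torch.cat([branch(x) for branch in self.branches], dim=1)
```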

A Proposal of Shuffle Graph Convolutional Network for Skeleton-based Action Recognition

  • Jang, Sungjun;Bae, Han Byeol;Lee, HeanSung;Lee, Sangyoun
    • 한국정보전자통신기술학회논문지 / Vol. 14 No. 4 / pp.314-322 / 2021
  • Skeleton-based action recognition has attracted considerable attention in human action recognition. Recent methods for skeleton-based action recognition employ spatiotemporal graph convolutional networks (GCNs) and achieve remarkable performance. However, most of them incur heavy computational complexity for robust action recognition. To solve this problem, we propose a shuffle graph convolutional network (SGCN), a lightweight graph convolutional network that uses pointwise group convolution rather than pointwise convolution to reduce computational cost. Our SGCN is composed of a spatial and a temporal GCN. The spatial shuffle GCN contains pointwise group convolution and a part shuffle module that enhances local and global information exchange between correlated joints. In addition, the temporal shuffle GCN contains depthwise convolution to maintain a large receptive field. Our model achieves comparable performance at the lowest computational cost and exceeds the baseline by 0.3% and 1.2% on the NTU RGB+D and NTU RGB+D 120 datasets, respectively.
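
A sketch of the two cost-saving ingredients named above, pointwise group convolution plus a channel shuffle (in the style of ShuffleNet; the group count and tensor shapes are assumptions):

```python
import torch
import torch.nn as nn

def channel_shuffle(x, groups):
    # Permute channels so information can flow between the groups
    # of the preceding pointwise group convolution.
    b, c, *rest = x.shape
    x = x.view(b, groups, c // groups, *rest)
    return x.transpose(1, 2).reshape(b, c, *rest)

# Pointwise *group* convolution: each group mixes only its own
# subset of channels, cutting the cost of a full 1x1 convolution.
pointwise_group = nn.Conv2d(64, 64, kernel_size=1, groups=4)

x = torch.randn(2, 64, 25, 30)   # e.g. (batch, channels, frames, joints)
y = channel_shuffle(pointwise_group(x), groups=4)
```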

딥 러닝 기반 얼굴 메쉬 데이터 디노이징 시스템 (A Deep Learning-Based Face Mesh Data Denoising System)

  • 노지현;임현승;김종민
    • 전기전자학회논문지 / Vol. 23 No. 4 / pp.1250-1256 / 2019
  • Real-world 3D mesh data can easily be generated with 3D printers or depth cameras, but the generated data inevitably contains unwanted noise. Mesh denoising is therefore essential to obtain clean 3D mesh data. However, existing mathematical denoising methods require preprocessing and tend to erase some important features of the 3D mesh. To address this problem, this paper introduces a deep learning-based 3D mesh denoising technique. Specifically, we propose a convolution-based autoencoder model consisting of an encoder and a decoder. The convolution operation applied to the mesh data performs denoising by considering the relationship between each vertex of the mesh and its neighboring vertices, and after each convolution a sampling operation is performed to speed up training. Experimental results confirm that the proposed autoencoder model produces denoised data faster and with higher quality than existing methods.
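
A minimal sketch of a vertex-neighborhood convolution of the kind described, combining each vertex's feature with the mean of its neighbors' features (a dense-adjacency toy version, not the paper's exact operator):

```python
import torch
import torch.nn as nn

class VertexConv(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.w_self = nn.Linear(in_ch, out_ch)    # weight for the vertex itself
        self.w_neigh = nn.Linear(in_ch, out_ch)   # weight for its neighborhood

    def forward(self, x, adj):
        # x: (num_vertices, in_ch); adj: (num_vertices, num_vertices) 0/1 mask
        deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
        neigh_mean = adj @ x / deg                 # mean over adjacent vertices
        return torch.relu(self.w_self(x) + self.w_neigh(neigh_mean))
```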

Improved Sliding Shapes for Instance Segmentation of Amodal 3D Object

  • Lin, Jinhua;Yao, Yu;Wang, Yanjie
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 12 No. 11 / pp.5555-5567 / 2018
  • State-of-the-art instance segmentation networks are successful at generating 2D segmentation masks for the region proposals with the highest classification scores, yet the 3D object segmentation task has been limited to geocentric embedding or the Sliding Shapes detector. To this end, we propose an amodal 3D instance segmentation network called A3IS-CNN, which extends the Deep Sliding Shapes detector to amodal 3D instance segmentation by adding a new 3D ConvNet branch called the A3IS-branch. The A3IS-branch, which takes a 3D amodal ROI as input and outputs 3D semantic instances, is a fully convolutional network (FCN) sharing convolutional layers with the existing 3D RPN, which takes a 3D scene as input and outputs 3D amodal proposals. Because the two branches share computation, our 3D instance segmentation network adds an overhead of only 0.25 fps to Deep Sliding Shapes while providing both accurate detection and point-to-point segmentation of instances. Experiments show that our 3D instance segmentation network achieves at least a 10% to 50% improvement in running time over state-of-the-art networks, and outperforms state-of-the-art 3D detectors by at least 16.1 AP.
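
A structural sketch of the shared-computation design described above: one 3D convolutional trunk feeding both the proposal branch and the instance-segmentation branch (all layer sizes are illustrative, not taken from the paper):

```python
import torch.nn as nn

class TwoBranch3DNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.trunk = nn.Sequential(               # shared 3D conv layers
            nn.Conv3d(1, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv3d(32, 64, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.proposal_head = nn.Conv3d(64, 2, 1)  # amodal proposal scores
        self.segment_head = nn.Conv3d(64, 2, 1)   # per-voxel instance mask

    def forward(self, scene):                     # scene: (B, 1, D, H, W)
        f = self.trunk(scene)
        return self.proposal_head(f), self.segment_head(f)
```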