
Sediqi, Khwaja Monib;Lee, Hyo Jong;

doi:10.3745/PKIPS.y2019m05a.455

Proceedings of the Korea Information Processing Society Conference (한국정보처리학회:학술대회논문집)

2019.05a / Pages 455-457 / 2019 / 2005-0011 (pISSN) / 2671-7298 (eISSN)

Korea Information Processing Society (한국정보처리학회)


Decomposed "Spatial and Temporal" Convolution for Human Action Recognition in Videos

Sediqi, Khwaja Monib (Dept. of Computer Science & Engineering, Chonbuk National University);
Lee, Hyo Jong (Dept. of Computer Science & Engineering, Chonbuk National University)

Published : 2019.05.10

https://doi.org/10.3745/PKIPS.y2019m05a.455

Abstract

In this paper, we study the effect of decomposed spatiotemporal convolutions on action recognition in videos. Our motivation comes from the empirical observation that spatial convolution applied to individual frames of a video provides good performance in action recognition. In this research, we empirically evaluate the accuracy of factorized convolution on individual video frames for action classification. We take 3D ResNet-18 as the baseline model for our experiment and factorize each of its 3D convolutions into a 2D (spatial) convolution and a 1D (temporal) convolution. We train the model from scratch on the Kinetics video dataset, then fine-tune it on the UCF-101 dataset and evaluate its performance. Our results show accuracy comparable to that of state-of-the-art algorithms on the Kinetics and UCF-101 datasets.
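To make the factorization concrete, the following is a minimal sketch of how the weight count changes when a full 3D convolution is replaced by a 2D spatial convolution followed by a 1D temporal convolution. It is not the authors' implementation: the 3 x 3 x 3 kernel size, the function names, and the intermediate width `mid_channels` are assumptions for illustration, since the abstract does not state them.

```python
def conv3d_params(c_in, c_out, t=3, d=3):
    """Weights in a full t x d x d 3D convolution (bias omitted)."""
    return c_in * c_out * t * d * d


def factorized_params(c_in, c_out, mid_channels, t=3, d=3):
    """Weights in a 1 x d x d spatial conv followed by a t x 1 x 1 temporal conv.

    mid_channels is a hypothetical intermediate width; the abstract does
    not say how it is chosen.
    """
    spatial = c_in * mid_channels * d * d   # 2D convolution applied to each frame
    temporal = mid_channels * c_out * t     # 1D convolution across frames
    return spatial + temporal


if __name__ == "__main__":
    c_in, c_out = 64, 64
    full = conv3d_params(c_in, c_out)
    split = factorized_params(c_in, c_out, mid_channels=64)
    print(f"full 3D: {full} weights, factorized: {split} weights")
```

With the intermediate width set equal to the output width, the factorized pair uses fewer weights than the full 3D kernel. Whether the paper keeps this width or adjusts it to match the baseline's parameter count is not stated in the abstract.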
