[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.30693/SMJ.2020.9.1.23

Effective Hand Gesture Recognition by Key Frame Selection and 3D Neural Network

Hoang, Nguyen Ngoc (Dept. of ECE, Chonnam National University)
Lee, Guee-Sang (Dept. of ECE, Chonnam National University)
Kim, Soo-Hyung (Dept. of ECE, Chonnam National University)
Yang, Hyung-Jeong (Dept. of ECE, Chonnam National University)

Publication Information

Smart Media Journal / v.9, no.1, 2020, pp. 23-29

Abstract

This paper presents an approach to dynamic hand gesture recognition that combines a 3D Convolutional Neural Network (3D_CNN), later extended to 3D Residual Networks (3D_ResNet), with neural-network-based key frame selection. Typically, a 3D deep neural network classifies gestures from image frames randomly sampled from the video data. In this work, to improve classification performance, we instead feed the classification network key frames that represent the overall video. The key frames are extracted by SegNet rather than by conventional clustering algorithms for video summarization (VSUMM), which require heavy computation. Because a deep neural network is used, key frame selection can run in a real-time system. Experiments are conducted with gesture classifiers built on 3D convolutional kernels, such as 3D_CNN, Inflated 3D_CNN (I3D), and 3D_ResNet. Our algorithm achieves a classification accuracy of up to 97.8% on the Cambridge gesture dataset. The experimental results show that the proposed approach is efficient and outperforms existing methods.
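
The contrast the abstract draws between random frame sampling and key frame selection can be illustrated with a minimal Python sketch. This is not the authors' implementation: the per-frame `scores` list is a hypothetical stand-in for whatever the key frame selection network outputs, and all function names are invented for illustration.

```python
import random


def random_sample(num_frames, k, seed=None):
    """Baseline: pick k distinct frame indices at random, in temporal order."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(num_frames), k))


def select_key_frames(scores, k):
    """Pick the k highest-scoring frame indices, in temporal order.

    `scores` is an assumed per-frame importance score (hypothetical);
    the abstract does not specify the selection network's output format.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def build_clip(video, indices):
    """Gather the chosen frames into a fixed-length clip for a 3D classifier."""
    return [video[i] for i in indices]


# Toy data: 40 placeholder frames, four of which receive a high score.
video = [f"frame_{i}" for i in range(40)]
scores = [0.0] * 40
for idx, s in [(3, 0.9), (10, 0.8), (11, 0.7), (25, 0.95)]:
    scores[idx] = s

key_idx = select_key_frames(scores, k=4)
clip = build_clip(video, key_idx)
rand_idx = random_sample(len(video), k=4, seed=0)
```

Both samplers return the same number of frames, so the downstream classifier sees a clip of identical length either way. Only the choice of which frames represent the video differs.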

Keywords

hand gesture recognition; dynamic hand gesture; key frame extraction; action recognition

Citations & Related Records

Times Cited By KSCI: 2

Reference

1	H. Tang; H. Liu; W. Xiao; N. Sebe; "Fast and robust dynamic hand gesture recognition via key frames extraction and feature fusion," Neurocomputing, 2019
2	H. Jiang; X. Ma; W. Li; S. Ding; C. Mu; "Adaptive key frame extraction from RGB-D for hand gesture recognition," Tenth International Conference on Digital Image Processing (ICDIP 2018), 2018
3	J. Carreira; A. Zisserman; "Quo Vadis, Action Recognition? A New Model and the Kinetics Dataset," Conference on Computer Vision and Pattern Recognition (CVPR), 2017
4	K. Hara; H. Kataoka; Y. Satoh; "Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?," Conference on Computer Vision and Pattern Recognition (CVPR), 2018
5	C. Szegedy; W. Liu; Y. Jia; P. Sermanet; S. Reed; D. Anguelov; D. Erhan; V. Vanhoucke; A. Rabinovich; "Going Deeper with Convolutions," Conference on Computer Vision and Pattern Recognition (CVPR), 2015
6	K. He; X. Zhang; S. Ren; J. Sun; "Deep residual learning for image recognition," Computer Vision and Pattern Recognition (CVPR), Proc. of the IEEE Conference on, pp.770-778, 2016
7	S. E. F. d. Avila; A. P. B. Lopes; A. d. L. Jr.; A. d. A. Araújo; "VSUMM: A mechanism designed to produce static video summaries and a novel evaluation method," Pattern Recognition Letters, vol.32, no.1, pp.56-68, 2011
8	N. N. Hoang; G.-S. Lee; S.-H. Kim; H.-J. Yang; "A Real-time Multimodal Hand Gesture Recognition via 3D Convolutional Neural Network and Key Frame Extraction," Machine Learning in Medical Imaging (MLMI), pp.32-37, 2018
9	V. John; A. Boyali; S. Mita; M. Imanishi; N. Sanma; "Deep Learning-Based Fast Hand Gesture Recognition Using Representative Frames," International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2016
10	R. F. Rachmadi; K. Uchimura; G. Koutaki; "Video classification using compacted dataset based on selected keyframe," IEEE Region 10 Conference (TENCON), 2016
11	Tae Seok Lee; Seung Shik Kang; "LSTM based Sequence-to-Sequence Model for Korean Automatic Word-spacing," Smart Media Journal, vol.7, no.4, pp.17-23, 2018
12	V. Badrinarayanan; A. Kendall; R. Cipolla; "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation," Conference on Computer Vision and Pattern Recognition (CVPR), 2014
13	https://20bn.com/datasets/jester (accessed Mar. 03, 2020).
14	Son Tung Trieu; Guee Sang Lee; "Machine Printed and Handwritten Text Discrimination in Korean Document Images," Smart Media Journal, vol.5, no.3, pp.30-34, 2016
15	Do Nhu Tai; Soo-Hyung Kim; Guee-Sang Lee; Hyung-Jeong Yang; In-Seop Na; A-Ran Oh; "Tracking by Detection of Multiple Faces using SSD and CNN Features," Smart Media Journal, vol.7, no.4, pp.1-69, 2018
16	Abhijeet Boragule; Guee Sang Lee; "Text Line Segmentation of Handwritten Documents by Area Mapping," Smart Media Journal, vol.4, no.3, pp.44-49, 2015
17	D. Tran; L. Bourdev; R. Fergus; L. Torresani; M. Paluri; "Learning spatiotemporal features with 3D convolutional networks," Proc. of IEEE Int. Conf. Comput. Vis. (ICCV), pp.4489-4497, 2015
18	Q. D. Smedt; H. Wannous; J.-P. Vandeborre; "Skeleton-Based Dynamic Hand Gesture Recognition," Computer Vision and Pattern Recognition Workshops (CVPRW), 2016
19	U. Cote-Allard; C. L. Fall; A. Campeau-Lecours; C. Gosselin; F. Laviolette; B. Gosselin; "Transfer Learning for sEMG Hand gesture recognition Using Convolutional Neural Networks," IEEE International Conference on Systems, Man, and Cybernetics (SMC), 2017
20	M. H. Rahman; J. Afrin; "Hand Gesture Recognition using Multiclass Support Vector Machine," International Journal of Computer Applications, vol.74, no.1, 2013
21	G. Zhu; L. Zhang; P. Shen; J. Song; "Multimodal Gesture Recognition Using 3-D Convolution and Convolutional LSTM," IEEE Access, vol.5, pp.4517-4524, 2017
22	J. Donahue; L. A. Hendricks; S. Guadarrama; M. Rohrbach; S. Venugopalan; K. Saenko; T. Darrell; "Long-term recurrent convolutional networks for visual recognition and description," Conference on Computer Vision and Pattern Recognition (CVPR), 2015