[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.9717/kmms.2022.25.9.1257

Fast Convergence GRU Model for Sign Language Recognition

Subramanian, Barathi (Dept. of Computer Science and Engineering, Graduate School, Kyungpook National University)
Olimov, Bekhzod (Dept. of Computer Science and Engineering, Graduate School, Kyungpook National University)
Kim, Jeonghong (Dept. of Computer Science and Engineering, Graduate School, Kyungpook National University)

Publication Information

Journal of Korea Multimedia Society / v.25, no.9, 2022, pp. 1257-1265

Abstract

Sign language recognition is challenging because of hand occlusion, the accuracy of hand gestures, and high computational cost. In recent years, deep learning techniques have made significant advances in this field. However, although these models have grown larger and more complex, they still handle long-term sequential data poorly and do not capture useful information efficiently or converge quickly. To overcome these challenges, we propose a word-level sign language recognition (SLR) system that combines a real-time human pose detection library with a minimized version of the gated recurrent unit (GRU) model. Each gate unit is optimized by discarding the depth-weighted reset gate in the GRU cells and considering only the current input. Furthermore, we use a sigmoid activation in place of the hyperbolic tangent activation of the standard GRU, because of the performance loss associated with the latter in deeper networks. Experimental results demonstrate that our pose-based optimized GRU (Pose-OGRU) outperforms the standard GRU model in terms of prediction accuracy, convergence speed, and information processing capability.
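The gate changes described in the abstract can be illustrated with a minimal sketch. It is not the authors' Pose-OGRU implementation: the abstract does not give the exact equations, so the update rule below is an assumption (an update gate only, no reset gate, and a sigmoid candidate activation where a standard GRU uses tanh). The recurrent term is kept in both the gate and the candidate, which may differ from the paper's "only the current input" formulation. All names (`reset_free_gru_step`, `init_params`, `run_sequence`) and the idea of feeding each frame as a flat list of pose keypoint coordinates are hypothetical.

```python
import math
import random


def sigmoid(v):
    """Logistic function, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, x, U, h, b):
    """Return W @ x + U @ h + b for plain nested-list matrices."""
    return [
        sum(w * xi for w, xi in zip(W[i], x))
        + sum(u * hi for u, hi in zip(U[i], h))
        + b[i]
        for i in range(len(b))
    ]


def reset_free_gru_step(x, h_prev, p):
    """One step of a GRU-like cell with no reset gate (assumed form)."""
    # Update gate: how much of the new candidate replaces the old state.
    z = [sigmoid(a) for a in affine(p["Wz"], x, p["Uz"], h_prev, p["bz"])]
    # Candidate state: h_prev is used directly (no reset gate), and the
    # activation is a sigmoid where a standard GRU would apply tanh.
    c = [sigmoid(a) for a in affine(p["Wh"], x, p["Uh"], h_prev, p["bh"])]
    # Convex combination of the previous state and the candidate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, c)]


def init_params(input_size, hidden_size, seed=0):
    """Small random weights and zero biases; purely illustrative values."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        "Wz": mat(hidden_size, input_size),
        "Uz": mat(hidden_size, hidden_size),
        "bz": [0.0] * hidden_size,
        "Wh": mat(hidden_size, input_size),
        "Uh": mat(hidden_size, hidden_size),
        "bh": [0.0] * hidden_size,
    }


def run_sequence(frames, hidden_size, params):
    """Feed a sequence of per-frame feature vectors; return the last state."""
    h = [0.0] * hidden_size
    for x in frames:
        h = reset_free_gru_step(x, h, params)
    return h
```

In a word-level recognizer, the final state returned by `run_sequence` would typically be passed to a classification layer over the sign vocabulary; that layer and the training loop are omitted here. Dropping the reset gate removes one of the three weight groups of a standard GRU cell, which is one plausible source of the lower cost the abstract reports.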

Keywords

Deep Learning; Gesture Recognition; Human Pose Detection; OpenPose; Sign Language
