http://dx.doi.org/10.4218/etrij.2021-0118

Electroencephalography-based imagined speech recognition using deep long short-term memory network  

Agarwal, Prabhakar (Department of Electronics and Communication Engineering, National Institute of Technology Delhi)
Kumar, Sandeep (Department of Electronics and Communication Engineering, National Institute of Technology Delhi)
Publication Information
ETRI Journal / v.44, no.4, 2022, pp. 672-685
Abstract
This article proposes a subject-independent application of brain-computer interfacing (BCI). A 32-channel electroencephalography (EEG) device is used to measure the imagined speech (SI) of four words (sos, stop, medicine, washroom) and one phrase (come-here) across 13 subjects. A deep long short-term memory (LSTM) network is adopted to recognize these signals individually in seven EEG frequency bands and in nine major regions of the brain. The results show a maximum accuracy of 73.56% and a network prediction time (NPT) of 0.14 s, which are superior to other state-of-the-art techniques in the literature. Our analysis reveals that the alpha band can recognize SI better than the other EEG frequency bands. To reinforce these findings, the LSTM model is compared with models based on the gated recurrent unit (GRU), a convolutional neural network (CNN), and six conventional classifiers. The results show that the LSTM model achieves 46.86% higher average accuracy in the alpha band and 74.54% lower average NPT than the CNN. The maximum accuracy of the GRU is 8.34% lower than that of the LSTM network. The deep networks performed better than the traditional classifiers.
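For illustration only, the following is a minimal sketch of the kind of pipeline the abstract describes: band-pass filtering one EEG frequency band (here the alpha band, 8-13 Hz) and classifying 32-channel epochs of the five imagined-speech classes with a deep LSTM. This is not the authors' published code; the library choice (TensorFlow/Keras, SciPy), the 250 Hz sampling rate, the 2-s epoch length, the filter order, and all layer sizes are assumptions made for the example.

# Minimal sketch, assuming a 250 Hz sampling rate, 2-s epochs, and
# illustrative layer sizes; only the 32 channels, the five classes, and
# the band-wise LSTM classification come from the abstract.
import numpy as np
from scipy.signal import butter, filtfilt
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

FS = 250            # assumed sampling rate (Hz)
N_CHANNELS = 32     # 32-channel EEG device, as stated in the abstract
N_CLASSES = 5       # sos, stop, medicine, washroom, come-here
EPOCH_LEN = 2 * FS  # assumed 2-s epochs

def alpha_band(epochs, low=8.0, high=13.0, order=4):
    """Band-pass filter epochs (n_trials, n_samples, n_channels) to the alpha band."""
    b, a = butter(order, [low / (FS / 2), high / (FS / 2)], btype="bandpass")
    return filtfilt(b, a, epochs, axis=1)

# Two stacked LSTM layers followed by a softmax over the five classes;
# the time axis is fed step by step, with the channels as features.
model = Sequential([
    LSTM(128, return_sequences=True, input_shape=(EPOCH_LEN, N_CHANNELS)),
    Dropout(0.3),
    LSTM(64),
    Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Usage with random data standing in for real EEG epochs.
X = alpha_band(np.random.randn(40, EPOCH_LEN, N_CHANNELS))
y = np.random.randint(0, N_CLASSES, size=40)
model.fit(X, y, epochs=2, batch_size=8, verbose=0)

The same model can be retrained per frequency band or per brain region by changing the filter cut-offs or the channel subset fed to it, which mirrors the band-wise and region-wise analysis summarized above.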
Keywords
brain-computer interface; deep learning; EEG; imagined speech recognition; long short-term memory