Electroencephalography-based imagined speech recognition using deep long short-term memory network

  • Agarwal, Prabhakar (Department of Electronics and Communication Engineering, National Institute of Technology Delhi)
  • Kumar, Sandeep (Department of Electronics and Communication Engineering, National Institute of Technology Delhi)
  • Received: 2021.04.04
  • Reviewed: 2022.04.26
  • Published: 2022.08.10

Abstract

This article proposes a subject-independent application of brain-computer interfacing (BCI). A 32-channel electroencephalography (EEG) device is used to measure imagined speech (speech imagery, SI) of four words (sos, stop, medicine, washroom) and one phrase (come-here) across 13 subjects. A deep long short-term memory (LSTM) network is adopted to recognize these signals in seven EEG frequency bands individually, in nine major regions of the brain. The results show a maximum accuracy of 73.56% and a network prediction time (NPT) of 0.14 s, which are superior to those of other state-of-the-art techniques in the literature. Our analysis reveals that SI is recognized better in the alpha band than in other EEG frequency bands. To reinforce these findings, the LSTM network is compared with models based on the gated recurrent unit (GRU), the convolutional neural network (CNN), and six conventional classifiers. The results show that the LSTM model has 46.86% higher average accuracy in the alpha band and 74.54% lower average NPT than the CNN. The maximum accuracy of the GRU was 8.34% lower than that of the LSTM network. Deep networks performed better than the conventional classifiers.
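To make the recurrent classification step concrete, the following is a minimal sketch of how a single LSTM layer can map one multichannel EEG trial to probabilities over the five SI classes. It is not the authors' network: the hidden size, trial length, random weights, and all function names are illustrative assumptions, the layer is untrained and not stacked, and the input is synthetic noise rather than recorded EEG. Only the 32-channel input width and the five class labels come from the abstract.

```python
import math
import random

# Class labels taken from the abstract; everything else here is hypothetical.
CLASSES = ["sos", "stop", "medicine", "washroom", "come-here"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def init_params(n_in, hidden, n_out, seed=0):
    """Small random weights for the four LSTM gates and the output layer."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    p = {}
    for gate in "ifog":  # input, forget, output gates and candidate cell state
        p["W" + gate] = mat(hidden, n_in + hidden)
        p["b" + gate] = [0.0] * hidden
    p["Wy"] = mat(n_out, hidden)
    p["by"] = [0.0] * n_out
    return p


def lstm_step(x, h, c, p):
    """One LSTM time step on the concatenation of input and previous hidden state."""
    z = x + h  # list concatenation
    i = [sigmoid(a + b) for a, b in zip(matvec(p["Wi"], z), p["bi"])]
    f = [sigmoid(a + b) for a, b in zip(matvec(p["Wf"], z), p["bf"])]
    o = [sigmoid(a + b) for a, b in zip(matvec(p["Wo"], z), p["bo"])]
    g = [math.tanh(a + b) for a, b in zip(matvec(p["Wg"], z), p["bg"])]
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


def classify(trial, p):
    """Run the LSTM over all time samples, then classify from the last hidden state."""
    hidden = len(p["bi"])
    h, c = [0.0] * hidden, [0.0] * hidden
    for x in trial:
        h, c = lstm_step(x, h, c, p)
    logits = [a + b for a, b in zip(matvec(p["Wy"], h), p["by"])]
    return softmax(logits)


# Synthetic trial: 20 time samples x 32 channels of Gaussian noise (assumed shape).
rng = random.Random(1)
trial = [[rng.gauss(0.0, 1.0) for _ in range(32)] for _ in range(20)]
params = init_params(n_in=32, hidden=8, n_out=len(CLASSES))
probs = classify(trial, params)
predicted = CLASSES[probs.index(max(probs))]
```

Because the weights are random, `predicted` carries no meaning here; the sketch only shows the data flow from a channels-by-time trial to a class distribution. A band-wise or region-wise analysis like the one described above would feed filtered signals or channel subsets through a trained network of this kind.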
