Electroencephalography-based imagined speech recognition using deep long short-term memory network

Agarwal, Prabhakar; Kumar, Sandeep

doi:10.4218/etrij.2021-0118

ETRI Journal

Volume 44, Issue 4 / Pages 672-685 / 2022 / 1225-6463 (pISSN) / 2233-7326 (eISSN)

Electronics and Telecommunications Research Institute (ETRI)

Agarwal, Prabhakar (Department of Electronics and Communication Engineering, National Institute of Technology Delhi) ;
Kumar, Sandeep (Department of Electronics and Communication Engineering, National Institute of Technology Delhi)

Received: 2021.04.04
Reviewed: 2022.04.26
Published: 2022.08.10

https://doi.org/10.4218/etrij.2021-0118


Abstract

This article proposes a subject-independent application of brain-computer interfacing (BCI). A 32-channel electroencephalography (EEG) device is used to measure imagined speech, also called speech imagery (SI), of four words (sos, stop, medicine, washroom) and one phrase (come-here) across 13 subjects. A deep long short-term memory (LSTM) network is adopted to recognize these signals in each of seven EEG frequency bands individually, in nine major regions of the brain. The results show a maximum accuracy of 73.56% and a network prediction time (NPT) of 0.14 s, both of which are superior to other state-of-the-art techniques in the literature. Our analysis reveals that the alpha band can recognize SI better than other EEG frequencies. To reinforce our findings, the proposed model is compared with models based on the gated recurrent unit (GRU), the convolutional neural network (CNN), and six conventional classifiers. The results show that the LSTM model achieves 46.86% higher average accuracy in the alpha band and 74.54% lower average NPT than the CNN. The maximum accuracy of the GRU was 8.34% lower than that of the LSTM network. Overall, the deep networks performed better than the conventional classifiers.
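As a rough illustration of the kind of classifier the abstract describes, the sketch below runs the forward pass of a single LSTM layer over one multichannel segment and maps the final hidden state to the five class labels. It is not the authors' network: the hidden size, the segment length of 128 samples, the random untrained weights, and the synthetic input are all assumptions made for illustration, since the abstract does not state the architecture or the sampling details. Only the 32 channels and the five labels come from the abstract.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(n_in, n_hidden, n_classes, rng):
    """Small random weights. Sizes are hypothetical, not taken from the paper."""
    scale = 0.1
    return {
        "W": rng.normal(0.0, scale, (4 * n_hidden, n_in)),      # input weights for gates i, f, g, o
        "U": rng.normal(0.0, scale, (4 * n_hidden, n_hidden)),  # recurrent weights for the same gates
        "b": np.zeros(4 * n_hidden),                            # gate biases
        "V": rng.normal(0.0, scale, (n_classes, n_hidden)),     # output (classification) layer
        "b_out": np.zeros(n_classes),
    }

def lstm_classify(x, p):
    """x has shape (time, channels); returns softmax class probabilities."""
    n_hidden = p["U"].shape[1]
    h = np.zeros(n_hidden)     # hidden state
    cell = np.zeros(n_hidden)  # cell state
    for x_t in x:
        z = p["W"] @ x_t + p["U"] @ h + p["b"]
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)  # input, forget, output gates
        g = np.tanh(g)                                # candidate cell update
        cell = f * cell + i * g
        h = o * np.tanh(cell)
    logits = p["V"] @ h + p["b_out"]
    e = np.exp(logits - logits.max())  # numerically stable softmax
    return e / e.sum()

LABELS = ["sos", "stop", "medicine", "washroom", "come-here"]

rng = np.random.default_rng(0)
params = init_lstm(n_in=32, n_hidden=16, n_classes=len(LABELS), rng=rng)

# Synthetic stand-in for one band-filtered EEG trial: 128 samples x 32 channels (assumed length).
segment = rng.normal(size=(128, 32))

probs = lstm_classify(segment, params)
predicted = LABELS[int(np.argmax(probs))]
print(predicted, probs.round(3))
```

Because the weights are untrained, the predicted label here is arbitrary; the point is only the data flow from a (time, channels) segment to a five-way probability vector. In a band-wise study such as the one described, the same forward pass would presumably be applied to each band-filtered version of the signal separately.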

