Speech Emotion Recognition in People at High Risk of Dementia

  • Dongseon Kim (Department of Silver Business, Sookmyung Women's University) ;
  • Bongwon Yi (Department of Communication Disorders, Korea Nazarene University) ;
  • Yugwon Won (Baikal AI Co. Ltd.)
  • Received : 2024.05.10
  • Accepted : 2024.07.11
  • Published : 2024.07.31

Abstract

Background and Purpose: The emotions of people at various stages of dementia need to be effectively utilized for prevention, early intervention, and care planning. Now that technology is available for understanding and addressing people's emotional needs, this study aimed to develop speech emotion recognition (SER) technology that classifies the emotions of people at high risk of dementia.

Methods: Speech samples from people at high risk of dementia were categorized into distinct emotions via human auditory assessment, and the resulting annotations were used as labels for a guided deep-learning method. The architecture combined a convolutional neural network, long short-term memory, attention layers, and Wav2Vec2, a novel feature extractor, to develop automated speech emotion recognition.

Results: Twenty-seven kinds of emotions were found in the participants' speech. These were grouped into 6 detailed emotions (happiness, interest, sadness, frustration, anger, and neutrality) and further into 3 basic emotions (positive, negative, and neutral). To improve algorithmic performance, multiple learning approaches were applied using different data sources (voice and text) and varying numbers of emotion categories. Ultimately, a 2-stage algorithm, in which an initial text-based classification is followed by voice-based analysis, achieved the highest accuracy, reaching 70%.

Conclusions: The diverse emotions identified in this study were attributed to the characteristics of the participants and the method of data collection. The fact that the speech was addressed by people at high risk of dementia to companion robots also helps explain the relatively low performance of the SER algorithm. Accordingly, this study suggests the systematic and comprehensive construction of a dataset from people with dementia.
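The 2-stage idea reported in the Results can be illustrated with a minimal Python sketch. The abstract does not state how the two stages are combined, so everything here is an assumption made for illustration: the function name `two_stage_predict`, the score-dictionary interface of the hypothetical `text_model` and `voice_model` callables, the rule that the text stage picks one of the 3 basic emotions while the voice stage picks a detailed emotion within it, and the mapping of the 6 detailed emotions onto the 3 basic ones.

```python
from typing import Callable, Dict

# ASSUMPTION: the abstract names the six detailed and three basic emotions
# but does not give the mapping between them; this grouping is a guess.
DETAILED_TO_BASIC: Dict[str, str] = {
    "happiness": "positive",
    "interest": "positive",
    "sadness": "negative",
    "frustration": "negative",
    "anger": "negative",
    "neutrality": "neutral",
}

Scores = Dict[str, float]


def two_stage_predict(
    transcript: str,
    audio: object,
    text_model: Callable[[str], Scores],      # hypothetical: scores over basic emotions
    voice_model: Callable[[object], Scores],  # hypothetical: scores over detailed emotions
) -> str:
    """Return a detailed emotion label using a text-then-voice cascade."""
    # Stage 1: the text classifier picks the most likely basic emotion.
    basic_scores = text_model(transcript)
    basic = max(basic_scores, key=basic_scores.get)

    # Keep only the detailed emotions that belong to that basic emotion.
    candidates = [d for d, b in DETAILED_TO_BASIC.items() if b == basic]
    if len(candidates) == 1:
        return candidates[0]  # nothing left for the voice stage to decide

    # Stage 2: the voice classifier chooses among the remaining candidates;
    # a candidate missing from its output is treated as score 0.0.
    voice_scores = voice_model(audio)
    return max(candidates, key=lambda d: voice_scores.get(d, 0.0))
```

In this sketch the voice stage only refines the text stage's decision, so an error in the first stage cannot be corrected by the second; whether the study's algorithm behaves this way is not stated in the abstract.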

Acknowledgement

This research was made possible, in part, by data provided by Hyodol Co. Ltd., a manufacturer of companion robots in Korea. The data collected through the companion robot, Hyodol, were provided to the research team free of charge for the purpose of the study, without any other conditions or obligations.
