• Title/Summary/Keyword: Accuracy of Emotion Recognition

Emotion Recognition using Facial Thermal Images

  • Eom, Jin-Sup; Sohn, Jin-Hun
    • Journal of the Ergonomics Society of Korea, v.31 no.3, pp.427-435, 2012
  • The aim of this study is to investigate facial temperature changes induced by facial expression and emotional state in order to recognize a person's emotion from facial thermal images. Background: Facial thermal images have two advantages over visual images. First, facial temperature measured by a thermal camera does not depend on skin color, darkness, or lighting conditions. Second, facial thermal images change not only with facial expression but also with emotional state. To our knowledge, no previous study has concurrently investigated these two sources of facial temperature change. Method: 231 students participated in the experiment. Four kinds of stimuli inducing anger, fear, boredom, and a neutral state were presented to the participants, and facial temperatures were measured with an infrared camera. Each stimulus consisted of a baseline period and an emotion period; the baseline period lasted 1 min and the emotion period 1~3 min. In the data analysis, the temperature differences between the baseline and emotion states were analyzed. The eyes, mouth, and glabella were selected as facial expression features, and the forehead, nose, and cheeks were selected as emotional state features. Results: The temperatures of the eye, mouth, glabella, forehead, and nose areas decreased significantly during the emotional experience, and the changes differed significantly by kind of emotion. Linear discriminant analysis for emotion recognition yielded a correct classification rate of 62.7% across the four emotions when using both facial expression features and emotional state features. The accuracy dropped slightly but significantly to 56.7% when using only facial expression features, and to 40.2% when using only emotional state features. Conclusion: Facial expression features are essential for emotion recognition, but emotional state features are also important for classifying emotion. Application: The results of this study can be applied to human-computer interaction systems in workplaces or automobiles.
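To make the classification step concrete, here is a minimal sketch of linear discriminant analysis of the kind the abstract describes, using scikit-learn. The feature layout follows the abstract's region list, but the placeholder data and class encoding are assumptions, not the authors' materials.

```python
# Minimal sketch, not the authors' code: LDA over per-region temperature
# differences (emotion period minus baseline), four emotion classes.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# Hypothetical feature matrix: one row per trial; columns are temperature
# deltas for eyes, mouth, glabella (expression features) and forehead,
# nose, cheeks (emotional state features). Random data is a placeholder.
X = np.random.randn(231, 6)
y = np.random.randint(0, 4, size=231)  # 0=anger, 1=fear, 2=boredom, 3=neutral

lda = LinearDiscriminantAnalysis()
print("mean CV accuracy:", cross_val_score(lda, X, y, cv=5).mean())
```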

Speech Emotion Recognition Using Confidence Level for Emotional Interaction Robot

  • Kim, Eun-Ho
    • Journal of the Korean Institute of Intelligent Systems, v.19 no.6, pp.755-759, 2009
  • The ability to recognize human emotion is one of the hallmarks of human-robot interaction. Speaker-independent emotion recognition in particular is a challenging issue for the commercial use of speech emotion recognition systems. In general, speaker-independent systems show a lower accuracy rate than speaker-dependent systems, because emotional feature values depend on the speaker and his/her gender. Hence, this paper describes the realization of speaker-independent emotion recognition by rejecting low-confidence results using a confidence measure, making the emotion recognition system more consistent and accurate. Comparison of the proposed methods with the conventional method clearly confirmed their improvement and effectiveness.
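A minimal sketch of the rejection idea, assuming a probabilistic classifier whose maximum class probability serves as the confidence measure; the classifier choice, threshold, and data are illustrative only, not the paper's implementation.

```python
# Sketch: reject utterances whose best class probability falls below a
# threshold, trading coverage for accuracy. All data here is placeholder.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 12))    # hypothetical speech feature vectors
y_train = rng.integers(0, 4, size=200)  # four emotion classes
X_test = rng.normal(size=(20, 12))

clf = SVC(probability=True).fit(X_train, y_train)
proba = clf.predict_proba(X_test)

THRESHOLD = 0.6                          # assumed confidence cut-off
labels = proba.argmax(axis=1)
confidence = proba.max(axis=1)
decisions = np.where(confidence >= THRESHOLD, labels, -1)  # -1 = rejected
```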

Robust Real-time Tracking of Facial Features with Application to Emotion Recognition

  • Ahn, Byungtae; Kim, Eung-Hee; Sohn, Jin-Hun; Kweon, In So
    • The Journal of Korea Robotics Society, v.8 no.4, pp.266-272, 2013
  • Facial feature extraction and tracking are essential steps in human-robot interaction (HRI) tasks such as face recognition, gaze estimation, and emotion recognition. The active shape model (ASM) is one of the successful generative models for extracting facial features. However, ASM alone is not adequate for modeling a face in actual applications, because the positions of facial features are extracted unstably due to the limited number of iterations in the ASM fitting algorithm. These inaccurate feature positions degrade emotion recognition performance. In this paper, we propose a real-time facial feature extraction and tracking framework using ASM and LK optical flow for emotion recognition. LK optical flow is well suited to estimating time-varying geometric parameters in sequential face images. In addition, we introduce a straightforward method to avoid tracking failure caused by partial occlusions, which can be a serious problem for tracking-based algorithms. Emotion recognition experiments with k-NN and SVM classifiers show over 95% classification accuracy for three emotions: "joy", "anger", and "disgust".
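The tracking step can be sketched with OpenCV's pyramidal Lucas-Kanade implementation. In the paper the tracked points come from an ASM fit; this sketch seeds them with generic corner detection instead, so it is an assumed pipeline rather than the authors' code.

```python
# Sketch of LK point tracking between frames (assumed pipeline, not the
# authors' code). ASM fitting is replaced by generic corner detection.
import cv2

cap = cv2.VideoCapture(0)  # or a path to a face-video file
ok, prev = cap.read()
prev_gray = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
pts = cv2.goodFeaturesToTrack(prev_gray, maxCorners=60,
                              qualityLevel=0.01, minDistance=7)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    new_pts, status, err = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
    pts = new_pts[status.flatten() == 1].reshape(-1, 1, 2)  # keep good tracks
    prev_gray = gray
```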

RoutingConvNet: A Light-weight Speech Emotion Recognition Model Based on Bidirectional MFCC

  • Hyun Taek Lim; Soo Hyung Kim; Guee Sang Lee; Hyung Jeong Yang
    • Smart Media Journal, v.12 no.5, pp.28-35, 2023
  • In this study, we propose RoutingConvNet, a new light-weight model with fewer parameters, to improve the applicability and practicality of speech emotion recognition. To reduce the number of learnable parameters, the proposed model concatenates bidirectional MFCCs channel-by-channel to learn long-term emotional dependencies and extract contextual features. A light-weight deep CNN is constructed for low-level feature extraction, and self-attention is used to obtain channel and spatial information from the speech signal. In addition, we apply dynamic routing to improve accuracy and build a model that is robust to feature variations. In experiments on the speech emotion datasets EMO-DB, RAVDESS, and IEMOCAP, the proposed model reduces parameters while improving accuracy, achieving 87.86%, 83.44%, and 66.06% accuracy, respectively, with about 156,000 parameters. We also propose a metric that quantifies the trade-off between parameter count and accuracy for evaluating light-weight models.
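One plausible reading of "concatenating bidirectional MFCCs channel-by-channel" is stacking forward and time-reversed MFCC maps as input channels for the CNN. The sketch below shows that interpretation with librosa; it is an assumption, not the published RoutingConvNet code, and the example clip stands in for an utterance.

```python
# Sketch (an interpretation, not the released model): stack forward and
# time-reversed MFCC maps channel-wise as a two-channel CNN input.
import librosa
import numpy as np

y, sr = librosa.load(librosa.ex('trumpet'))          # stand-in utterance
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40)   # shape (40, T)
bi_mfcc = np.stack([mfcc, mfcc[:, ::-1]], axis=0)    # (2, 40, T) channels
print(bi_mfcc.shape)
```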

Emotion Recognition in Children With Autism Spectrum Disorder: A Comparison of Musical and Visual Cues

  • Yoon, Yea-Un
    • Journal of Music and Human Behavior, v.19 no.1, pp.1-20, 2022
  • The purpose of this study was to evaluate how accurately children with autism spectrum disorder (ASD; n = 9) recognized four basic emotions (i.e., happiness, sadness, anger, and fear) following musical or visual cues. Their performance was compared to that of typically developing children (TD; n = 14). All participants were between the ages of 7 and 13 years. Four musical cues and four visual cues for each emotion were presented to evaluate the participants' ability to recognize the four basic emotions. The results indicated significant differences between the two groups and between the musical and visual cues. In particular, the ASD group recognized the four emotions significantly less accurately than the TD group. However, both groups recognized emotions more accurately following the musical cues than the visual cues. Finally, both groups showed their greatest recognition accuracy for happiness following the musical cues; with the visual cues, the ASD group exhibited the greatest recognition accuracy for anger. This initial study supports the idea that musical cues can facilitate emotion recognition in children with ASD. Further research is needed to improve our understanding of the mechanisms involved in emotion recognition and the role sensory cues play in emotion recognition for children with ASD.

Physiological Responses-Based Emotion Recognition Using Multi-Class SVM with RBF Kernel

  • Vanny, Makara; Ko, Kwang-Eun; Park, Seung-Min; Sim, Kwee-Bo
    • Journal of Institute of Control, Robotics and Systems, v.19 no.4, pp.364-371, 2013
  • Emotion recognition is an important part of developing human-human and human-computer interaction. In this paper, we focus on the performance of a multi-class SVM (Support Vector Machine) with a Gaussian RBF (radial basis function) kernel, which is used to solve the problem of emotion recognition from physiological signals and to improve recognition accuracy. In the experimental paradigm for data acquisition, visual stimuli from the IAPS (International Affective Picture System) are used to induce emotional states such as fear, disgust, joy, and neutral in each subject. The raw signals are split by trial within each session to pre-process the data. The mean value and standard deviation are extracted as features and prepared for the subsequent classification step. The experimental results show that the proposed multi-class SVM with a Gaussian RBF kernel and the OVO (one-versus-one) method performs successfully, yielding good classification accuracies over these four emotions.
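A minimal sketch of the setup the abstract outlines, assuming mean/std features per physiological channel; the data shapes and channel count are placeholders, not the authors' recordings. Note that scikit-learn's SVC trains one-vs-one binary classifiers internally for multi-class problems, which matches the OVO scheme named above.

```python
# Sketch (assumed setup, not the authors' code): one-vs-one multi-class SVM
# with an RBF kernel over mean/std features of physiological channels.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
signals = rng.normal(size=(80, 4, 1000))  # hypothetical: trials x channels x samples
X = np.concatenate([signals.mean(axis=2), signals.std(axis=2)], axis=1)
y = rng.integers(0, 4, size=80)           # fear, disgust, joy, neutral

clf = SVC(kernel='rbf', decision_function_shape='ovo').fit(X, y)
print(clf.predict(X[:5]))
```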

On the Importance of Tonal Features for Speech Emotion Recognition

  • Lee, Jung-In; Kang, Hong-Goo
    • Journal of Broadcast Engineering, v.18 no.5, pp.713-721, 2013
  • This paper describes the effectiveness of chroma-based tonal features for speech emotion recognition. Just as the tonality of major or minor keys affects the perception of musical mood, speech tonality affects the perception of the emotional state of spoken utterances. To justify this assertion, subjective hearing tests were carried out using synthesized signals generated from chroma features; the results show that tonality contributes especially to the perception of negative emotions such as anger and sadness. In automatic emotion recognition tests, the modified chroma-based tonal features produce a noticeable improvement in accuracy when they supplement conventional log-frequency power coefficient (LFPC)-based spectral features.
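For readers unfamiliar with chroma, a minimal sketch of the feature step using librosa (an assumption about tooling, not the paper's code): chroma summarizes spectral energy per pitch class, i.e. the tonal content the paper exploits.

```python
# Sketch of the chroma feature step only (not the paper's implementation).
import librosa

y, sr = librosa.load(librosa.ex('libri1'))        # spoken-utterance example
chroma = librosa.feature.chroma_stft(y=y, sr=sr)  # shape (12, T): 12 pitch classes
tonal_vector = chroma.mean(axis=1)                # utterance-level tonal summary
```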

A Study on Emotion Recognition of Chunk-Based Time Series Speech

  • Hyun-Sam Shin; Jun-Ki Hong; Sung-Chan Hong
    • Journal of Internet Computing and Services, v.24 no.2, pp.11-18, 2023
  • Recently, in the field of Speech Emotion Recognition (SER), many studies have been conducted to improve accuracy using voice features and modeling. In addition to modeling studies that improve the accuracy of existing voice emotion recognition, various studies using voice features are underway. In this paper, voice files are split into time intervals in a time-series manner, building on the observation that vocal emotion is related to the flow of time. After separating the voice files, we propose a model that classifies the emotions of speech data by extracting the speech features Mel spectrogram, chroma, zero-crossing rate (ZCR), root mean square (RMS), and mel-frequency cepstrum coefficients (MFCC) and applying them to a recurrent neural network model for sequential data processing. In the proposed method, voice features were extracted from all files using the 'librosa' library and applied to the neural network models. The experiments compared and analyzed the performance of recurrent neural network (RNN), long short-term memory (LSTM), and gated recurrent unit (GRU) models on the Interactive Emotional Dyadic Motion Capture (IEMOCAP) English dataset.
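Since the abstract names librosa and the exact feature set, the chunking and extraction step can be sketched directly; the chunk length and the per-chunk mean pooling are assumptions, as the abstract does not specify them.

```python
# Sketch following the abstract's feature list; chunk length and pooling are
# assumed. Each chunk yields one feature vector, giving an RNN-ready sequence.
import librosa
import numpy as np

y, sr = librosa.load(librosa.ex('libri2'))  # stand-in for an IEMOCAP utterance
chunk = 2 * sr                              # assumed 2-second chunks
seq = []
for start in range(0, len(y) - chunk + 1, chunk):
    seg = y[start:start + chunk]
    seq.append(np.concatenate([
        librosa.feature.melspectrogram(y=seg, sr=sr).mean(axis=1),
        librosa.feature.chroma_stft(y=seg, sr=sr).mean(axis=1),
        librosa.feature.zero_crossing_rate(seg).mean(axis=1),
        librosa.feature.rms(y=seg).mean(axis=1),
        librosa.feature.mfcc(y=seg, sr=sr, n_mfcc=20).mean(axis=1),
    ]))
X = np.stack(seq)  # (n_chunks, feature_dim): input sequence for an RNN/LSTM/GRU
```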

Speaker-Dependent Emotion Recognition For Audio Document Indexing

  • Hung LE Xuan; QUENOT Georges; CASTELLI Eric
    • Proceedings of the IEEK Conference, summer, pp.92-96, 2004
  • The study of emotions is currently of great interest in speech processing as well as in the human-machine interaction domain. In recent years, more and more research relating to emotion synthesis or emotion recognition has been developed for different purposes, each approach using its own methods and various parameters measured on the speech signal. In this paper, we propose using a short-time parameter, MFCC coefficients (Mel-Frequency Cepstrum Coefficients), and a simple but efficient classification method, Vector Quantization (VQ), for speaker-dependent emotion recognition. Many other features (energy, pitch, zero crossing, phonetic rate, LPC, and their derivatives) are also tested and combined with the MFCC coefficients in order to find the best combination. Other models, GMM and HMM (discrete and continuous Hidden Markov Models), are studied as well, in the hope that the use of continuous distributions and the temporal behaviour of this feature set will improve the quality of emotion recognition. The maximum accuracy in recognizing five different emotions exceeds 88% using only MFCC coefficients with the VQ model. This simple but efficient approach even outperforms the human evaluation obtained on the same database, in which listeners judged sentences without the possibility of replaying or comparing them [8], and the result compares favorably with other approaches.
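A minimal sketch of a common MFCC-plus-VQ recipe (not necessarily the authors' exact setup): train one k-means codebook per emotion, then label an utterance with the codebook that quantizes its MFCC frames with the lowest average distortion. The emotion set, frame data, and codebook size are placeholders.

```python
# Sketch of a common VQ classification recipe (assumption, not the paper's code).
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(2)
emotions = ['anger', 'joy', 'sadness', 'fear', 'neutral']
train = {e: rng.normal(loc=i, size=(500, 13))  # hypothetical MFCC frames
         for i, e in enumerate(emotions)}
codebooks = {e: KMeans(n_clusters=16, n_init=4).fit(f) for e, f in train.items()}

def classify(frames):
    # mean distance of each frame to its nearest codeword, per emotion
    distortion = {e: cb.transform(frames).min(axis=1).mean()
                  for e, cb in codebooks.items()}
    return min(distortion, key=distortion.get)

print(classify(rng.normal(loc=1, size=(120, 13))))  # should lean toward 'joy'
```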

Does Story Enhance Social Cognitive Ability? Associations between Working Memory and Perspective Taking Ability

  • Ahn, Dohyun
    • The Journal of the Korea Contents Association, v.19 no.9, pp.101-111, 2019
  • This study examined the association between working memory and social cognitive ability, and the influence of story use on social cognitive ability. To this end, working memory was measured via an n-back task, and 82 participants were randomly assigned to three groups (5th-level intentionality, 3rd-level intentionality, and exposition conditions); the accuracy of perspective taking and of emotion recognition (via the Reading the Mind in the Eyes Test, RMET) were then compared as measures of social cognitive ability. The results suggested that perspective-taking accuracy was significantly associated with working memory capacity, whereas emotion recognition accuracy was not. Contrary to the hypothesis, perspective taking in the 5th-level intentionality story group was significantly lower than in the 3rd-level intentionality story group. Emotion recognition accuracy did not differ significantly among the three groups. Overall, this study produced inconsistent results, which are discussed in terms of theory and methods.