• Title/Abstract/Keyword: Accuracy of Emotion Recognition


Speech Emotion Recognition Using 2D-CNN with Mel-Frequency Cepstrum Coefficients

  • Eom, Youngsik;Bang, Junseong
    • Journal of Information and Communication Convergence Engineering / Vol. 19, No. 3 / pp.148-154 / 2021
  • With the advent of context-aware computing, many attempts have been made to understand emotions. Among them, Speech Emotion Recognition (SER) recognizes a speaker's emotions from speech information; SER succeeds when distinctive features are selected and classified appropriately. In this paper, the performance of SER using neural network models (e.g., a fully connected network (FCN) and a convolutional neural network (CNN)) with Mel-Frequency Cepstral Coefficients (MFCC) is examined in terms of the accuracy and distribution of emotion recognition. On the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS), after tuning model parameters, a two-dimensional Convolutional Neural Network (2D-CNN) with MFCC showed the best performance, with an average accuracy of 88.54% across five emotions (anger, happiness, calm, fear, and sadness) for both men and women. Moreover, the distribution of recognition accuracies indicates that the 2D-CNN with MFCC can be expected to achieve an overall accuracy of 75% or more.
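
A minimal sketch (not the authors' code) of the kind of MFCC + 2D-CNN pipeline this abstract evaluates, assuming librosa and PyTorch; the file path, layer sizes, and fixed frame count are illustrative assumptions.

```python
import librosa
import numpy as np
import torch
import torch.nn as nn

def extract_mfcc(path, sr=22050, n_mfcc=40, max_frames=216):
    """Load audio and return a fixed-size (n_mfcc, max_frames) MFCC matrix."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    if mfcc.shape[1] < max_frames:  # zero-pad so samples batch cleanly
        mfcc = np.pad(mfcc, ((0, 0), (0, max_frames - mfcc.shape[1])))
    return mfcc[:, :max_frames]

class EmotionCNN(nn.Module):
    """Small 2D-CNN over the (n_mfcc x frames) 'image' of coefficients."""
    def __init__(self, n_classes=5):  # anger, happiness, calm, fear, sadness
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, n_classes)
        )
    def forward(self, x):  # x: (batch, 1, n_mfcc, frames)
        return self.classifier(self.features(x))

# Usage (hypothetical file):
# x = torch.from_numpy(extract_mfcc("speech.wav")).float()[None, None]
# logits = EmotionCNN()(x)
```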

정신분열병 환자에서의 감정표현불능증과 얼굴정서인식결핍 (Alexithymia and the Recognition of Facial Emotion in Schizophrenic Patients)

  • 노진찬;박성혁;김경희;김소율;신성웅;이건석
    • 생물정신의학 (Korean Journal of Biological Psychiatry) / Vol. 18, No. 4 / pp.239-244 / 2011
  • Objectives Schizophrenic patients have been shown to be impaired both in emotional self-awareness and in recognizing others' facial emotions. Alexithymia refers to deficits in emotional self-awareness. The relationship between alexithymia and recognition of others' facial emotions needs to be explored to better understand the characteristics of emotional deficits in schizophrenic patients. Methods Thirty control subjects and 31 schizophrenic patients completed the Toronto Alexithymia Scale-20-Korean version (TAS-20K) and a facial emotion recognition task. The stimuli in the facial emotion recognition task consisted of six emotions (happiness, sadness, anger, fear, disgust, and neutral). Recognition accuracy was calculated within each emotion category, and correlations between TAS-20K scores and recognition accuracy were analyzed. Results The schizophrenic patients showed higher TAS-20K scores and lower recognition accuracy than the control subjects. Unlike the controls, the schizophrenic patients showed no significant correlations between TAS-20K scores and recognition accuracy. Conclusions The data suggest that, although schizophrenia may impair both emotional self-awareness and recognition of others' facial emotions, the degree of deficit can differ between the two. This indicates that the emotional deficits in schizophrenia may assume more complex features.
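
A minimal sketch of the group-wise correlation analysis this abstract describes, using scipy; the score arrays below are hypothetical placeholders, not study data.

```python
from scipy.stats import pearsonr
import numpy as np

# Hypothetical per-subject values for one group (e.g., controls):
tas20k = np.array([45, 52, 61, 38, 57])              # TAS-20K total scores
accuracy = np.array([0.82, 0.74, 0.61, 0.88, 0.70])  # recognition accuracy

# Pearson correlation within the group; the study reports significant
# correlations in controls but not in patients.
r, p = pearsonr(tas20k, accuracy)
print(f"r = {r:.2f}, p = {p:.3f}")
```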

사용자의 성향 기반의 얼굴 표정을 통한 감정 인식률 향상을 위한 연구 (A study on the enhancement of emotion recognition through facial expression detection in user's tendency)

  • 이종식;신동희
    • 감성과학 (Science of Emotion and Sensibility) / Vol. 17, No. 1 / pp.53-62 / 2014
  • Although technology for recognizing human emotions has many application areas, it remains an unsolved problem because of the difficulty of emotion recognition. Human emotions can largely be recognized from video and speech, and much research is underway on image-based methods, speech-based methods, and methods that combine the two. In particular, research on emotion recognition from facial images, the most universal way emotions are expressed, is being actively pursued. However, such systems still encounter large variations and errors depending on the user's environment and the user's adaptation. To improve the emotion recognition rate, this paper proposes a mechanism that understands and analyzes a user's inner tendencies and uses this analysis to improve the accuracy of emotion recognition. By analyzing these inner tendencies and applying them to the emotion recognition system, errors in facial-expression-based emotion recognition can be reduced. In particular, the proposed method can provide an improved recognition rate for users whose facial expressions are weak or who are reluctant to express emotion.

음성감정인식 성능 향상을 위한 트랜스포머 기반 전이학습 및 다중작업학습 (Transformer-based transfer learning and multi-task learning for improving the performance of speech emotion recognition)

  • 박순찬;김형순
    • 한국음향학회지 (The Journal of the Acoustical Society of Korea) / Vol. 40, No. 5 / pp.515-522 / 2021
  • Training data for speech emotion recognition is difficult to obtain in sufficient quantity because of the difficulty of emotion labeling. To improve speech emotion recognition performance, this paper applies transfer learning to a transformer-based model using large-scale training data for speech recognition. It also proposes a method that exploits contextual information without separate decoding, through multi-task learning with speech recognition. Speech emotion recognition experiments on the IEMOCAP dataset achieve a weighted accuracy of 70.6% and an unweighted accuracy of 71.6%, showing that the proposed method is effective in improving speech emotion recognition performance.
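
A hedged sketch of the general recipe this abstract describes: a transformer speech encoder pretrained for large-scale ASR, fine-tuned with an utterance-level emotion head plus an auxiliary frame-level ASR (CTC) head. The model name, head sizes, and mean pooling are assumptions, not the paper's exact configuration.

```python
import torch.nn as nn
from transformers import Wav2Vec2Model

class MultiTaskSER(nn.Module):
    def __init__(self, n_emotions=4, vocab_size=32):
        super().__init__()
        # Transfer learning: encoder pretrained on large-scale ASR data.
        self.encoder = Wav2Vec2Model.from_pretrained(
            "facebook/wav2vec2-base-960h")
        hidden = self.encoder.config.hidden_size
        self.emotion_head = nn.Linear(hidden, n_emotions)  # utterance-level
        self.ctc_head = nn.Linear(hidden, vocab_size)      # frame-level ASR

    def forward(self, waveform):  # waveform: (batch, samples)
        h = self.encoder(waveform).last_hidden_state  # (batch, frames, hidden)
        emotion_logits = self.emotion_head(h.mean(dim=1))  # pool over time
        ctc_logits = self.ctc_head(h)  # auxiliary multi-task target
        return emotion_logits, ctc_logits

# Training would combine cross-entropy on emotion labels with a weighted
# CTC loss on transcripts, so context is learned without separate decoding.
```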

Multimodal Parametric Fusion for Emotion Recognition

  • Kim, Jonghwa
    • International Journal of Advanced Smart Convergence / Vol. 9, No. 1 / pp.193-201 / 2020
  • The main objective of this study is to investigate the impact of additional modalities on the performance of emotion recognition using speech, facial expression, and physiological measurements. To compare different approaches, we designed a feature-based recognition system as a benchmark that carries out linear supervised classification followed by leave-one-out cross-validation. For the classification of four emotions, bimodal fusion in our experiment improved the recognition accuracy of the unimodal approach, while the performance of trimodal fusion varied strongly from individual to individual. Furthermore, we observed extremely high disparity between single-class recognition rates, and no single modality performed best in our experiment. Based on these observations, we developed a novel fusion method, called parametric decision fusion (PDF), which builds emotion-specific classifiers and exploits the advantages of a parameterized decision process. Using the PDF scheme, we achieved a 16% improvement in accuracy for subject-dependent recognition and 10% for subject-independent recognition compared with the best unimodal results.
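
The PDF scheme itself is not spelled out in the abstract; below is a simplified sketch of one plausible reading, combining per-modality, per-emotion decision scores with learned weights. All names, shapes, and the weighting step are illustrative assumptions, not the paper's formulation.

```python
import numpy as np

def parametric_decision_fusion(scores_per_modality, weights):
    """Decision-level fusion with emotion-specific, per-modality parameters.

    scores_per_modality: list of (n_samples, n_emotions) score arrays,
        one array per modality (e.g., speech, face, physiology).
    weights: (n_modalities, n_emotions) parameters, e.g., tuned on a
        validation set to favor the modality that is reliable per emotion.
    """
    stacked = np.stack(scores_per_modality)            # (M, N, E)
    fused = (stacked * weights[:, None, :]).sum(axis=0)  # weighted sum
    return fused.argmax(axis=1)                        # fused class decision
```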

Emotion Recognition Implementation with Multimodalities of Face, Voice and EEG

  • Udurume, Miracle;Caliwag, Angela;Lim, Wansu;Kim, Gwigon
    • Journal of Information and Communication Convergence Engineering / Vol. 20, No. 3 / pp.174-180 / 2022
  • Emotion recognition is an essential component of complete interaction between human and machine. The difficulties of emotion recognition stem from the different forms in which emotions are expressed, such as visual, sound, and physiological signals. Recent advances in the field show that combined modalities, such as visual, voice, and electroencephalography signals, lead to better results than single modalities used separately. Previous studies have explored the use of multiple modalities for accurate prediction of emotion; however, studies of real-time implementation are limited because of the difficulty of implementing multiple emotion recognition modalities simultaneously. In this study, we propose an emotion recognition system for real-time implementation. Our model is built around a multithreading block that runs each modality in a separate thread for continuous synchronization. We first achieved emotion recognition for each modality separately before enabling the multithreaded system. To verify the correctness of the results, we compared the accuracies of unimodal and multimodal emotion recognition in real time. The experimental results demonstrated real-time user emotion recognition with the proposed model and confirmed the effectiveness of multimodality: the multimodal model obtained an accuracy of 80.1%, compared with unimodal accuracies of 70.9%, 54.3%, and 63.1%.
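
A minimal sketch of the multithreaded layout this abstract describes: one worker thread per modality feeding a shared queue for synchronization. The per-modality predict functions are hypothetical placeholders for the actual face, voice, and EEG models.

```python
import threading
import queue

results = queue.Queue()  # shared channel for per-modality predictions

def modality_worker(name, predict_fn, stop_event):
    """Run one modality's recognition loop in its own thread."""
    while not stop_event.is_set():
        emotion = predict_fn()  # e.g., one inference on the latest frame
        results.put((name, emotion))

stop = threading.Event()
for name, fn in [("face", lambda: "happy"),    # placeholder predictors
                 ("voice", lambda: "happy"),
                 ("eeg", lambda: "neutral")]:
    threading.Thread(target=modality_worker, args=(name, fn, stop),
                     daemon=True).start()

# A fusion loop would drain `results`, collect the latest prediction from
# each modality, and combine them into a single multimodal decision;
# stop.set() shuts the workers down.
```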

뇌파 및 심전도 복합 생체신호를 이용한 실시간 감정인식 인터페이스 연구 (Research of Real-Time Emotion Recognition Interface Using Multiple Physiological Signals of EEG and ECG)

  • 신동민;신동일;신동규
    • 한국게임학회 논문지 (Journal of Korea Game Society) / Vol. 15, No. 2 / pp.105-114 / 2015
  • We propose a real-time user interface based on emotion recognition that combines EEG and ECG physiological signals. To improve the low accuracy that has been a problem of EEG-only emotion recognition, we developed a multimodal physiological-signal emotion recognition system that mixes the relative power values of the EEG theta, alpha, beta, and gamma bands with the autonomic nervous system ratio from the ECG. To recognize six emotions (happiness, fear, sadness, pleasure, anger, and disgust), we create a per-user data map that stores probability values, and we propose an algorithm that updates per-channel weights to improve recognition accuracy. Comparing experimental results for EEG-only data with combined EEG/ECG data showed an accuracy increase of 23.77%. With its high accuracy, the proposed interface system can serve as the interface needed for controlling games and smart spaces.
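
A hedged sketch of the feature extraction this abstract implies: relative EEG band powers (theta/alpha/beta/gamma) and an ECG autonomic (LF/HF) ratio. Band edges, sampling rates, and the subsequent data-map/weight-update step are assumptions, not the paper's specification.

```python
import numpy as np
from scipy.signal import welch

def band_power(freqs, psd, lo, hi):
    """Integrate the power spectral density over [lo, hi) Hz."""
    mask = (freqs >= lo) & (freqs < hi)
    return np.trapz(psd[mask], freqs[mask])

def eeg_relative_powers(eeg, fs=256):
    """Relative theta/alpha/beta/gamma power of one EEG channel."""
    freqs, psd = welch(eeg, fs=fs, nperseg=fs * 2)
    bands = {"theta": (4, 8), "alpha": (8, 13),
             "beta": (13, 30), "gamma": (30, 45)}
    total = band_power(freqs, psd, 4, 45)
    return {k: band_power(freqs, psd, *v) / total for k, v in bands.items()}

def lf_hf_ratio(rr_freqs, rr_psd):
    """Autonomic balance from the PSD of the ECG RR-interval series."""
    lf = band_power(rr_freqs, rr_psd, 0.04, 0.15)  # sympathetic-dominant
    hf = band_power(rr_freqs, rr_psd, 0.15, 0.40)  # parasympathetic
    return lf / hf
```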

얼굴 특징점 추적을 통한 사용자 감성 인식 (Emotion Recognition based on Tracking Facial Keypoints)

  • 이용환;김흥준
    • 반도체디스플레이기술학회지 (Journal of the Semiconductor & Display Technology) / Vol. 18, No. 1 / pp.97-101 / 2019
  • Understanding and classifying human emotion are important tasks in human-machine communication systems. This paper proposes an emotion recognition method based on extracting facial keypoints, which understands and classifies human emotion using an Active Appearance Model and a proposed classification model for the facial features. The appearance model captures expression variations, which the proposed classification model evaluates as the facial expression changes. The method classifies four basic emotions (normal, happy, sad, and angry). To evaluate its performance, we measured the success rate on common datasets, achieving a best accuracy of 93% and an average of 82.2% in facial emotion recognition. The results show that the proposed method performs emotion recognition effectively compared with existing schemes.
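
The paper's Active Appearance Model and classifier are not reproduced here; below is a generic sketch, assuming tracked keypoint coordinates as input and a standard SVM standing in for the proposed classification model.

```python
import numpy as np
from sklearn.svm import SVC

def landmark_features(pts):
    """pts: (n_points, 2) facial keypoints for one frame.

    Center and scale-normalize so features are invariant to face
    position and size, then flatten to a feature vector."""
    pts = pts - pts.mean(axis=0)
    pts = pts / np.linalg.norm(pts)
    return pts.ravel()

# Hypothetical usage with tracked keypoints and labels
# (normal / happy / sad / angry):
# X = np.array([landmark_features(p) for p in tracked_keypoints])
# clf = SVC(kernel="rbf").fit(X, labels)
# predictions = clf.predict(X)
```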

Half-Against-Half Multi-class SVM Classify Physiological Response-based Emotion Recognition

  • ;고광은;박승민;심귀보
    • 한국지능시스템학회논문지 (Journal of Korean Institute of Intelligent Systems) / Vol. 23, No. 3 / pp.262-267 / 2013
  • The recognition of human emotional state is one of the most important components of efficient human-human and human-computer interaction. In this paper, classifying four emotions (fear, disgust, joy, and neutral) was the main emotion recognition problem, and the experiment was designed around visual stimuli for eliciting emotions, based on the physiological signals of skin conductance (SC), skin temperature (SKT), and blood volume pulse (BVP). To solve this problem, a half-against-half (HAH) multi-class support vector machine (SVM) with a Gaussian radial basis function (RBF) kernel is proposed as an effective technique for improving the accuracy of emotion classification. The experimental results showed that the proposed approach is an efficient method for emotion recognition, with accuracy rates of 90% for neutral, 86.67% for joy, 85% for disgust, and 80% for fear.
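
A compact sketch of the half-against-half idea with RBF-kernel SVMs: a root binary SVM first separates two halves of the class set, then a per-half SVM picks the final class. The particular grouping of the four emotions into halves is an illustrative assumption, not the paper's.

```python
import numpy as np
from sklearn.svm import SVC

class HalfAgainstHalfSVM:
    def __init__(self, left=("fear", "disgust"), right=("joy", "neutral")):
        self.left, self.right = left, right
        self.root = SVC(kernel="rbf")  # decides left half vs right half
        self.leaf = {"L": SVC(kernel="rbf"), "R": SVC(kernel="rbf")}

    def fit(self, X, y):
        half = np.where(np.isin(y, self.left), "L", "R")
        self.root.fit(X, half)  # binary problem over grouped classes
        self.leaf["L"].fit(X[half == "L"], y[half == "L"])
        self.leaf["R"].fit(X[half == "R"], y[half == "R"])
        return self

    def predict(self, X):
        half = self.root.predict(X)  # route each sample to its half
        out = np.empty(len(X), dtype=object)
        for h in ("L", "R"):
            if (half == h).any():
                out[half == h] = self.leaf[h].predict(X[half == h])
        return out
```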

Classification of Three Different Emotion by Physiological Parameters

  • Jang, Eun-Hye;Park, Byoung-Jun;Kim, Sang-Hyeob;Sohn, Jin-Hun
    • 대한인간공학회지 (Journal of the Ergonomics Society of Korea) / Vol. 31, No. 2 / pp.271-279 / 2012
  • Objective: This study classified three different emotional states (boredom, pain, and surprise) using physiological signals. Background: Emotion recognition studies have tried to recognize human emotion from physiological signals; applying emotion recognition to human-computer interaction systems for emotion detection is an important goal. Method: 122 college students participated in this experiment. Three different emotional stimuli were presented to the participants, and physiological signals, i.e., EDA (Electrodermal Activity), SKT (Skin Temperature), PPG (Photoplethysmogram), and ECG (Electrocardiogram), were measured for 1 minute as a baseline and for 1-1.5 minutes during the emotional state. The signals were analyzed over 30 seconds of the baseline and of the emotional state, and 27 features were extracted. Statistical analysis for emotion classification was done by DFA (discriminant function analysis) in SPSS 15.0, using difference values obtained by subtracting baseline values from emotional-state values. Results: Physiological responses during the emotional states differed significantly from the baseline, and the emotion classification accuracy was 84.7%. Conclusion: Our study showed that these emotions can be classified from various physiological signals. However, future work should obtain additional signals from other modalities, such as facial expression, face temperature, or voice, to improve the classification rate, and should examine the stability and reliability of this result compared with the accuracy of emotion classification using other algorithms. Application: This work gives emotion recognition studies a better chance of recognizing various human emotions from physiological signals and can be applied to human-computer interaction systems for emotion recognition. It can also be useful in developing emotion theory, profiling emotion-specific physiological responses, and establishing the basis for emotion recognition systems in human-computer interaction.
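
A minimal sketch of this analysis pipeline, approximating the SPSS discriminant function analysis step with scikit-learn's linear discriminant analysis on baseline-subtracted difference scores; the arrays below are random placeholders, not study data.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

# 27 physiological features per participant, for the baseline window
# and the emotional-state window (placeholder data, 122 participants).
rng = np.random.default_rng(0)
baseline = rng.random((122, 27))
emotional = rng.random((122, 27))
labels = rng.choice(["boredom", "pain", "surprise"], size=122)

# Difference scores remove each participant's baseline level.
X = emotional - baseline

lda = LinearDiscriminantAnalysis()
print(cross_val_score(lda, X, labels, cv=5).mean())  # classification rate
```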