• Title/Summary/Keyword: Visual Emotion


Audio and Video Bimodal Emotion Recognition in Social Networks Based on Improved AlexNet Network and Attention Mechanism

  • Liu, Min; Tang, Jun
    • Journal of Information Processing Systems / v.17 no.4 / pp.754-771 / 2021
  • In continuous dimensional emotion recognition, the parts that highlight emotional expression differ across modes, and the influence of each mode on the emotional state also differs. This paper therefore studies the fusion of the two most important modes in emotion recognition (voice and facial expression) and proposes a dual-modal emotion recognition method that combines an improved AlexNet network with an attention mechanism. After simple preprocessing of the audio and video signals, audio features are first extracted using prior knowledge. Facial expression features are then extracted by the improved AlexNet network. Finally, a multimodal attention mechanism fuses the facial expression and audio features, and an improved loss function mitigates the missing-modality problem, improving the robustness of the model and its emotion recognition performance. The experimental results show that the concordance correlation coefficients (CCC) of the proposed model in the arousal and valence dimensions were 0.729 and 0.718, respectively, which is superior to several comparative algorithms.
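
The evaluation metric above, the concordance correlation coefficient (CCC), is straightforward to compute. Below is a minimal NumPy sketch of Lin's CCC, the standard definition used in continuous emotion benchmarks; the arrays are hypothetical stand-ins, not the paper's data:

```python
import numpy as np

def concordance_correlation_coefficient(y_true, y_pred):
    """Lin's concordance correlation coefficient (CCC): agreement between two
    continuous series, penalizing both decorrelation and mean/scale shifts."""
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()
    cov = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return 2 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)

# Hypothetical example: ground-truth arousal annotations vs. model predictions
rng = np.random.default_rng(0)
truth = rng.normal(0.0, 1.0, 500)
pred = 0.8 * truth + rng.normal(0.0, 0.3, 500)
print(concordance_correlation_coefficient(truth, pred))
```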

Incomplete Cholesky Decomposition based Kernel Cross Modal Factor Analysis for Audiovisual Continuous Dimensional Emotion Recognition

  • Li, Xia; Lu, Guanming; Yan, Jingjie; Li, Haibo; Zhang, Zhengyan; Sun, Ning; Xie, Shipeng
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.810-831 / 2019
  • Recently, continuous dimensional emotion recognition from audiovisual cues has attracted increasing attention in both theory and practice. The large amount of data involved in recognition processing decreases the efficiency of most bimodal information fusion algorithms. This paper presents a novel algorithm, incomplete Cholesky decomposition based kernel cross-modal factor analysis (ICDKCFA), and employs it for continuous dimensional audiovisual emotion recognition. After the ICDKCFA feature transformation, two basic fusion strategies, feature-level fusion and decision-level fusion, are explored to combine the transformed visual and audio features for emotion recognition. Finally, extensive experiments evaluate the ICDKCFA approach on the AVEC 2016 Multimodal Affect Recognition Sub-Challenge dataset. The results show that ICDKCFA runs faster than the original kernel cross-modal factor analysis with comparable performance, and that it outperforms other common information fusion methods such as canonical correlation analysis, kernel canonical correlation analysis, and cross-modal factor analysis based fusion.
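
The efficiency idea named in the title is the incomplete Cholesky decomposition, which replaces an n-by-n kernel (Gram) matrix with a low-rank factor. A generic sketch of the pivoted variant follows; it illustrates the decomposition step only, under the assumption that it is applied to the kernel matrices before the cross-modal factor analysis, which is not reproduced here:

```python
import numpy as np

def incomplete_cholesky(K, tol=1e-6, max_rank=None):
    """Pivoted incomplete Cholesky factorization of a PSD kernel matrix K,
    returning G such that K ~= G @ G.T with rank typically << n."""
    n = K.shape[0]
    max_rank = max_rank or n
    d = np.diag(K).astype(float).copy()          # residual diagonal
    G = np.zeros((n, max_rank))
    for k in range(max_rank):
        j = int(np.argmax(d))                    # pivot: largest residual
        if d[j] <= tol:                          # residual negligible: stop early
            return G[:, :k]
        G[:, k] = (K[:, j] - G[:, :k] @ G[j, :k]) / np.sqrt(d[j])
        d -= G[:, k] ** 2                        # update residual diagonal
    return G

# Hypothetical usage on an RBF kernel over random features
X = np.random.default_rng(1).normal(size=(300, 5))
K = np.exp(-0.5 * ((X[:, None] - X[None]) ** 2).sum(axis=-1))
G = incomplete_cholesky(K, tol=1e-4)
print(G.shape, np.abs(K - G @ G.T).max())        # low rank, small residual
```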

An Analysis on the Image and Visual Preference of the Environmental Sculpture in Urban Streetscapes (도시가로 경관에 있어 환경조형물의 이미지 및 시각적 선호도 분석)

  • Suh, Joo-Hwan; Park, Tae-Hie; Heo, Jun
    • Journal of the Korean Institute of Landscape Architecture / v.32 no.1 / pp.57-68 / 2004
  • The purpose of this paper is to identify the image and visual preference of environmental sculpture in urban streetscapes. The analysis was performed on data obtained from questionnaires and from photographs of environmental sculpture scenes. The landscape image was analyzed by factor analysis, and the level of visual preference was measured by a slide simulation test and analyzed by multiple regression. The results can be summarized as follows. The visual preference of the environmental sculptures averaged 4.03 on a 7-point scale, and landscape slides No. 11 and No. 5 ranked highest for visual preference. The factors formulating the landscape image were found to be 'beauty', 'orderliness', 'emotion', and 'formation', and the total variance (T.V.) explained was 63.0%. For all experimental landscape slides, orderliness was the main factor determining visual preference. Regressing visual preference on the four factor scores gave: Visual Preference = 3.996 + 0.341(FS1) + 0.595(FS2) + 0.222(FS3) + 0.011(FS4), R-square = 0.520.
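
The fitted equation reads as a plain linear predictor over the four factor scores. A small illustrative sketch follows, using the coefficients reported above; the mapping of FS1..FS4 to the four named factors and the sample scores are assumptions:

```python
# Coefficients as reported in the abstract; FS1..FS4 presumably correspond to
# the 'beauty', 'orderliness', 'emotion', and 'formation' factor scores.
INTERCEPT = 3.996
COEFFS = (0.341, 0.595, 0.222, 0.011)

def visual_preference(fs1, fs2, fs3, fs4):
    return INTERCEPT + sum(c * f for c, f in zip(COEFFS, (fs1, fs2, fs3, fs4)))

# Hypothetical standardized factor scores for one landscape slide
print(visual_preference(0.5, 1.2, -0.3, 0.0))  # ~4.81 on the 7-point scale
```

Note that the largest coefficient (0.595, on FS2) dominates the prediction, consistent with the finding that orderliness was the main factor.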

A Study on Evaluation of Visual Factor for Measuring Subjective Virtual Realization (주관적인 가상 실감화 측정 방법에 대한 시각적 요소 평가 연구)

  • Won, Myeung-Ju; Park, Sang-In; Kim, Chi-Jung; Lee, Eui-Chul; Whang, Min-Cheol
    • Science of Emotion and Sensibility / v.15 no.3 / pp.389-398 / 2012
  • Virtual worlds have pursued reality as if they actually exist. To evaluate the sense of reality in computer-simulated worlds, several subjective questionnaires with specific independent variables have been proposed in the literature. However, these questionnaires lack the reliability and validity needed to define and measure virtual realization, and few studies have investigated the effect of visual factors on the sense of reality experienced in a virtual environment. This study therefore reinvestigated the variables and proposed a more reliable questionnaire for evaluating virtual realization, focusing on visual factors. Twenty-one questions were gleaned from the literature and from interviews with focus groups. Exploratory factor analysis with oblique rotation was performed on data obtained from 200 participants (100 females) after exposure to a virtual character image depicted in an extreme way. After poorly loading items were removed, the remaining items were subjected to confirmatory factor analysis on data from the same participants. As a result, 3 significant factors were determined to efficiently measure virtual realization: visual presence (3 items), visual immersion (7 items), and visual interactivity (4 items). The proposed factors were verified in a subjective evaluation in which participants rated a 3D virtual eyeball model on visual presence; the results indicate that the method is suitable for measuring the degree of virtual realization.
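
The analysis pipeline described above (exploratory factor analysis with an oblique rotation, then item pruning) can be sketched with the factor_analyzer package; the response matrix below is a random stand-in for the study's 200 x 21 questionnaire data:

```python
import numpy as np
from factor_analyzer import FactorAnalyzer  # pip install factor_analyzer

# Hypothetical stand-in for 200 participants x 21 questionnaire items (7-point scale)
rng = np.random.default_rng(42)
responses = rng.integers(1, 8, size=(200, 21)).astype(float)

# Exploratory factor analysis with an oblique (oblimin) rotation,
# extracting the 3 factors the study ultimately retained
fa = FactorAnalyzer(n_factors=3, rotation="oblimin")
fa.fit(responses)

print(fa.loadings_.shape)        # (21, 3): item-by-factor loadings
print(fa.get_factor_variance())  # variance explained per factor
```

Items with low loadings would then be dropped before the confirmatory step, as the abstract describes.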


Criterion Suggestion on Relative Disparity, Viewing Distance and Viewing Angle to Minimize 3D Visual Fatigue for Pattern-Retarded Type 3D Display (편광식 3D 디스플레이를 위한 상대적 시차, 시청 거리, 시청 방위에서의 시각피로 최소화 기준 제안)

  • Park, Jong-Jin; Kim, Shinwoo; Li, Hyung-Chul O.
    • Science of Emotion and Sensibility / v.19 no.1 / pp.61-70 / 2016
  • 3D visual fatigue is known as one of the most important factors interfering with the commercial success of 3D content. The vergence-accommodation conflict, which occurs when an observer watches an image containing binocular disparity on a 3D display, has been suggested as a major cause of 3D visual fatigue; this implies that any image incorporating binocular disparity may cause it. To reduce 3D visual fatigue, it is necessary to consider indirect mitigations as well as eliminating its direct causes. We examined the effect of variables expected to affect subjective 3D visual fatigue: the relative disparity contained in an image, the viewing distance, and the viewing angle. We also identified the levels of these variables required to minimize 3D visual fatigue. The results indicate that observers began to report significant 3D visual fatigue when the crossed disparity in an image exceeded 7.22' and when the vertical viewing angle was larger than 15 degrees.
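
For intuition, the 7.22' crossed-disparity threshold can be translated into an on-screen offset at a given viewing distance with simple small-angle geometry. This is a back-of-envelope sketch, not the paper's procedure, and the 1 m viewing distance is hypothetical:

```python
import math

def disparity_to_screen_offset(arcmin, viewing_distance_m):
    """Convert an angular disparity (arcminutes) into the corresponding
    on-screen horizontal offset in metres (small-angle approximation)."""
    angle_rad = math.radians(arcmin / 60.0)
    return viewing_distance_m * angle_rad

offset = disparity_to_screen_offset(7.22, 1.0)
print(f"{offset * 1000:.2f} mm")  # ~2.10 mm of left/right image separation
```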

Applying Emotional Information Retrieval Method to Information Appliances Design -The Use of Color Information for Mobile Emotion Retrieval System- (감성검색법을 기초로 한 정보기기 콘텐츠 디자인 연구 -색채정보를 이용한 모바일 감성검색시스템을 사례로-)

  • Kim, Don-Han; Seo, Kyung-Ho
    • Science of Emotion and Sensibility / v.13 no.3 / pp.501-510 / 2010
  • A knowledge base of emotional information is one of the key elements in implementing emotion retrieval systems for the content design of mobile devices. This study proposed a new approach to building such a knowledge base by automatically extracting color components from full-color images, and empirically tested the validity of the proposed method. A database was developed using 100 interior images as visual stimuli, and 48 subjects participated in the experiment. To test the reliability of the proposed emotional-information knowledge base, a 'recall ratio', the proportion of correct images among the retrieved images, was first derived. Second, a correlation analysis compared the subjects' ratings with the values the system calculated. Finally, the rating comparison was used to run a paired-sample t-test. The analysis demonstrated a satisfactory recall ratio of 62.1%, and a significant positive correlation (p < .01) was observed for all emotion keywords. The paired-sample t-test found that all emotion keywords except 'casual' retrieved images in order from more relevant to less relevant, and the difference was statistically significant (t(9) = 5.528, p < .05). These findings support that a knowledge base established only with color information automatically extracted from images can be used effectively for visual stimuli search tasks such as commercial interior image retrieval.
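
The 'recall ratio' above is defined in the abstract as the share of correct images among the retrieved ones. A minimal sketch under that definition, with hypothetical image IDs:

```python
def recall_ratio(retrieved, correct):
    """Proportion of retrieved images that are correct for the query keyword,
    following the definition given in the abstract."""
    return len(retrieved & correct) / len(retrieved)

# Hypothetical retrieval result for one emotion keyword
correct = {"img03", "img17", "img42", "img58"}
retrieved = {"img03", "img17", "img99", "img42", "img11"}
print(recall_ratio(retrieved, correct))  # 0.6
```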


ANS Responses Induced by Humor and Joy Using Audio-visual Film Clips (동영상 자극에 의해 유발된 유머 및 기쁨 정서에 따른 아동의 자율신경계 반응)

  • Jang, Eun-Hye; Sung, Soon-Im; Lee, Young-Chang; Eom, Jin-Sup; Sohn, Jin-Hun
    • Science of Emotion and Sensibility / v.10 no.2 / pp.263-271 / 2007
  • A review of recent studies indicates that positive emotions help buffer stress. Humor, in particular, is something only humans can appreciate, and it plays an important role in many facets of human life, including psychological, social, and somatic functioning. This study identifies children's ANS responses to humor and joy using audio-visual film clips. 49 male and female children (12-13 years) participated. The participants were briefed on the experiment and its procedure, and electrodes were attached to the fingers of each participant's left hand. Physiological responses (EDA, SKT, PPG, and ECG) were measured for 30 s in the resting state and during the experimental state, in which the emotion-provoking stimulus was presented for 2 min; a 1-min self-report on emotions followed. The self-report showed that both humor and joy were evoked effectively in 89.3% of the children, indicating that the emotion-eliciting stimuli were effective and appropriate. ANS responses to both emotions were significant in three physiological variables (SCL, NSCR, and LF). There was a significant difference between the humor and joy conditions, and the two emotions were distinguishable by specific ANS response patterns.
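
The within-subject comparison of resting and stimulus states reported above is a paired design; a minimal SciPy sketch with synthetic skin conductance level (SCL) values (the numbers are illustrative, not the study's data):

```python
import numpy as np
from scipy import stats

# Hypothetical SCL values for the same 49 children at rest and during the clip
rng = np.random.default_rng(7)
scl_rest = rng.normal(5.0, 1.0, size=49)
scl_humor = scl_rest + rng.normal(0.6, 0.5, size=49)  # assumed increase

# Paired t-test: the same participants measured in both states
t_stat, p_value = stats.ttest_rel(scl_humor, scl_rest)
print(f"t(48) = {t_stat:.2f}, p = {p_value:.4f}")
```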


Spontaneous Speech Emotion Recognition Based On Spectrogram With Convolutional Neural Network (CNN 기반 스펙트로그램을 이용한 자유발화 음성감정인식)

  • Guiyoung Son; Soonil Kwon
    • The Transactions of the Korea Information Processing Society / v.13 no.6 / pp.284-290 / 2024
  • Speech emotion recognition (SER) is a technique used to analyze a speaker's voice patterns, including vibration, intensity, and tone, to determine their emotional state. Interest in artificial intelligence (AI) techniques has increased, and they are now widely used in medicine, education, industry, and the military. Nevertheless, existing research has attained impressive results mainly by using acted speech recorded by skilled actors in controlled environments. There is a mismatch between acted and spontaneous speech, since acted speech includes more explicit emotional expression; for this reason, spontaneous speech emotion recognition remains a challenging task. This paper conducts emotion recognition on spontaneous speech data and improves its performance. To this end, we implement deep learning-based speech emotion recognition using the VGG (Visual Geometry Group) network after converting the 1-dimensional audio signal into a 2-dimensional spectrogram image. The experimental evaluations are performed on the Korean spontaneous emotional speech database from AI-Hub, consisting of 7 emotions: joy, love, anger, fear, sadness, surprise, and neutral. Using the time-frequency 2-dimensional spectrogram, we achieved average accuracies of 83.5% and 73.0% for adults and young people, respectively. In conclusion, our findings demonstrate that the suggested framework outperformed state-of-the-art techniques for spontaneous speech and showed promising performance despite the difficulty of quantifying spontaneous emotional expression.
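
The pipeline the abstract describes (1-D waveform to 2-D spectrogram to a VGG classifier over 7 classes) can be sketched as follows; the file path, sample rate, mel parameters, and resizing strategy are assumptions, not the paper's settings:

```python
import librosa
import numpy as np
import torch
import torchvision

# Load a waveform and convert it to a log-mel spectrogram image
wav, sr = librosa.load("utterance.wav", sr=16000)   # hypothetical file
mel = librosa.feature.melspectrogram(y=wav, sr=sr, n_mels=128)
mel_db = librosa.power_to_db(mel, ref=np.max)

# Normalize to [0, 1], resize to 224x224, repeat to 3 channels for VGG
x = torch.tensor((mel_db - mel_db.min()) / (mel_db.max() - mel_db.min()),
                 dtype=torch.float32)[None, None]
x = torch.nn.functional.interpolate(x, size=(224, 224), mode="bilinear")
x = x.repeat(1, 3, 1, 1)

# VGG16 with its final layer replaced for the 7 emotion classes
model = torchvision.models.vgg16(weights=None)
model.classifier[6] = torch.nn.Linear(4096, 7)  # joy, love, anger, fear, sadness, surprise, neutral
logits = model(x)
print(logits.shape)  # torch.Size([1, 7])
```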

The Effects of Color Hue-Tone on Recognizing Emotions of Characters in the Film, Les Misérables

  • Kim, Yu-Jin
    • Science of Emotion and Sensibility / v.18 no.1 / pp.67-78 / 2015
  • This study investigated, through three practical experiments, whether people experience a correspondence between color hue-tone and the main characters' emotions in the 2012 British musical drama film Les Misérables. Six screen images representing the characters' different emotions (Parrott's six primary types: love, joy, surprise, anger, sadness, and fear) were selected. For each screen image, participants judged the degree of the character's dominant emotion evoked by 17 variants, consisting of the original chromatic and achromatized images as well as 15 color-filtered images (5 hues × 3 tones of the IRI color system). These tasks revealed that a chromatic color scheme is more effective than an achromatic one at conveying the characters' positive emotions (love and joy), and that the hue and tone dimensions partially influence the relationships between character emotions and colors.
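
Generating hue-filtered variants of a film still, as in the 5-hue test images above, can be approximated by rotating the hue channel in HSV space. This is a simplified stand-in (the paper's IRI-based hue/tone filtering is more elaborate), and the file name is hypothetical:

```python
import numpy as np
from PIL import Image

def hue_shifted(img, degrees):
    """Return a copy of the image with its hue rotated by the given angle."""
    hsv = np.array(img.convert("HSV")).astype(int)
    hsv[..., 0] = (hsv[..., 0] + int(degrees / 360.0 * 255)) % 256  # PIL hue is 0..255
    return Image.fromarray(hsv.astype("uint8"), mode="HSV").convert("RGB")

frame = Image.open("screen_image.png").convert("RGB")  # hypothetical film still
variants = [hue_shifted(frame, d) for d in (0, 72, 144, 216, 288)]  # 5 hues
achromatized = frame.convert("L").convert("RGB")       # grayscale version
```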

A Study of Motion GUI for Emotional Communication (감성정보전달을 위한 Motion GUI 연구)

  • Ban, Sang-Hui; Kim, Yeong-Seon
    • Proceedings of the Korean Society for Emotion and Sensibility Conference / 2007.05a / pp.146-148 / 2007
  • Since media devices became digital, the GUI has served as a medium that increases the usability of digital devices. As the graphics capabilities of recently released devices have improved, the role of the GUI has expanded from a passive element that improves access to applications into a more active element that provides users with diverse new emotional experiences and values. In this paper, a Motion GUI with high visual salience is introduced to improve users' emotional satisfaction. The visual components of the Motion GUI are classified, and the emotional information conveyed by each component is derived. In addition, a Motion GUI designed by applying these findings is applied to an existing media player screen that previously consisted of minimal graphics, and the results regarding its usability and users' emotional satisfaction are described.
