
Multi-Emotion Recognition Model with Text and Speech Ensemble  

Yi, Moung Ho (Dept. of Electronic Engineering, Chosun University)
Lim, Myoung Jin (School of New Industry Convergence, Chosun University)
Shin, Ju Hyun (School of New Industry Convergence, Chosun University)
Publication Information
Smart Media Journal, vol. 11, no. 8, 2022, pp. 65-72
Abstract
Due to COVID-19, counseling has shifted from face-to-face to non-face-to-face formats, and the importance of remote counseling is growing. Non-face-to-face counseling can be conducted online anytime and anywhere and carries no risk of COVID-19 infection; however, because non-verbal cues are difficult to convey, it is hard for the counselor to understand the client's state of mind. Accurately recognizing emotions from text and voice is therefore essential in non-face-to-face counseling. In this paper, text data are vectorized with FastText after each Korean syllable is separated into its consonant and vowel units, and voice data are vectorized by extracting Log Mel Spectrogram and MFCC features, respectively. We propose a multi-emotion recognition model that recognizes five emotions from these vectorized inputs using an LSTM model, and we evaluate multi-emotion recognition with RMSE. In experiments, the proposed model achieved an RMSE of 0.2174, the lowest error compared with the models that used text data or voice data alone.
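As a rough illustration of two pieces of the pipeline described above, the following standard-library Python sketch shows (1) the kind of consonant/vowel (jamo) separation applied to Korean text before FastText vectorization, and (2) the RMSE metric used to score per-emotion predictions. The helper names `decompose_jamo` and `rmse` are hypothetical; the paper's actual FastText, Log Mel Spectrogram/MFCC, and LSTM ensemble stages are not reproduced here.

```python
import math

# Precomposed Unicode Hangul syllables are built algorithmically:
#   syllable = 0xAC00 + (initial * 21 + medial) * 28 + final
CHO = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")            # 19 initial consonants
JUNG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")        # 21 medial vowels
JONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 27 finals (+ none)

def decompose_jamo(text: str) -> str:
    """Split each Hangul syllable into jamo, e.g. '감정' -> 'ㄱㅏㅁㅈㅓㅇ'."""
    out = []
    for ch in text:
        code = ord(ch) - 0xAC00
        if 0 <= code < 11172:                    # precomposed Hangul syllable range
            out.append(CHO[code // 588])         # 588 = 21 * 28
            out.append(JUNG[(code % 588) // 28])
            out.append(JONG[code % 28])          # empty string when no final consonant
        else:
            out.append(ch)                       # pass non-Hangul characters through
    return "".join(out)

def rmse(y_true, y_pred):
    """Root-mean-square error over per-emotion intensity predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
```

In the paper's five-emotion setting, `y_true` and `y_pred` would each hold five intensity values per utterance, and the jamo-separated string would be the input fed to FastText.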
Keywords
FastText; Log Mel Spectrogram; MFCC; Ensemble; Multi-emotion recognition;
Citations & Related Records
Times Cited By KSCI: 1