Speech emotion recognition through time series classification

Kim, Gi-duk; Kim, Mi-sook; Lee, Hack-man

Proceedings of the Korean Society of Computer Information Conference (한국컴퓨터정보학회:학술대회논문집)

2021.07a / Pages 11-13 / 2021

Korean Society of Computer Information (한국컴퓨터정보학회)

시계열 데이터 분류를 통한 음성 감정 인식

Kim, Gi-duk (Dept. of Electricity and Electronic Computer Engineering, Pusan National University) ;
Kim, Mi-sook (Dept. of Multimedia, Pusan National University) ;
Lee, Hack-man (Dept. of Computer Engineering, Pusan National University)

김기덕 (부산대학교 전기전자컴퓨터공학과) ;
김미숙 (부산대학교 멀티미디어협동과정) ;
이학만 (부산대학교 전자계산학과)

Published: 2021.07.14

Abstract

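This paper proposes a speech emotion recognition method based on time series classification. Features are extracted from audio files using mel-spectrograms and converted into multivariate time series data. These data are used to train a deep learning model that combines Conv1D, GRU, and Transformer layers. Applying this model to the speech emotion recognition datasets TESS, SAVEE, RAVDESS, and EmoDB yielded higher emotion classification accuracy than existing models on each dataset. The resulting accuracies were 99.60%, 99.32%, 97.28%, and 99.86%.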
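
The feature extraction step described in the abstract, turning an audio signal into a multivariate time series via a mel-spectrogram, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the frame length, hop size, number of mel bands, the HTK-style mel formula, and the synthetic test tone are all assumptions chosen for illustration, and the function names are hypothetical. Each row of the result is one time step and each column is one mel band, which is the shape a sequence model such as a Conv1D, GRU, or Transformer stack would typically consume.

```python
import numpy as np

def hz_to_mel(f):
    # HTK-style mel scale; one common convention, assumed here.
    return 2595.0 * np.log10(1.0 + np.asarray(f, dtype=float) / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (np.asarray(m, dtype=float) / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_edges = mel_to_hz(mel_edges)
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = hz_edges[i], hz_edges[i + 1], hz_edges[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def mel_time_series(signal, sr, n_fft=512, hop=256, n_mels=40):
    """Return a (frames, n_mels) log-mel array: one feature vector per frame."""
    signal = np.asarray(signal, dtype=float)
    if signal.size < n_fft:
        # Zero-pad very short inputs so at least one frame exists.
        signal = np.pad(signal, (0, n_fft - signal.size))
    n_frames = 1 + (signal.size - n_fft) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(n_fft)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hanning(n_fft)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2      # (frames, n_fft // 2 + 1)
    mel = power @ mel_filterbank(sr, n_fft, n_mels).T     # (frames, n_mels)
    return 10.0 * np.log10(mel + 1e-10)                   # decibel scale

# A synthetic one-second 440 Hz tone stands in for a speech recording.
sr = 16000
t = np.arange(sr) / sr
features = mel_time_series(np.sin(2 * np.pi * 440.0 * t), sr)
print(features.shape)  # (61, 40): 61 time steps, 40 variables per step
```

In practice a library routine for mel-spectrograms would usually replace the hand-written filterbank, and the resulting array would be padded or cropped to a fixed number of frames before being batched for training.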
