• Title/Summary/Keyword: 음성감정인식 (speech emotion recognition)

Trends in Brain Wave Signal and Application Technology (뇌파신호 및 응용 기술 동향)

  • Kim, D.Y.;Lee, J.H.;Park, M.H.;Choi, Y.H.;Park, Y.O.
    • Electronics and Telecommunications Trends / v.32 no.2 / pp.19-28 / 2017
  • EEG signals are a useful source of information from which human thoughts and emotions can be acquired, interpreted, and analyzed in the most practical way. After speech recognition, EEG is a promising and ultimate means of enabling convenient and natural hyper-connected access and communication between people, between people and things, and between people and computers. However, as long as EEG is interpreted merely as the sum of averaged electromagnetic signals produced by the chemical activation of synapses formed between neurons during brain activity, it cannot be linked to the complex patterns of human thought and emotion that brain science has uncovered. To overcome this limitation and rediscover the technical value and potential of EEG as a future means of hyper-connected access and communication, this article reviews trends in EEG signal processing, interpretation, and application technologies aimed at interpreting EEG in conjunction with the thought and emotion circuits being revealed by brain science.

Extracting Speech Parameters for Intonational Differences between the Seoul Dialect and the Other Dialects of Korean (서울말과 방언사이의 억양차이 파라미터 추출)

  • Lee, Kang-Hee
    • Proceedings of the Korean Society of Computer Information Conference / 2016.01a / pp.211-212 / 2016
  • Speech recognition technology has been researched and developed continuously for a considerable period, and with the recent rapid spread of smartphones its necessity, that is, the demand for high-quality commercial services, has become widespread. Even in this environment, Korean is in fact very likely to be treated with relative neglect. Such neglect is not merely a technical problem; as a matter of language, it is directly tied to culture. Research on Korean speech recognition is therefore essential and deserves substantial government policy support, yet the current situation falls far short. Anticipating that this need will soon come to the fore, we prepare for it by studying specialized areas, namely the standard language, its dialects, and emotionally expressive speech.

Therapeutic Robot Action Design for ASD Children Using Speech Data (음성 정보를 이용한 자폐아 치료용 로봇의 동작 설계)

  • Lee, Jin-Gyu;Lee, Bo-Hee
    • Journal of IKEEE / v.22 no.4 / pp.1123-1130 / 2018
  • A cat robot for the treatment of Autism Spectrum Disorder (ASD) was designed and field-tested. The robot expressed emotion through actions triggered by touch interaction and performed reasonable emotional expression based on an Artificial Neural Network (ANN). However, these operations were difficult to use across the various healing activities. In this paper, we describe a motion design that can be used in a variety of contexts and that reacts flexibly to various kinds of situations. As a necessary element, a speech recognition system using a speech data collection method and an ANN is suggested, and the classification results are analyzed after experiments. In future work, the ANN will be improved by collecting diverse voice data to raise its accuracy, and its effectiveness will be checked through field tests.
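
The entry above describes mapping collected speech data to robot behavior with an ANN. As a rough, hypothetical illustration only (not the authors' implementation), the following Python sketch trains a small multilayer perceptron on placeholder feature vectors and predicts an action class for a new utterance; the feature dimension, class count, and data are all invented for the example.

```python
# Hypothetical sketch: an ANN mapping speech feature vectors to robot
# action classes. Data, dimensions, and class count are invented.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 13))    # 200 utterances, 13 features each
y_train = rng.integers(0, 4, size=200)  # 4 hypothetical robot action classes

ann = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
ann.fit(X_train, y_train)

x_new = rng.normal(size=(1, 13))        # features from a new utterance
print(ann.predict(x_new))               # predicted action class
```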

Speech Emotion Recognition on a Simulated Intelligent Robot (모의 지능로봇에서의 음성 감정인식)

  • Jang Kwang-Dong;Kim Nam;Kwon Oh-Wook
    • MALSORI / no.56 / pp.173-183 / 2005
  • We propose a speech emotion recognition method for an affective human-robot interface. In the proposed method, emotion is classified into 6 classes: angry, bored, happy, neutral, sad, and surprised. Features for an input utterance are extracted from statistics of phonetic and prosodic information. Phonetic information includes log energy, shimmer, formant frequencies, and Teager energy; prosodic information includes pitch, jitter, duration, and rate of speech. Finally, a pattern classifier based on Gaussian support vector machines decides the emotion class of the utterance. We recorded speech commands and dialogs uttered at 2 m away from microphones in 5 different directions. Experimental results show that the proposed method yields 48% classification accuracy while human classifiers give 71% accuracy.

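The pipeline in this abstract, statistics of phonetic and prosodic features fed to a Gaussian (RBF-kernel) support vector machine, can be sketched as follows. This is a minimal illustration assuming librosa and scikit-learn; the six emotion labels follow the abstract, while the specific feature statistics, pitch range, and training data are placeholder assumptions.

```python
# Minimal sketch (assumes librosa, scikit-learn): statistics of pitch and
# log energy per utterance, classified by a Gaussian (RBF-kernel) SVM.
import numpy as np
import librosa
from sklearn.svm import SVC

EMOTIONS = ["angry", "bored", "happy", "neutral", "sad", "surprised"]

def utterance_features(y, sr):
    """Simple stand-in for the paper's phonetic/prosodic statistics."""
    f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)        # pitch track
    log_e = np.log(librosa.feature.rms(y=y)[0] + 1e-8)   # log energy
    return np.array([f0.mean(), f0.std(), f0.max(),
                     log_e.mean(), log_e.std(), log_e.max()])

# Placeholder training data: one 6-dim feature vector per utterance.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 6))
y = rng.integers(0, len(EMOTIONS), size=120)

clf = SVC(kernel="rbf")   # Gaussian support vector machine
clf.fit(X, y)
print(EMOTIONS[int(clf.predict(X[:1])[0])])
```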

Feature Vector Processing for Speech Emotion Recognition in Noisy Environments (잡음 환경에서의 음성 감정 인식을 위한 특징 벡터 처리)

  • Park, Jeong-Sik;Oh, Yung-Hwan
    • Phonetics and Speech Sciences / v.2 no.1 / pp.77-85 / 2010
  • This paper proposes an efficient feature vector processing technique to guard the Speech Emotion Recognition (SER) system against a variety of noises. In the proposed approach, emotional feature vectors are extracted from speech processed by comb filtering, and these vectors are then used to construct robust models based on feature vector classification. We modify conventional comb filtering by using the speech presence probability, to minimize the drawbacks of incorrect pitch estimation under background noise. The modified comb filtering correctly enhances the harmonics, which are an important factor in SER. The feature vector classification technique categorizes feature vectors into discriminative and non-discriminative vectors based on a log-likelihood criterion; it selects the discriminative vectors while preserving correct emotional characteristics, so that robust emotion models can be constructed from the discriminative vectors alone. In SER experiments using an emotional speech corpus contaminated by various noises, our approach exhibited superior performance to the baseline system.

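The feature vector classification step described above keeps only "discriminative" vectors selected by a log-likelihood criterion. Below is a minimal sketch of that idea under assumed GMM sizes, placeholder data, and a zero threshold; the paper's actual models and criterion values are not reproduced here.

```python
# Sketch: keep frames whose log-likelihood under an emotion GMM exceeds
# that under a background GMM. Sizes, data, and threshold are assumptions.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
frames = rng.normal(size=(500, 12))               # vectors from one emotion
background = rng.normal(1.0, 2.0, size=(500, 12))

gmm_emotion = GaussianMixture(n_components=4, random_state=0).fit(frames)
gmm_background = GaussianMixture(n_components=4, random_state=0).fit(background)

llr = gmm_emotion.score_samples(frames) - gmm_background.score_samples(frames)
discriminative = frames[llr > 0.0]                # vectors above the criterion
print(f"kept {len(discriminative)} of {len(frames)} vectors")
```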

Extraction of Speech Features for Emotion Recognition (감정 인식을 위한 음성 특징 도출)

  • Kwon, Chul-Hong;Song, Seung-Kyu;Kim, Jong-Yeol;Kim, Keun-Ho;Jang, Jun-Su
    • Phonetics and Speech Sciences / v.4 no.2 / pp.73-78 / 2012
  • Emotion recognition is an important technology in the field of human-machine interfaces. To apply speech technology to emotion recognition, this study aims to establish a relationship between emotional groups and their corresponding voice characteristics by investigating various speech features, including features related to the speech source and to the vocal tract filter. Experimental results show that the statistically significant speech parameters for classifying the emotional groups are mainly related to the speech source, such as jitter, shimmer, F0 (F0_min, F0_max, F0_mean, F0_std), harmonic parameters (H1, H2, HNR05, HNR15, HNR25, HNR35), and SPI.
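
For the speech source parameters listed above, a minimal extraction sketch is shown below; it computes F0 statistics with librosa's YIN implementation and a standard local jitter approximation from period-to-period differences. Shimmer, the harmonic measures (H1, H2, HNR), and SPI typically require a tool such as Praat and are omitted; the audio file is a placeholder.

```python
# Sketch: F0 statistics and a local jitter approximation. The audio is a
# librosa example file standing in for emotional speech recordings.
import numpy as np
import librosa

y, sr = librosa.load(librosa.ex("trumpet"))   # placeholder audio
f0 = librosa.yin(y, fmin=60, fmax=400, sr=sr)
f0 = f0[np.isfinite(f0)]

periods = 1.0 / f0                            # pitch periods in seconds
jitter_local = np.mean(np.abs(np.diff(periods))) / np.mean(periods)

print({"F0_min": f0.min(), "F0_max": f0.max(),
       "F0_mean": f0.mean(), "F0_std": f0.std(),
       "jitter_local": jitter_local})
```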

Improved speech emotion recognition using histogram equalization and data augmentation techniques (히스토그램 등화와 데이터 증강 기법을 이용한 개선된 음성 감정 인식)

  • Heo, Woon-Haeng;Kwon, Oh-Wook
    • Phonetics and Speech Sciences / v.9 no.2 / pp.77-83 / 2017
  • We propose a new method to reduce emotion recognition errors caused by variation in speaker characteristics and speech rate. First, to reduce variation in speaker characteristics, we adjust the features of a test speaker to fit the distribution of all training data by using the histogram equalization (HE) algorithm. Second, to deal with variation in speech rate, we augment the training data with speech generated at various speech rates. In computer experiments using EMO-DB, KRN-DB, and eNTERFACE-DB, the proposed method is shown to improve weighted accuracy by a relative 34.7%, 23.7%, and 28.1%, respectively.
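
Both ideas in this abstract can be sketched compactly: histogram equalization as a per-dimension quantile mapping from test features onto the training distribution, and data augmentation as time-stretching of waveforms to simulate different speech rates. The data shapes, stretch factors, and audio file below are illustrative assumptions, not the paper's settings.

```python
# Sketch of (1) feature-space histogram equalization by quantile mapping
# and (2) speech-rate augmentation via time stretching.
import numpy as np
import librosa

def histogram_equalize(test_feat, train_feat):
    """Map each column of test_feat onto the distribution of train_feat."""
    out = np.empty_like(test_feat)
    for d in range(test_feat.shape[1]):
        ranks = np.argsort(np.argsort(test_feat[:, d]))   # 0..N-1
        quantiles = (ranks + 0.5) / len(ranks)
        out[:, d] = np.quantile(train_feat[:, d], quantiles)
    return out

rng = np.random.default_rng(0)
train = rng.normal(0.0, 1.0, size=(1000, 8))
test = rng.normal(2.0, 3.0, size=(200, 8))    # speaker-shifted features
test_eq = histogram_equalize(test, train)     # now matches train statistics

# Speech-rate augmentation: stretch factors are illustrative assumptions.
y, sr = librosa.load(librosa.ex("trumpet"))
augmented = [librosa.effects.time_stretch(y, rate=r) for r in (0.8, 1.0, 1.2)]
```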

Speech Emotion Recognition by Speech Signals on a Simulated Intelligent Robot (모의 지능로봇에서 음성신호에 의한 감정인식)

  • Jang, Kwang-Dong;Kwon, Oh-Wook
    • Proceedings of the KSPS conference / 2005.11a / pp.163-166 / 2005
  • We propose a speech emotion recognition method for a natural human-robot interface. In the proposed method, emotion is classified into 6 classes: angry, bored, happy, neutral, sad, and surprised. Features for an input utterance are extracted from statistics of phonetic and prosodic information. Phonetic information includes log energy, shimmer, formant frequencies, and Teager energy; prosodic information includes pitch, jitter, duration, and rate of speech. Finally, a pattern classifier based on Gaussian support vector machines decides the emotion class of the utterance. We recorded speech commands and dialogs uttered at 2 m away from microphones in 5 different directions. Experimental results show that the proposed method yields 59% classification accuracy while human classifiers give about 50% accuracy, which confirms that the proposed method achieves performance comparable to a human.


Multi-modal Emotion Recognition using Semi-supervised Learning and Multiple Neural Networks in the Wild (준 지도학습과 여러 개의 딥 뉴럴 네트워크를 사용한 멀티 모달 기반 감정 인식 알고리즘)

  • Kim, Dae Ha;Song, Byung Cheol
    • Journal of Broadcast Engineering / v.23 no.3 / pp.351-360 / 2018
  • Human emotion recognition is a research topic that receives continuous attention in the computer vision and artificial intelligence domains. This paper proposes a method for classifying human emotions through multiple neural networks based on multi-modal signals consisting of image, landmark, and audio, in a wild environment. The proposed method has the following features. First, the learning performance of the image-based network is greatly improved by employing both multi-task learning and semi-supervised learning that exploit the spatio-temporal characteristics of videos. Second, a model for converting one-dimensional (1D) facial landmark information into two-dimensional (2D) images is newly proposed, and a CNN-LSTM network based on this model is proposed for better emotion recognition. Third, based on the observation that audio signals are often very effective for specific emotions, we propose an audio deep learning mechanism robust to those specific emotions. Finally, so-called emotion adaptive fusion is applied to enable synergy among the multiple networks. The proposed network improves emotion classification performance by appropriately integrating the existing supervised and semi-supervised learning networks. In the fifth attempt on the given test set of the EmotiW2017 challenge, the proposed method achieved a classification accuracy of 57.12%.
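
The "emotion adaptive fusion" idea, letting a modality that is reliable for a specific emotion dominate the combined decision, can be illustrated with a toy weighted fusion of per-modality posteriors. The seven EmotiW emotion classes are real; the posteriors and weights below are invented for the example and are not the paper's learned values.

```python
# Toy sketch: combine per-modality posteriors with emotion-dependent
# weights so that, e.g., audio dominates for "angry".
import numpy as np

EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

p_image = np.array([0.30, 0.05, 0.05, 0.40, 0.10, 0.05, 0.05])
p_landmark = np.array([0.20, 0.10, 0.10, 0.30, 0.20, 0.05, 0.05])
p_audio = np.array([0.60, 0.05, 0.05, 0.10, 0.10, 0.05, 0.05])

# One weight per (modality, emotion); the fused vector is renormalized below.
W = np.array([[1.0, 1.0, 1.0, 1.5, 1.0, 1.0, 1.0],   # image
              [0.5, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0],   # landmark
              [2.0, 1.0, 1.0, 0.5, 1.0, 1.0, 1.0]])  # audio

fused = W[0] * p_image + W[1] * p_landmark + W[2] * p_audio
fused /= fused.sum()
print(EMOTIONS[int(fused.argmax())], fused.round(3))
```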

Speaker and Context Independent Emotion Recognition using Speech Signal (음성을 이용한 화자 및 문장독립 감정인식)

  • 강면구;김원구
    • Proceedings of the IEEK Conference / 2002.06d / pp.377-380 / 2002
  • In this paper, speaker- and context-independent emotion recognition using speech signals is studied. For this purpose, a corpus of emotional speech data, recorded and classified according to emotion using subjective evaluation, was used to build statistical feature vectors such as the average, standard deviation, and maximum value of pitch and energy, and to evaluate the performance of conventional pattern matching algorithms. A vector-quantization-based emotion recognition system is proposed for speaker- and context-independent emotion recognition. Experimental results showed that the vector-quantization-based emotion recognizer using MFCC parameters performed better than the one using pitch and energy parameters.

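The VQ-based recognizer described above can be sketched directly: one k-means codebook per emotion is trained on MFCC frames, and a test utterance goes to the emotion whose codebook quantizes its frames with the least average distortion. Codebook size, feature dimension, and data below are placeholder assumptions, not the paper's configuration.

```python
# Sketch: per-emotion k-means codebooks over MFCC frames; classify a test
# utterance by minimum average quantization distortion.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
train_mfcc = {e: rng.normal(loc=i, size=(400, 13))    # placeholder MFCC frames
              for i, e in enumerate(["neutral", "happy", "sad", "angry"])}

codebooks = {e: KMeans(n_clusters=16, n_init=4, random_state=0).fit(X)
             for e, X in train_mfcc.items()}

def classify(mfcc_frames):
    """Emotion whose codebook quantizes the frames with least distortion."""
    def distortion(km):
        d = km.transform(mfcc_frames).min(axis=1)     # distance to nearest code
        return d.mean()
    return min(codebooks, key=lambda e: distortion(codebooks[e]))

test = rng.normal(loc=2, size=(150, 13))              # unseen utterance frames
print(classify(test))
```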