Modelling and Synthesis of Emotional Speech on Multimedia Environment

A Study on the Modelling and Synthesis of Emotional Speech for a Multimedia Environment

  • Cheolwoo Jo (Dept. of Control and Instrumentation Engineering, College of Engineering, Changwon National University) ;
  • Daehyun Kim (Dept. of Control and Instrumentation Engineering, College of Engineering, Changwon National University)
  • Published : 1999.04.01

Abstract

This paper describes procedures to model and synthesize emotional speech in a multimedia environment. First, procedures to model the visual representation of emotional speech are proposed. To display image sequences synchronized with speech, the MSF (Multimedia Speech File) format is proposed and display software is implemented. Then emotional speech signals are collected and analyzed to obtain the prosodic characteristics of emotional speech in a limited domain. Multi-emotional sentences are spoken by actors. From the emotional speech signals, prosodic structures are compared in terms of pseudo-syntactic structure. Based on the analysis results, neutral speech is transformed into a specific emotional state by modifying its prosodic structure.
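As a rough illustration of the prosody-based transformation described in the abstract, the sketch below rescales the F0 mean, F0 range, and duration of a neutral contour toward a target emotion. The rule table, function names, and numeric factors are assumptions for illustration only; the rules in the paper are derived from the analyzed emotional speech corpus, not from these values.

```python
# Hypothetical sketch: nudging a neutral prosodic contour toward a target
# emotion by rescaling its F0 statistics and overall duration. Scaling
# factors below are illustrative assumptions, not the paper's measurements.
import numpy as np

# Assumed per-emotion scaling rules: mean F0 shift, F0 range scale,
# and duration (tempo) scale relative to neutral speech.
EMOTION_RULES = {
    "joy":     {"f0_mean": 1.15, "f0_range": 1.30, "duration": 0.90},
    "anger":   {"f0_mean": 1.10, "f0_range": 1.40, "duration": 0.95},
    "sadness": {"f0_mean": 0.90, "f0_range": 0.70, "duration": 1.15},
}

def modify_prosody(f0_contour, emotion):
    """Rescale a neutral F0 contour (Hz per frame, 0 = unvoiced) toward `emotion`."""
    rule = EMOTION_RULES[emotion]
    voiced = f0_contour > 0
    mean_f0 = f0_contour[voiced].mean()
    # Expand/compress the excursion around the mean, then shift the mean.
    shifted = np.where(
        voiced,
        (f0_contour - mean_f0) * rule["f0_range"] + mean_f0 * rule["f0_mean"],
        0.0,
    )
    # Duration change sketched coarsely as resampling along the time axis.
    n_out = int(round(len(shifted) * rule["duration"]))
    t_in = np.linspace(0.0, 1.0, len(shifted))
    t_out = np.linspace(0.0, 1.0, n_out)
    return np.interp(t_out, t_in, shifted)

# Example: a roughly flat 120 Hz neutral contour with a short unvoiced gap.
neutral = np.array([120.0] * 40 + [0.0] * 10 + [125.0] * 50)
sad = modify_prosody(neutral, "sadness")
print(len(neutral), "->", len(sad), "frames; mean voiced F0:",
      round(sad[sad > 0].mean(), 1), "Hz")
```

In practice the modified contour and duration factors would drive a waveform modification stage (e.g. a PSOLA-style resynthesis) rather than being used on the contour alone; that stage is outside the scope of this sketch.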
