Implementation of Text-to-Audio Visual Speech Synthesis Using Key Frames of Face Images

키프레임 얼굴영상을 이용한 시청각음성합성 시스템 구현

  • Published : 2002.06.01

Abstract

This paper presents a lip-synch algorithm for natural facial synthesis, based on a key-frame method that uses radial basis functions (RBF). To synthesize the lips, we derive viseme range parameters from the phonemes and their durations output by the text-to-speech (TTS) system, and we extract the viseme information corresponding to each phoneme from an audio-visual (AV) database. We apply a dominance function to reflect the coarticulation phenomenon, and bilinear interpolation to reduce computation time. Lip-synch is then performed by playing the synthesized images, obtained by interpolating between phonemes, together with the speech output of the TTS system.
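
As an illustration of how a dominance function can model coarticulation, the following minimal sketch blends viseme target parameters with a negative-exponential dominance weight centered on each phoneme segment. The abstract does not specify the form of the dominance function, so the exponential shape, the constants `alpha`, `theta`, and `c`, and the two-element lip-parameter vectors are all assumptions chosen for illustration; they are not taken from the paper.

```python
import math


def dominance(t, center, alpha=1.0, theta=0.02, c=1.0):
    """Assumed negative-exponential dominance of one viseme at time t (ms).

    The weight peaks at the segment center and decays with distance from it.
    """
    return alpha * math.exp(-theta * abs(t - center) ** c)


def blend_viseme_params(t, segments):
    """Dominance-weighted average of viseme target parameters at time t.

    segments: list of (start_ms, duration_ms, target) tuples, where target is
    a list of lip-shape parameters (hypothetical, e.g. width and opening).
    """
    weights = []
    for start, duration, _ in segments:
        center = start + duration / 2.0
        weights.append(dominance(t, center))

    total = sum(weights)
    n_params = len(segments[0][2])
    # Each output parameter is the normalized weighted sum of the targets.
    return [
        sum(w * seg[2][k] for w, seg in zip(weights, segments)) / total
        for k in range(n_params)
    ]


# Two hypothetical phoneme segments of 100 ms each, as a TTS front end
# might report them, with made-up lip parameters.
segments = [
    (0.0, 100.0, [1.0, 0.0]),
    (100.0, 100.0, [0.0, 1.0]),
]
boundary = blend_viseme_params(100.0, segments)
```

At the boundary between the two segments both visemes contribute equally, so the lip shape there is the midpoint of the two targets. Nearer the center of a segment, that segment's viseme dominates, but its neighbor still pulls the shape slightly, which is the coarticulation effect the abstract refers to.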

Keywords