MPEG-4 TTS: Current Status and Prospects

  • Published: 1997.09.01

Abstract

Text-to-Speech (TTS) technology has attracted considerable interest among speech engineers because of its practical benefits: possible applications include talking computers, spoken emergency alarm systems, and speech output devices for the speech-impaired. Many researchers have made significant progress in speech synthesis techniques for their own languages, and as a result the quality of current speech synthesizers is believed to be acceptable to ordinary users. These are among the reasons the MPEG group decided to include TTS technology as one of its MPEG-4 functionalities. ETRI has made the major contributions to the current MPEG-4 TTS, which appear in various MPEG-4 documents, with relatively minor contributions from AT&T and NTT. The main MPEG-4 TTS functionalities presently available are: 1) use of original prosody for synthesized speech output, 2) trick-mode functions for general users that do not break the prosody of the synthesized speech, 3) interoperability with Facial Animation (FA) tools, and 4) dubbing of moving/animated pictures using lip-shape pattern information.

References

  1. S. Nakajima, "A Hybrid Scalable Text to Speech Synthesis," document ISO/IEC/JTC1/SC29/WG11 M1157, Tampere meeting.
  2. L.C. Lee and S.H. Kim, "Multilevel Scalable TTS Synthesis," document ISO/IEC/JTC1/SC29/WG11 M1124, Chicago meeting.
  3. M. Hahn and J.C. Lee, "MPEG4 TTS Interface," document ISO/IEC/JTC1/SC29/WG11 M1524, Maceio meeting.
  4. M. Hahn, H.S. Lee, and J.W. Yang, "Revision of MPEG-4 TTS Interface," document ISO/IEC/JTC1/SC29/WG11 M1739, Sevilla meeting.
  5. MPEG-4 Audio Group, "MPEG-4 Audio Working Draft 3.0," document ISO/IEC/JTC1/SC29/WG11 N1631, Bristol meeting.
  6. Y.K. Lim, M. Hahn, J.C. Lee, and H.S. Lee, "Synchronization of MPEG-4 TTS with Moving Picture," document ISO/IEC/JTC1/SC29/WG11 M2450, Stockholm meeting.