A Pre-Selection of Candidate Units Using Accentual Characteristic In a Unit Selection Based Japanese TTS System

Na, Deok-Su;Min, So-Yeon;Lee, Kwang-Hyoung;Lee, Jong-Seok;Bae, Myung-Jin;

doi:10.7776/ASK.2007.26.4.159

The Journal of the Acoustical Society of Korea (한국음향학회지)

Volume 26, Issue 4, Pages 159-165, 2007

pISSN 1225-4428, eISSN 2287-3775

The Acoustical Society of Korea (한국음향학회)

Korean title: 일본어 악센트 특징을 이용한 합성단위 선택 기반 일본어 TTS의 후보 합성단위의 사전선택 방법

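Na, Deok-Su (Voiceware, Technical Research Institute);
Min, So-Yeon (Seoil College);
Lee, Kwang-Hyoung (Seoil College);
Lee, Jong-Seok (Voiceware, Technical Research Institute);
Bae, Myung-Jin (Soongsil University, School of Information, Communication and Electronic Engineering)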

Published: 2007.05.31

https://doi.org/10.7776/ASK.2007.26.4.159

Abstract

This paper proposes a new method for pre-selecting candidate units in a unit selection based Japanese TTS system. Conventional pre-selection computes a context-dependent cost over the phoneme sequence within an intonation phrase (IP). Unlike other languages, however, Japanese has an accent realized as relative pitch height, and several words together form a single accentual phrase (AP). Japanese prosody also changes with the accentual phrase as its basic unit. Reflecting this AP-level prosodic change in pre-selection can improve the quality of synthesized speech, and computing the context-dependent cost within an AP rather than within an IP reduces the amount of computation. The proposed method defines the Japanese AP, analyzes the APs in the phoneme sequence, and performs pre-selection by accentual phrase matching, which computes the connected context length (CCL) of each candidate for every phoneme to be synthesized in each AP. The method was implemented with VoiceText, Voiceware's synthesizer, as the baseline system, and was evaluated on perceptual errors (intonation errors and concatenation mismatch errors) and synthesis time. The experimental results show that the proposed method makes the synthesized speech more natural and shortens the synthesis time.
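The accentual phrase matching step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact definition of CCL, so this sketch assumes that the CCL of a candidate is the number of contiguous corpus phonemes around it that match the target sequence, counted without crossing the boundaries of the accentual phrase. The flat list-of-labels corpus, the function names (`build_index`, `ccl`, `preselect`), and the `n_best` cutoff are all hypothetical and are not taken from the paper or from VoiceText.

```python
from collections import defaultdict


def build_index(corpus):
    """Map each phoneme label to the corpus positions where it occurs."""
    index = defaultdict(list)
    for pos, phoneme in enumerate(corpus):
        index[phoneme].append(pos)
    return index


def ccl(corpus, pos, target, i, lo, hi):
    """Assumed CCL of corpus[pos] as a candidate for target[i].

    Counts the candidate itself plus every contiguous left and right
    neighbour that matches the target, staying inside the accentual
    phrase bounds [lo, hi) of the target sequence.
    """
    length = 1
    k = 1
    # Extend the match to the left, stopping at the phrase start.
    while i - k >= lo and pos - k >= 0 and corpus[pos - k] == target[i - k]:
        length += 1
        k += 1
    k = 1
    # Extend the match to the right, stopping at the phrase end.
    while i + k < hi and pos + k < len(corpus) and corpus[pos + k] == target[i + k]:
        length += 1
        k += 1
    return length


def preselect(corpus, target, ap_bounds, n_best=2):
    """Keep the n_best candidates (by assumed CCL) for each target phoneme.

    ap_bounds is a list of (start, end) index pairs, one per accentual
    phrase, covering the target sequence.
    """
    index = build_index(corpus)
    selected = []
    for lo, hi in ap_bounds:
        for i in range(lo, hi):
            scored = [(ccl(corpus, p, target, i, lo, hi), p)
                      for p in index.get(target[i], [])]
            # Longest connected context first; ties broken by corpus position.
            scored.sort(key=lambda s: (-s[0], s[1]))
            selected.append([p for _, p in scored[:n_best]])
    return selected
```

Because the context search never extends past an accentual phrase boundary, each candidate is compared against at most one phrase of context rather than a whole intonation phrase, which is one way to read the computational saving claimed in the abstract. A real system would also track accentual phrase boundaries and prosodic labels inside the corpus, which this sketch omits.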

