[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.5762/KAIS.2013.14.8.3992

A Study on Implementation of Emotional Speech Synthesis System using Variable Prosody Model

Min, So-Yeon (Dept. of Information and Communication, Seoil University)
Na, Deok-Su (Voiceware co. Ltd, R&D center)

Publication Information

Journal of the Korea Academia-Industrial cooperation Society, v.14, no.8, 2013, pp. 3992-3998

Abstract

This paper presents a method for adding an emotional speech corpus to a high-quality, large-corpus-based speech synthesizer so that it can generate a variety of synthesized speech. We built the emotional speech corpus in a form that a waveform-concatenation speech synthesizer can use, and implemented a synthesizer that generates varied synthesized speech through the same synthesis-unit selection process as a normal speech synthesizer. Emotional input text is specified with a markup language. Emotional speech is generated when the input text matches the emotional speech corpus over the length of an intonation phrase; otherwise, normal speech is generated. The break indices (BIs) of emotional speech are more irregular than those of normal speech, which makes it difficult to use the BIs generated by the synthesizer as they are. To solve this problem, we applied Variable Break [3] modeling. A Japanese speech synthesizer was used for the experiments. As a result, we obtained natural emotional synthesized speech using the break prediction module designed for normal speech synthesis.
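The fallback rule described in the abstract (emotional speech when the marked-up text matches an intonation phrase in the emotional corpus, normal speech otherwise) can be illustrated with a minimal Python sketch. The tag syntax `<emotion type="...">`, the function names, and the use of exact string matching against a set of phrases are all assumptions made for illustration; the abstract does not specify the markup language or the matching procedure, and a real system would match on linguistic units rather than raw strings.

```python
import re

# Hypothetical tag syntax; the paper does not specify its markup language.
TAG = re.compile(r'<emotion type="(\w+)">(.*?)</emotion>', re.S)


def split_markup(text):
    """Yield (emotion or None, segment) pairs in document order."""
    pos = 0
    for m in TAG.finditer(text):
        if m.start() > pos:
            # Untagged text before this tag is treated as normal speech.
            yield None, text[pos:m.start()]
        yield m.group(1), m.group(2)
        pos = m.end()
    if pos < len(text):
        yield None, text[pos:]


def choose_corpus(text, emotional_phrases):
    """Decide, per segment, which corpus unit selection should draw from.

    emotional_phrases maps an emotion name to the set of intonation
    phrases recorded for it (a stand-in for the emotional speech corpus).
    """
    plan = []
    for emotion, segment in split_markup(text):
        phrase = segment.strip()
        if not phrase:
            continue  # skip whitespace between tags
        if emotion and phrase in emotional_phrases.get(emotion, set()):
            plan.append((phrase, emotion))   # whole phrase found: emotional
        else:
            plan.append((phrase, "normal"))  # no match: fall back to normal
    return plan
```

In this sketch a tagged segment that is not present in the emotional corpus is still synthesized, just from the normal corpus, which mirrors the "in the other case normal speech is generated" behavior.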

Keywords

Variable Break; Emotional Speech Synthesizer

Citations & Related Records

Times Cited By KSCI: 6

References

1	N. Campbell, "Autolabeling Japanese ToBI," Proc. ICSLP'96, Vol. 4, pp. 2399-2402, 1996.
2	D. S. Na, M. J. Bae, "A Variable Break Prediction Method using CART in a Japanese Text-to-Speech System," IEICE Trans. Inf. & Syst., Vol. E92-D, No. 2, pp. 349-352, 2009. DOI: http://dx.doi.org/10.1587/transinf.E92.D.349
3	D. S. Na, S. Y. Min, J. S. Lee, M. J. Bae, "A Performance Improvement Method using Variable Break in Corpus Based Japanese Text-to-Speech System," The Journal of the Acoustical Society of Korea, Vol. 28, No. 2, pp. 155-163, 2009.
4	J. J. Venditti, "The J_ToBI model of Japanese intonation," in S. A. Jun Ed., Prosodic Typology and Transcription: A Unified Approach, Oxford University Press, pp. 172-200.
5	K. Maekawa, H. Kikuchi, Y. Igarashi, J. Venditti, "X-JToBI: an extended J_ToBI for spontaneous speech," Proc. ICSLP-2002, pp. 1545-1548, 2002.
6	K.-H. Kim, H.-M. Kim, K.-Y. Lee, M.-J. Lim, J.-L. Kim, "Design And Implementation of a Speech Recognition Interview Model based-on Opinion Mining Algorithm," Journal of The Institute of Webcasting, Internet and Telecommunication, Vol. 12, No. 1, pp. 225-230, 2012.
7	S.-H. Kim, J.-Y. Ahn, "A Study on the Voice Interface for Mobile Environment," Journal of The Institute of Webcasting, Internet and Telecommunication, Vol. 13, No. 1, pp. 199-204, 2013.
8	J. J. Im, "Development of energy expenditure measurement device based on voice and body activity," Journal of The Institute of Webcasting, Internet and Telecommunication, Vol. 12, No. 6, pp. 303-309, 2012.
9	J.-Y. Ahn, S.-B. Kim, S.-H. Kim, K.-I. Hur, "A study on Voice Recognition using Model Adaptation HMM for Mobile Environment," Journal of The Institute of Webcasting, Internet and Telecommunication, Vol. 11, No. 3, pp. 175-180, 2011.
10	W. Oh, E. Rhee, "Curriculum Development of Acoustics and Audio Engineering on Digital Convergence Environment," Journal of The Institute of Webcasting, Internet and Telecommunication, Vol. 13, No. 2, pp. 191-197, 2013.
11	S. Kiriyama, S. Kitazawa, "Evaluation of a prosodic labeling system utilizing linguistic information," Proc. INTERSPEECH2004, pp. 2993-2996, 2004.
12	K. Maekawa, H. Kikuchi, Y. Igarashi, J. Venditti, "X-JToBI: an extended J_ToBI for spontaneous speech," Proc. ICSLP-2002, pp. 1545-1548, 2002.
13	S. H. Lee, Y. H. Oh, "The Modelling of Prosodic Phrasing and Pause Duration using CART," Proceedings of the Acoustical Society of Korea, Vol. 17, No. 1, pp. 81-86, 1998.
