http://dx.doi.org/10.5762/KAIS.2013.14.8.3992

A Study on Implementation of Emotional Speech Synthesis System using Variable Prosody Model  

Min, So-Yeon (Dept. of Information and Communication, Seoil University)
Na, Deok-Su (Voiceware co. Ltd, R&D center)
Publication Information
Journal of the Korea Academia-Industrial cooperation Society, v.14, no.8, 2013, pp. 3992-3998
Abstract
This paper describes a method for adding an emotional speech corpus to a high-quality, large-corpus-based speech synthesizer so that it can generate a variety of synthesized speech. We built the emotional speech corpus in a form usable by a waveform-concatenation speech synthesizer, and implemented a synthesizer that generates varied synthesized speech through the same synthesis-unit selection process as a normal speech synthesizer. Emotional input text is specified with a markup language. Emotional speech is generated when the input text matches the emotional speech corpus over the length of an intonation phrase; otherwise, normal speech is generated. The break indices (BIs) of emotional speech are more irregular than those of normal speech, which makes it difficult to use the BIs generated by the synthesizer as they are. To solve this problem, we applied Variable Break [3] modeling. A Japanese speech synthesizer was used for the experiments. As a result, we obtained natural emotional synthesized speech using the break prediction module designed for normal speech synthesis.
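The fallback rule in the abstract (emotional units only when a whole intonation phrase is covered by the emotional corpus, normal units otherwise) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and not taken from the paper: the `<emotion type="...">` tag syntax, the function names `parse_markup` and `choose_corpus`, the simplification that each tagged span is exactly one intonation phrase, and the use of exact string matching in place of real unit selection.

```python
import re
from typing import Dict, List, Set, Tuple

# Hypothetical tag syntax; the paper does not specify its markup language.
TAG = re.compile(r'<emotion type="(\w+)">(.*?)</emotion>', re.S)


def parse_markup(text: str) -> List[Tuple[str, str]]:
    """Split marked-up text into (emotion, phrase) pairs.

    Untagged text is labeled "normal". Each tagged span is treated
    as a single intonation phrase (a simplifying assumption).
    """
    segments: List[Tuple[str, str]] = []
    pos = 0
    for m in TAG.finditer(text):
        plain = text[pos:m.start()].strip()
        if plain:
            segments.append(("normal", plain))
        segments.append((m.group(1), m.group(2).strip()))
        pos = m.end()
    tail = text[pos:].strip()
    if tail:
        segments.append(("normal", tail))
    return segments


def choose_corpus(
    segments: List[Tuple[str, str]],
    emotional_corpus: Dict[str, Set[str]],
) -> List[Tuple[str, str]]:
    """Return (phrase, corpus_name) pairs.

    A phrase is routed to the emotional corpus only if the whole
    phrase is present there; otherwise it falls back to "normal".
    """
    plan: List[Tuple[str, str]] = []
    for emotion, phrase in segments:
        covered = phrase in emotional_corpus.get(emotion, set())
        if emotion != "normal" and covered:
            plan.append((phrase, emotion))
        else:
            plan.append((phrase, "normal"))
    return plan
```

With a toy corpus `{"joy": {"arigatou"}}`, a phrase tagged `joy` that reads `arigatou` is routed to the emotional corpus, while a tagged phrase that is absent from that corpus falls back to normal synthesis.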
Keywords
Variable Break; Emotional Speech Synthesizer;
Citations & Related Records
Times Cited By KSCI: 6
1 N. Campbell, "Autolabeling Japanese ToBI," Proc. ICSLP'96, vol.4, pp.2399-2402, 1996.
2 D. S. Na, M. J. Bae, "A Variable Break Prediction Method using CART in a Japanese Text-to-Speech System," IEICE Trans. Inf. & Syst., Vol. E92-D, No.2, pp.349-352, 2009. DOI: http://dx.doi.org/10.1587/transinf.E92.D.349
3 D. S. Na, S. Y. Min, J. S. Lee, M. J. Bae, "A Performance Improvement Method using Variable Break in Corpus Based Japanese Text-to-Speech System," The Journal of the Acoustical Society of Korea, Vol. 28, No. 2, pp.155-163, 2009.
4 J. J. Venditti, "The J_ToBI model of Japanese intonation", in S. A. Jun Ed., Prosodic Typology and Transcription: A Unified Approach: Oxford University Press, pp.172-200.
5 K. Maekawa, H. Kikuchi, Y. Igarashi, J. Venditti, "X-JToBI: an extended j-toBI for spontaneous speech", Proc. ICSLP-2002, pp.1545-1548, 2002.
6 K.-H. Kim, H.-M. Kim, K.-Y. Lee, M.-J. Lim, J.-L. Kim, "Design And Implementation of a Speech Recognition Interview Model based-on Opinion Mining Algorithm", Journal of The Institute of Webcasting, Internet and Telecommunication, Vol 12, No 1, pp. 225-230, 2012.
7 S.-H. Kim, J.-Y. Ahn, "A Study on the Voice Interface for Mobile Environment", Journal of The Institute of Webcasting, Internet and Telecommunication, Vol 13, No 1, pp. 199-204, 2013.
8 J. J. Im, "Development of energy expenditure measurement device based on voice and body activity", Journal of The Institute of Webcasting, Internet and Telecommunication, Vol 12, No 6, pp. 303-309, 2012.
9 J.-Y. Ahn, S.-B. Kim, S.-H. Kim, K.-I. Hur, "A study on Voice Recognition using Model Adaptation HMM for Mobile Environment", Journal of The Institute of Webcasting, Internet and Telecommunication, Vol 11, No 3, pp. 175-180, 2011.
10 W. Oh, E. Rhee, "Curriculum Development of Acoustics and Audio Engineering on Digital Convergence Environment", Journal of The Institute of Webcasting, Internet and Telecommunication, Vol 13, No 2, pp. 191-197, 2013.
11 S. Kiriyama, S. Kitazawa, "Evaluation of a prosodic labeling system utilizing linguistic information," Proc. INTERSPEECH2004, pp.2993-2996, 2004.
12 K. Maekawa, H. Kikuchi, Y. Igarashi, J. Venditti, "X-JToBI: an extended j-toBI for spontaneous speech", Proc. ICSLP-2002, pp.1545-1548, 2002.
13 S. H. Lee, Y. H. Oh, "The Modelling of Prosodic Phrasing and Pause Duration using CART", Proceeding of the Acoustical society of Korea, Vol. 17, No. 1, pp. 81-86, 1998.