• Title/Summary/Keyword: Break prediction and variable break

Search results: 6

A Unit Selection Methods using Variable Break in a Japanese TTS (일본어 TTS의 가변 Break를 이용한 합성단위 선택 방법)

  • Na, Deok-Su;Bae, Myung-Jin
    • Proceedings of the IEEK Conference / 2008.06a / pp.983-984 / 2008
  • This paper proposes a variable break that can offset prediction errors, as well as a pre-selection method based on the variable break for enhanced unit selection. In Japanese, a sentence consists of several APs (accentual phrases) and MPs (major phrases), and the breaks between these phrases must be predicted to realize text-to-speech systems. An MP in turn consists of several APs and plays a decisive role in making synthetic speech natural and understandable, because short pauses appear at its boundaries. The variable break is defined as a break that can easily change from an AP boundary to an MP boundary, or from an MP boundary to an AP boundary. The variable break is modeled stochastically using CART (classification and regression trees), and candidate units are then pre-selected in the unit-selection process. The experimental results show that the method compensates for break prediction errors and improves the naturalness of synthetic speech.


A Performance Improvement Method using Variable Break in Corpus Based Japanese Text-to-Speech System (가변 Break를 이용한 코퍼스 기반 일본어 음성 합성기의 성능 향상 방법)

  • Na, Deok-Su;Min, So-Yeon;Lee, Jong-Seok;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea / v.28 no.2 / pp.155-163 / 2009
  • In text-to-speech systems, the conversion of text into prosodic parameters consists of three steps: the placement of prosodic boundaries, the determination of segmental durations, and the specification of fundamental frequency contours. Prosodic boundaries, as the most important and basic parameter, affect the estimation of durations and fundamental frequency. Break prediction is an important step in text-to-speech systems because break indices (BIs) have a great influence on how correctly prosodic phrase boundaries are represented. However, accurate prediction is difficult, since BIs are often chosen according to the meaning of a sentence or the reading style of the speaker. In Japanese, predicting an accentual phrase boundary (APB) and a major phrase boundary (MPB) is particularly difficult. This paper therefore presents a method to compensate for APB and MPB prediction errors. First, we define a subtle BI, for which it is difficult to decide clearly between an APB and an MPB, as a variable break (VB), and an explicit BI as a fixed break (FB). The VB is chosen using a classification and regression tree, and multiple prosodic targets for pitch and duration are then generated. Finally, unit selection is conducted using the multiple prosodic targets. In the MOS test, the original speech scored 4.99, while the proposed method scored 4.25 and the conventional method scored 4.01. The experimental results show that the proposed method improves the naturalness of synthesized speech.
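
The VB/FB distinction in the abstract above can be illustrated with a minimal sketch. It assumes that a classifier (such as a CART model) outputs a posterior probability that a boundary is an MPB, and that a boundary counts as variable when that probability lies close to 0.5. The names `label_break` and `candidate_break_types`, the `FB-APB`/`FB-MPB` labels, and the `margin` value are hypothetical and do not come from the paper.

```python
from typing import List


def label_break(p_mpb: float, margin: float = 0.2) -> str:
    """Label a boundary from an assumed posterior probability of an MPB.

    A boundary whose probability lies within `margin` of 0.5 is treated as a
    variable break (VB); otherwise it is a fixed break (FB) of the likelier type.
    The margin is an illustrative assumption, not a value from the paper.
    """
    if not 0.0 <= p_mpb <= 1.0:
        raise ValueError("p_mpb must be a probability in [0, 1]")
    if abs(p_mpb - 0.5) < margin:
        return "VB"
    return "FB-MPB" if p_mpb >= 0.5 else "FB-APB"


def candidate_break_types(label: str) -> List[str]:
    """Return the boundary types for which prosodic targets would be generated.

    A VB yields both types, so unit selection can consider several targets;
    an FB yields only its own type.
    """
    if label == "VB":
        return ["APB", "MPB"]
    return [label.split("-", 1)[1]]


if __name__ == "__main__":
    for p in (0.1, 0.55, 0.9):
        label = label_break(p)
        print(p, label, candidate_break_types(label))
```

Producing two candidate types for a VB mirrors the abstract's "multiple prosodic targets": the choice between an APB and an MPB is deferred to unit selection rather than fixed at prediction time.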

A Study on Implementation of Emotional Speech Synthesis System using Variable Prosody Model (가변 운율 모델링을 이용한 고음질 감정 음성합성기 구현에 관한 연구)

  • Min, So-Yeon;Na, Deok-Su
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.8 / pp.3992-3998 / 2013
  • This paper describes a method for adding an emotional speech corpus to a high-quality, large-corpus-based speech synthesizer in order to generate varied synthesized speech. We built the emotional speech corpus in a form usable by a waveform-concatenation speech synthesizer, and implemented a synthesizer that generates varied synthesized speech through the same unit-selection process as a normal speech synthesizer. A markup language is used for emotional input text. Emotional speech is generated when the input text matches the emotional speech corpus over the length of an intonation phrase; otherwise normal speech is generated. The BIs (break indices) of emotional speech are more irregular than those of normal speech, so the BIs generated by the synthesizer are difficult to use as they are. To solve this problem, we applied Variable Break [3] modeling. A Japanese speech synthesizer was used for the experiments. As a result, we obtained natural emotional synthesized speech using the break prediction module built for normal speech synthesis.

Prediction of Prosodic Break Using Syntactic Relations and Prosodic Features (구문 관계와 운율 특성을 이용한 한국어 운율구 경계 예측)

  • Jung, Young-Im;Cho, Sun-Ho;Yoon, Ae-Sun;Kwon, Hyuk-Chul
    • Korean Journal of Cognitive Science / v.19 no.1 / pp.89-105 / 2008
  • In this paper, we propose a rule-based system for predicting natural prosodic phrase breaks in Korean text. To implement the system, (1) sentence constituents are sub-categorized according to their syntactic functions, (2) syntactic phrases are recognized using the dependency relations among the sub-categorized constituents, and (3) rules for predicting prosodic phrase breaks are created. In addition, (4) the length of syntactic phrases and sentences, the position of syntactic phrases in a sentence, and sense information of contextual words are considered in order to determine the variable prosodic phrase breaks. Based on these rules and features, we obtained an accuracy of over 90% in predicting the positions of major breaks and no-breaks, which correlate highly with the syntactic structure of the sentence. For overall accuracy across all prosodic phrase breaks, the proposed system achieves a Break_Correct of 87.18% and a Juncture_Correct of 89.27%, which is higher than that of other models.

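
The abstract above does not define Break_Correct and Juncture_Correct. The sketch below assumes the definitions commonly used in phrase-break prediction work: Break_Correct = (B - D - S) / B and Juncture_Correct = (N - D - S - I) / N, where B is the number of reference breaks, N the number of junctures, and D, S, and I the deletions, substitutions, and insertions. The function name `break_scores`, the label strings, and the example data are hypothetical.

```python
from typing import Sequence, Tuple


def break_scores(reference: Sequence[str],
                 predicted: Sequence[str],
                 no_break: str = "NB") -> Tuple[float, float]:
    """Return (Break_Correct, Juncture_Correct) in percent.

    Both sequences hold one label per juncture (word boundary). Any label
    other than `no_break` counts as a break of that type.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    n = len(reference)
    b = sum(r != no_break for r in reference)
    if n == 0 or b == 0:
        raise ValueError("need at least one juncture and one reference break")

    pairs = list(zip(reference, predicted))
    # Deletion: a reference break predicted as no break.
    d = sum(r != no_break and p == no_break for r, p in pairs)
    # Insertion: a break predicted where the reference has none.
    i = sum(r == no_break and p != no_break for r, p in pairs)
    # Substitution: a break predicted with the wrong break type.
    s = sum(r != no_break and p != no_break and r != p for r, p in pairs)

    break_correct = 100.0 * (b - d - s) / b
    juncture_correct = 100.0 * (n - d - s - i) / n
    return break_correct, juncture_correct


if __name__ == "__main__":
    ref = ["NB", "MB", "NB", "mB", "NB", "MB", "NB", "NB"]
    hyp = ["NB", "MB", "mB", "NB", "NB", "mB", "NB", "NB"]
    print(break_scores(ref, hyp))
```

Under these definitions, Juncture_Correct also rewards correctly predicted no-breaks, so it is usually higher than Break_Correct, which is consistent with the two figures reported in the abstract.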

Fault Prediction and Diagnosis Using Fuzzy Expert System (퍼지 전문가 시스템을 이용한 고장 예측 및 진단)

  • 최성운;이영석
    • Journal of the Korea Safety Management & Science / v.1 no.1 / pp.7-17 / 1999
  • As plants and equipment grow more elaborate, complex, and large in scale, breakdowns and errors become more frequent and the resulting losses are enormous, so improving reliability, maintainability, safety, and quality has become a matter of interest. Fault diagnosis is a systematic and unified method for finding errors, based on the interpretation that data implicitly contain noise. However, because most methods rely on inference based on binary logic, uncertainty is not correctly reflected. In this study, we propose using a fuzzy expert system to manage the uncertainty in the system efficiently from the standpoint of predictive maintenance. Building on an expert's direct observation and experience, the system applies fuzzy theory concepts to linguistic variables and fixed quantities, which makes it possible to reach decisions that take uncertainty into account.

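
As a generic illustration of how a fuzzy rule grades a condition instead of applying a hard threshold, the sketch below evaluates one rule with triangular membership functions and `min` as the fuzzy AND. It is not the system described in the abstract above: the input variables, the breakpoints, and the names `triangular` and `fault_possibility` are all hypothetical.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function: 0 at a and c, rising to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fault_possibility(temperature_c: float, vibration_mm_s: float) -> float:
    """Degree to which one illustrative rule fires.

    Rule (hypothetical): IF temperature is HIGH AND vibration is HIGH
    THEN a fault is likely. The breakpoints below are made-up values.
    """
    temp_high = triangular(temperature_c, 60.0, 90.0, 120.0)
    vib_high = triangular(vibration_mm_s, 4.0, 8.0, 12.0)
    # Fuzzy AND taken as the minimum of the two membership degrees.
    return min(temp_high, vib_high)


if __name__ == "__main__":
    print(fault_possibility(75.0, 8.0))
    print(fault_possibility(90.0, 8.0))
```

A two-valued rule would return only 0 or 1 for the same inputs; the graded result is what lets such a system carry uncertainty through to the maintenance decision.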

Development of four-equation turbulence model for prediction of mixed convective heat transfer on a flat plate (수평평판 위의 혼합대류 열전달 계산을 위한 4-방정식 모델의 개발)

  • 성형진;정명균
    • Transactions of the Korean Society of Mechanical Engineers / v.7 no.2 / pp.193-203 / 1983
  • Mixed convective heat transfer problems are characterized by a relatively significant contribution of the buoyancy force to the transport of momentum and heat. Past analytical studies of such problems have employed either the conventional $k$-$\varepsilon$ turbulence model, which includes a constant turbulent Prandtl number $\sigma_{+}$ 1, or an extended $k$-$\varepsilon$ turbulence model, which accounts for the buoyancy effect in the appropriate length scale equations. In the latter case, however, the temperature variance $\overline{\theta^{2}}$ is approximated by a model under the local equilibrium condition, and the time scale ratio between velocity and temperature is assumed to be constant. These approximations are known to break down when the buoyancy effect is dominant. The present study aims to develop a new computational turbulence closure level that can be applied to this rather complex turbulent process. The temperature variance is obtained directly by solving its dynamic transport equation, and the time scale ratio, which varies in space, is computed from the solution of a dynamic equation for the rate of scalar dissipation $\varepsilon_{\theta}$. The computational results were found to be in good agreement with available experimental data over a wide range of unstable conditions.