• Title/Summary/Keyword: 운율구 (prosodic phrase)

A Performance Improvement Method using Variable Break in Corpus Based Japanese Text-to-Speech System (가변 Break를 이용한 코퍼스 기반 일본어 음성 합성기의 성능 향상 방법)

  • Na, Deok-Su;Min, So-Yeon;Lee, Jong-Seok;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.2
    • /
    • pp.155-163
    • /
    • 2009
  • In text-to-speech systems, the conversion of text into prosodic parameters consists of three steps: the placement of prosodic boundaries, the determination of segmental durations, and the specification of fundamental frequency contours. Prosodic boundaries, as the most important and basic parameter, affect the estimation of durations and fundamental frequency. Break prediction is an important step in text-to-speech systems because break indices (BIs) strongly influence how correctly prosodic phrase boundaries are represented. However, accurate prediction is difficult because BIs are often chosen according to the meaning of a sentence or the reading style of the speaker. In Japanese, predicting accentual phrase boundaries (APBs) and major phrase boundaries (MPBs) is particularly difficult. This paper therefore presents a method to compensate for APB and MPB prediction errors. First, we define a subtle BI, for which it is difficult to decide clearly between an APB and an MPB, as a variable break (VB), and an explicit BI as a fixed break (FB). VBs are identified using a classification and regression tree, and multiple prosodic targets for pitch and duration are then generated. Finally, unit selection is conducted using the multiple prosodic targets. In the MOS test, the original speech scored 4.99, while the proposed method scored 4.25 and the conventional method scored 4.01. The experimental results show that the proposed method improves the naturalness of synthesized speech.
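
The idea of generating multiple prosodic targets from variable breaks can be illustrated with a minimal Python sketch. It is not the authors' implementation: the label strings, the function names `expand_variable_breaks` and `select_best`, and the toy cost function are all assumptions made for illustration. Each variable break is resolved both ways, and the sequence whose target yields the lowest (assumed) unit-selection cost is kept.

```python
from itertools import product

# Hypothetical labels: "APB" = accentual phrase boundary, "MPB" = major phrase
# boundary, "VB" = variable break (could plausibly be realized as either).
def expand_variable_breaks(breaks):
    """Return every fixed-break sequence obtained by resolving each VB."""
    options = [("APB", "MPB") if b == "VB" else (b,) for b in breaks]
    return [list(combo) for combo in product(*options)]

def select_best(breaks, cost):
    """Pick the resolved break sequence with the lowest cost.

    `cost` stands in for the unit-selection cost of the best unit path found
    for the prosodic target built from a break sequence (an assumption here).
    """
    return min(expand_variable_breaks(breaks), key=cost)

predicted = ["APB", "VB", "MPB", "VB"]
candidates = expand_variable_breaks(predicted)

# Toy cost for demonstration only: prefer fewer major phrase boundaries.
best = select_best(predicted, cost=lambda seq: seq.count("MPB"))
```

With two variable breaks the sketch produces four candidate sequences; a real system would replace the toy cost with the target and concatenation costs of its unit-selection search.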

Prosodic Realization of Focus in Korean Sentences (한국어 문장에 나타난 초점의 운율적 특징)

  • 유정
    • Proceedings of the Korean Society for Language and Information Conference
    • /
    • 2003.06a
    • /
    • pp.52-57
    • /
    • 2003
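  • This study investigates how focus, which carries the new information in a sentence, is realized in actual speech. Analysis of the suprasegmentals (stress, duration, tone) of the focus and its surroundings showed that the accentual phrase preceding the focus lengthened considerably more than the focus itself, and that the final syllable of that accentual phrase was lengthened more prominently than the accentual phrase as a whole. In addition, a sentence-level break was found to occur before the focus. Unlike in Indo-European languages, there was no evidence that focus is accompanied by stress.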

Implementation of the Voice Conversion in the Text-to-speech System (Text-to-speech 시스템에서의 화자 변환 기능 구현)

  • Hwang Cholgyu;Kim Hyung Soon
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • autumn
    • /
    • pp.33-36
    • /
    • 1999
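  • To overcome the monotony of conventional text-to-speech (TTS) synthesis, which produces speech from a predetermined speaker, this paper implements a voice conversion function that can express the timbre of an arbitrary speaker. The implemented method models the speaker's acoustic space with a Gaussian Mixture Model (GMM), enabling voice conversion based on a continuous probability distribution. Using the joint density function of the feature vectors of the source and target speakers, we derive a conversion function that minimizes the squared error between the target speaker's acoustic feature vectors and the converted vectors, and use it to convert the spectral envelope by vector mapping. For prosody conversion, the speech signal is modeled with a sinusoidal model, and the analyzed prosodic information (pitch, duration) is converted by taking mean values into account. For evaluation, a VQ mapping method was also implemented, and the two methods were compared using normalized cepstral distance. For synthesis, an ABS-OLA-based sinusoidal modeling scheme was adopted, which produced natural-sounding synthesized speech.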
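
A joint-density GMM conversion function of the kind this abstract describes is commonly written as a posterior-weighted sum of per-component conditional means. The sketch below shows that general form for a scalar feature only, in pure Python. It is an illustration under assumptions, not the paper's code: the function names, the tuple layout of a component, and the parameter values in `toy_gmm` are made up, and the GMM is assumed to be trained already.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)

def convert(x, components):
    """Map a scalar source feature x to the target space.

    Each component is (weight, mean_x, mean_y, var_x, cov_xy), taken from a
    joint GMM over (source, target) pairs that is assumed to be trained.
    """
    # Weighted likelihood of x under the source marginal of each component.
    likelihoods = [w * gaussian_pdf(x, mx, vx) for w, mx, _, vx, _ in components]
    total = sum(likelihoods)
    converted = 0.0
    for lik, (_, mx, my, vx, cxy) in zip(likelihoods, components):
        # Posterior weight times the conditional mean of y given x.
        converted += (lik / total) * (my + cxy / vx * (x - mx))
    return converted

# Made-up parameters, for illustration only.
toy_gmm = [(0.5, -1.0, 2.0, 1.0, 0.5), (0.5, 1.0, 4.0, 1.0, 0.5)]
y = convert(0.0, toy_gmm)
```

For real spectral features the scalars become vectors and covariance matrices, and the division by `var_x` becomes a matrix inverse; the structure of the mapping stays the same.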

Statistical Approaches to Convert Pitch Contour Based on Korean Prosodic Phrases (한국어 운율구 기반의 피치궤적 변환의 통계적 접근)

  • Lee, Ki-Young
    • The Journal of the Acoustical Society of Korea
    • /
    • v.23 no.1E
    • /
    • pp.10-15
    • /
    • 2004
  • In speech conversion from a source speaker to a target speaker, it is important that the pitch contour of the source speaker's utterance be converted into that of the target speaker, because the pitch contour of an utterance plays an important role in expressing the speaker's individuality and the meaning of the utterance. This paper describes statistical algorithms for pitch contour conversion in Korean. Pitch contour conversion is investigated at two levels of prosodic phrases: the intonational phrase and the accentual phrase. The basic algorithm is Gaussian normalization [7] within the intonational phrase. The first presented algorithm combines it with a declination line of the pitch contour in an intonational phrase. The second is Gaussian normalization within accentual phrases to compensate for local pitch variations. Experimental results show that Gaussian normalization within accentual phrases is significantly more accurate than the other two algorithms, which operate on the intonational phrase.
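
Gaussian normalization of pitch is usually stated as shifting and scaling the source values so that their mean and standard deviation match those of the target. The following Python sketch shows that mapping applied per phrase. It is an illustration under assumptions rather than the paper's algorithm: the function names are hypothetical, the statistics are assumed to be given (in practice they would be estimated from training data), and the sketch works on raw F0 values in Hz, although the same mapping can be applied to log F0.

```python
def convert_pitch(f0, src_mean, src_std, tgt_mean, tgt_std):
    """Map source F0 values so their mean and spread match the target's."""
    return [(f - src_mean) / src_std * tgt_std + tgt_mean for f in f0]

def convert_by_phrase(phrases, src_stats, tgt_stats):
    """Apply the normalization separately to each phrase.

    `phrases` is a list of F0 lists, one per (accentual or intonational)
    phrase; `src_stats` and `tgt_stats` hold a (mean, std) pair per phrase.
    """
    return [
        convert_pitch(phrase, s_mean, s_std, t_mean, t_std)
        for phrase, (s_mean, s_std), (t_mean, t_std)
        in zip(phrases, src_stats, tgt_stats)
    ]

# Made-up numbers, for illustration only.
converted = convert_pitch([100.0, 120.0, 140.0], 120.0, 20.0, 200.0, 10.0)
```

Using one (mean, std) pair for a whole intonational phrase corresponds to the coarser level; supplying a separate pair for each accentual phrase lets the mapping follow local pitch variation.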

A Study On the Relation between Eojeol and Prosodic Phrase (어절 구성과 운율구 형성과의 관계에 대한 연구 - 관형사형 전성어미를 중심으로 -)

  • Park, Mi-Kyoung
    • Proceedings of the KSPS conference
    • /
    • 2004.05a
    • /
    • pp.165-170
    • /
    • 2004
  • The aim of this paper is to study the relation between the eojeol and the prosodic phrase in Korean. The two Korean adnominal endings '-ㄴ' and '-ㄹ' give rise to different prosodic phrasing: 1) In eojeols of 1 to 2 syllables, '-ㄴ' has no prosodic phrase in front of the eojeol and an accentual phrase at the end of the eojeol. In contrast, '-ㄹ' has an accentual phrase in front of the eojeol but none at the end of the eojeol. 2) In eojeols of more than 3 syllables, '-ㄴ' has accentual phrases on the edge of the eojeol, but '-ㄹ' has an accentual phrase at the end of the eojeol.

The Rule of Duration Variation For Natural Female Synthetic Speech (자연스러운 여성 합성음을 위한 지속시간 규칙에 관한 연구)

  • Choi Young-Ig;Kwon Chul-Hong
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • spring
    • /
    • pp.3-6
    • /
    • 1999
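  • The naturalness of synthesized speech is related to prosody, which arises from the interplay of three elements: duration, intensity, and pitch. This study analyzes the duration patterns found in Korean female speech in order to formulate duration rules. We examined the intrinsic duration of each phoneme (consonants and vowels) and how duration changes with syllable position within a word, the influence of adjacent phonemes, and the influence of phrase and clause boundaries, and established duration rules accordingly. Listening tests show that these duration rules improved the naturalness of the synthesized speech.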

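An Experimental Phonetic Study of the Correlation between Prosodic Phrases and Dialogue Sentence Structure (운율구와 대화체 문장구조의 상관관계에 대한 실험음성학적 연구)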

  • Seong Cheol-Jae
    • Proceedings of the KSPS conference
    • /
    • 1996.10a
    • /
    • pp.323-332
    • /
    • 1996
  • Current speech technology aims to achieve much clearer and more natural synthetic speech. Naturalness can be improved by adequate phrasing of the target sentence, which appears to be strongly related to both syntactic and phonetic aspects simultaneously. The present study aims, on the one hand, to describe the relation between syntactic structure and prosodic phrasing in dialogue speech and, on the other, to establish a suitable phrasing pattern for the purpose of achieving more natural synthetic speech. The prosodic phrase here means a prosodic unit that can be clearly identified, from both perceptual and acoustic viewpoints, as having an evident break boundary at its final position in a sentence. The end of each prosodic phrase is accordingly marked as a major boundary in the sentence.

Aspects of Prosodic Phrases' Formation Produced by Chinese Speakers in the Reading of Korean Text (낭독체에 나타난 중국인 학습자들의 운율구 실현 양상 -청취실험을 바탕으로-)

  • Yune, Young-Sook
    • Speech Sciences
    • /
    • v.15 no.4
    • /
    • pp.29-41
    • /
    • 2008
  • The purpose of this paper is to examine how Chinese speakers realize Korean prosodic phrases when reading Korean texts. The prosodic phrase, in this study, is defined as a basic unit of spoken language that can be perceived as a distinct phonetic unit by both hearer and speaker, and that is realized with a coherent intonational configuration. The prosodic phrase plays an important role in both speech production and perception. In second language acquisition, prosody influences the accuracy and fluency of spoken language. The main purpose of this study is to describe the syntagmatic operation of prosody that produces prosodic phrases. We specifically examined the relation between a prosodic phrase's boundary and its syntactic status. Furthermore, we examined the internal syntactic structure of each prosodic phrase. The results of each analysis were compared with the prosodic phrase formation of native Korean speakers. The results show that Chinese speakers tend to align prosodic phrases with syntactic structure more than native Korean speakers do.

IP generating factors and rules of read speech and dialogue in Korean (대화체와 낭독체의 억양구 형성에 관한 연구)

  • Park Jihye
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • spring
    • /
    • pp.285-288
    • /
    • 2002
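  • This paper divides utterance types into dialogue and read speech and examines how intonational phrases (IPs) are formed in each type. The experiments showed that read speech produced more intonational phrases in cases where two or more intonational phrases were generated within a sentence and in conjoined sentences. Cases in which dialogue produced more intonational phrases were mainly those where an intonational phrase was formed after the subject, and in dialogue utterances there were no cases of two or more intonational phrases being formed within a single sentence. These results indicate that the formation of intonational phrases is influenced not only by the number of syllables but also by sentence structure, and that these two factors apply differently depending on the utterance type.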

A Pre-Selection of Candidate Units Using Accentual Characteristic In a Unit Selection Based Japanese TTS System (일본어 악센트 특징을 이용한 합성단위 선택 기반 일본어 TTS의 후보 합성단위의 사전선택 방법)

  • Na, Deok-Su;Min, So-Yeon;Lee, Kwang-Hyoung;Lee, Jong-Seok;Bae, Myung-Jin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.26 no.4
    • /
    • pp.159-165
    • /
    • 2007
  • In this paper, we propose a new method for pre-selecting candidate units that is suited to unit-selection-based Japanese TTS systems. Conventional pre-selection is performed by calculating a context-dependent cost within an intonation phrase (IP). Unlike other languages, however, Japanese has an accent represented by relative pitch height, and several words form a single accentual phrase (AP). Moreover, prosody in Japanese changes in units of accentual phrases. Reflecting this prosodic change in pre-selection can improve the quality of synthesized speech. Furthermore, calculating the context-dependent cost within an accentual phrase makes synthesis faster than calculating it within an intonation phrase. The proposed method defines the AP, analyzes the AP in context, and performs pre-selection using accentual phrase matching, which calculates the connected context length (CCL) of the candidates for each phoneme to be synthesized in each accentual phrase. The baseline system is VoiceText, a synthesizer from Voiceware. Evaluations were made on perceptual errors (intonation error, concatenation mismatch error) and synthesis time. The experimental results showed that the proposed method improved the quality of synthesized speech and shortened the synthesis time.