• Title/Summary/Keyword: Prosody


Post-focus compression is not automatically transferred from Korean to L2 English

  • Liu, Jun;Xu, Yi;Lee, Yong-cheol
    • Phonetics and Speech Sciences / v.11 no.2 / pp.15-21 / 2019
  • Korean and English are both known to show on-focus pitch range expansion and post-focus pitch range compression (PFC). But it is not clear if this prosodic similarity would make it easy for Korean speakers to learn English focus prosody. In the present study, we conducted a production experiment using phone number strings to examine whether Korean learners of English produce a native-like focus prosody. Korean learners of English were classified into three groups (advanced, intermediate and low) according to their English proficiency and were compared to native speakers. Results show that intermediate and low groups of speakers did not increase duration, intensity, and pitch in the focus positions, nor did they compress those cues in the post-focus positions. Advanced speakers noticeably increased the acoustic cues in the focus positions to a similar extent as native speakers. However, their performance in post-focus positions was quite far from that of native speakers in terms of pitch and excursion size. These results thus demonstrate a lack of positive transfer of focus prosody from Korean to English in L2 learning, and learners may have to relearn it from scratch, which is consistent with a previous finding. More importantly, the results provide further support for the view proposed in other works that acoustic properties of PFC were not easily transferred from one language to another.
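The abstract above compares pitch and excursion size in focus and post-focus positions but does not say how excursion size is computed. The sketch below assumes one common definition, the span between the highest and lowest F0 in a syllable expressed in semitones. The function names and the sample F0 values are hypothetical and are not taken from the paper.

```python
import math


def semitones(f_hz, ref_hz):
    """Distance in semitones from ref_hz up to f_hz (both in Hz)."""
    return 12.0 * math.log2(f_hz / ref_hz)


def excursion_size(f0_hz):
    """Max-to-min F0 span of one syllable, in semitones.

    Assumption: frames with F0 <= 0 mark unvoiced stretches and are ignored.
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        raise ValueError("no voiced frames in this syllable")
    return semitones(max(voiced), min(voiced))


# Hypothetical F0 tracks (Hz) for one digit under focus and one after focus.
focused = [180.0, 210.0, 240.0, 200.0]
post_focus = [150.0, 155.0, 160.0, 152.0]

print(round(excursion_size(focused), 2), round(excursion_size(post_focus), 2))
```

Under this definition, post-focus compression would show up as a smaller excursion size after the focused item than on it.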

The Study on Korean Prosody Generation using Artificial Neural Networks (인공 신경망의 한국어 운율 발생에 관한 연구)

  • Min Kyung-Joong;Lim Un-Cheon
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.337-340 / 2004
  • Accurately reproduced prosody is one of the key factors that affect the naturalness of speech synthesized by a TTS system. In general, prosodic rules have been gathered either from linguistic knowledge or by analyzing the prosodic information in natural speech. But such rules cannot be perfect, and some of them may be incorrect. We therefore propose artificial neural networks (ANNs) that can be trained to learn the prosody of natural speech and generate it. In the learning phase, the ANNs learn the pitch and energy contour of the center phoneme: a string of phonemes from a sentence is applied to the ANNs, the output pattern is compared with the target pattern, and the weights are adjusted to minimize the mean square error between them. In the test phase, the estimation rates were computed. We found that the ANNs could generate the prosody of a sentence.
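The learning phase described above (compare output with target, adjust weights to reduce the mean square error) can be illustrated with a least-mean-square (delta rule) update. This is only a sketch on a single linear unit, not the network used in the paper; the feature encoding, target values, learning rate, and function names are all assumptions made for illustration.

```python
def predict(w, b, x):
    """Output of one linear unit for input vector x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def mean_square_error(inputs, targets, w, b):
    """Mean square error between unit outputs and target values."""
    errors = [(predict(w, b, x) - t) ** 2 for x, t in zip(inputs, targets)]
    return sum(errors) / len(errors)


def train_lms(inputs, targets, lr=0.1, epochs=500):
    """Delta-rule training: nudge weights against the output error."""
    w = [0.0] * len(inputs[0])
    b = 0.0
    for _ in range(epochs):
        for x, t in zip(inputs, targets):
            err = predict(w, b, x) - t            # output minus target
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b


# Hypothetical data: two binary context features per phoneme and a
# normalized pitch target for the center phoneme.
inputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
targets = [0.6, 0.3, 0.8, 0.1]

w, b = train_lms(inputs, targets)
print(mean_square_error(inputs, targets, w, b))
```

A multilayer network would replace the single unit with hidden layers and backpropagation, but the compare-and-adjust loop has the same shape.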


How to Express Emotion: Role of Prosody and Voice Quality Parameters (감정 표현 방법: 운율과 음질의 역할)

  • Lee, Sang-Min;Lee, Ho-Joon
    • Journal of the Korea Society of Computer and Information / v.19 no.11 / pp.159-166 / 2014
  • In this paper, we examine the role of emotional acoustic cues including both prosody and voice quality parameters for the modification of a word sense. For the extraction of prosody parameters and voice quality parameters, we used 60 pieces of speech data spoken by six speakers with five different emotional states. We analyzed eight different emotional acoustic cues, and used a discriminant analysis technique in order to find the dominant sequence of acoustic cues. As a result, we found that anger has a close relation with intensity level and 2nd formant bandwidth range; joy has a relative relation with the position of 2nd and 3rd formant values and intensity level; sadness has a strong relation only with prosody cues such as intensity level and pitch level; and fear has a relation with pitch level and 2nd formant value with its bandwidth range. These findings can be used as a guideline for fine-tuning an emotional spoken language generation system, because these distinct sequences of acoustic cues reveal the subtle characteristics of each emotional state.

The Relationship Between Perception of Prosody, Pitch Discrimination, and Melodic Contour Identification in Cochlear Implants Recipients (인공와우이식 난청인의 말소리 운율변화에 따른 구어 이해와 음도 변별, 선율윤곽 확인 간 관련성)

  • Kim, Eun Yeon;Moon, Il Joon;Cho, Yang-sun;Chung, Won-ho;Hong, Sung Hwa
    • Journal of Music and Human Behavior / v.14 no.2 / pp.1-18 / 2017
  • The relationships between the ability to understand changes in meaning depending on the prosody of spoken words and the ability to perceive pitch and melodic contour in cochlear implant (CI) recipients were examined. Fifteen postlingual CI recipients were measured in terms of speech prosody perception, speech perception, pitch discrimination (PD), and melodic contour identification (MCI). The speech prosody perception test consisted of words with positive (PW) and neutral meaning (NW). Participants were asked to identify the meaning of words depending on the conditions of positive and negative prosody. The MCI consisted of subtests 1 and 2 with different chance levels to choose. Then, the relationships between speech prosody perception, speech perception, PD, and MCI performance were analyzed. There was a significant difference in identifying the meaning of words expressed in a different prosody between the PW and NW conditions. Speech prosody perception showed a significant correlation with MCI 1 while there was no significant relationship with speech perception. Although speech perception may be possible after CI, limited spoken word comprehension due to decreased sensitivity for prosodic changes may persist in CI recipients. In addition, there was a limitation in perception of melodic contour change compared to pitch discrimination, which is related to speech prosody perception.

Syntactic Ambiguities and their Resolution in Prosody in Japanese (일본어 유악센트 방언과 무악센트 방언의 통사적 애매성의 해소와 운율적 특징)

  • Choi, Young-Sook
    • Speech Sciences / v.9 no.3 / pp.211-221 / 2002
  • Prosody can play a crucial role in differentiating ambiguous sentences to correctly reflect their intended syntactic structures. In what way do speakers of the Tokyo and Sendai dialects of Japanese use prosodic elements to differentiate syntactic ambiguities? Acoustic measurements were made of utterances of ambiguous sentences in Japanese to observe prosodic strategies for disambiguation. Materials were sentences of the type ADV-VP1-NP-VP2, ADV-NP1-NP2-VP2, where the ambiguity lies in locative adverbial modification, ADV modifying either VP1 or VP2. For this construction the Japanese create the same ambiguities. After defining the depth of a syntactic boundary, F0 of the phrase before and after the boundary, and duration of the syllable and pause before the boundary were measured. The results show that Tokyo dialect speakers use F0 after the syntactic boundary, whereas Sendai dialect speakers use the duration of the syllable and/or pause before the boundary.


Sensitivity to Phrase-initial Tone and Laryngeal Feature Identification of Foreign Learners of Korean

  • Lee, Hye-Sook
    • Phonetics and Speech Sciences / v.2 no.3 / pp.91-99 / 2010
  • This paper reports on an identification test where KFL learners identified the Korean three-way laryngeal contrast in the phrase-initial position, when the phrase-initial tone was systematically manipulated. It turns out that heritage learners have some sensitivity to phrase-initial tone and show a plain-aspirated alternation in their identification according to the phrase-initial tone, as native speakers do, whereas non-heritage students do not show such tone sensitivity. However, after weekly prosody training, second-year non-heritage students showed a significant improvement in their performance. This paper clearly shows that the phrase-initial tone plays a critical role in distinguishing laryngeal features of Korean obstruents, and also suggests that prosody including the tone-segment correlation should be incorporated in the KFL curriculum.


Implementation of Korean TTS System based on Natural Language Processing (자연어 처리 기반 한국어 TTS 시스템 구현)

  • Kim Byeongchang;Lee Gary Geunbae
    • MALSORI / no.46 / pp.51-64 / 2003
  • In order to produce high-quality synthesized speech, it is very important to get an accurate grapheme-to-phoneme conversion and prosody model from texts using natural language processing. Robust preprocessing for non-Korean characters is also required. In this paper, we analyzed Korean texts using a morphological analyzer, part-of-speech tagger and syntactic chunker. We present a new grapheme-to-phoneme conversion method for Korean using a hybrid method with a phonetic pattern dictionary and CCV (consonant vowel) LTS (letter to sound) rules, for unlimited vocabulary Korean TTS. We constructed a prosody model using a probabilistic method and decision tree-based method. The probabilistic method alone usually suffers from performance degradation due to inherent data sparseness problems. So we adopted tree-based error correction to overcome these training data limitations.


Aspects of Prosodic Phrases' Formation Produced by Chinese Speakers in the Reading of Korean Text (낭독체에 나타난 중국인 학습자들의 운율구 실현 양상 -청취실험을 바탕으로-)

  • Yune, Young-Sook
    • Speech Sciences / v.15 no.4 / pp.29-41 / 2008
  • The purpose of this paper is to examine how Chinese speakers realize Korean prosodic phrases when reading Korean texts. A prosodic phrase, in this study, is defined as a basic unit of spoken language that both hearer and speaker can perceive as a separate phonetic unit, and that is realized with a coherent intonational configuration. Prosodic phrases play an important role in both speech production and perception. In second language acquisition, prosody influences the accuracy and fluency of spoken language. The main purpose of this study is to describe the aspect of syntagmatic operation of prosody that produces prosodic phrases. We specifically examined the relations between a prosodic phrase's boundary and its syntactic status. Furthermore, we examined the internal syntactic structure of each prosodic phrase. The results of each analysis were compared to the formation of prosodic phrases produced by native Korean speakers. The results show that Chinese speakers tend to align prosodic phrases with syntactic structure more than native Korean speakers do.


The Role of Prosody in Dialect Synthesis and Authentication

  • Yoon, Kyu-Chul
    • Phonetics and Speech Sciences / v.1 no.1 / pp.25-31 / 2009
  • The purpose of this paper is to examine the viability of synthesizing Masan dialect from Seoul dialect and to examine the role of prosody in the authentication of the synthesized Masan dialect. The synthesis was performed by transferring one or more of the prosodic features of the Masan utterance onto the Seoul utterance. The hypothesis is that, given an utterance composed of the phonemes shared by both dialects, as more prosodic features of the Masan utterance are transferred onto the Seoul utterance, the Seoul utterance will be identified as a more authentic Masan utterance. The prosodic features involved were the fundamental frequency contour, the segmental durations, and the intensity contour. The synthesized Masan utterances were evaluated by thirteen native speakers of Masan dialect. The results showed that the fundamental frequency contour and the segmental durations had main effects on the perceptual shift from Seoul to Masan dialect.


A Study of Decision Tree Modeling for Predicting the Prosody of Corpus-based Korean Text-To-Speech Synthesis (한국어 음성합성기의 운율 예측을 위한 의사결정트리 모델에 관한 연구)

  • Kang, Sun-Mee;Kwon, Oh-Il
    • Speech Sciences / v.14 no.2 / pp.91-103 / 2007
  • The purpose of this paper is to develop a model for predicting the prosody of Korean text-to-speech synthesis using the CART and SKES algorithms. CART prefers a prediction variable in many instances. Therefore, a partition method by F-Test was applied to CART, which reduced the number of instances by grouping phonemes. Furthermore, the quality of the text-to-speech synthesis was evaluated after applying the SKES algorithm to the same data size. For the evaluation, MOS tests were performed on 30 men and women in their twenties. Results showed that the synthesized speech became clearer and more natural when the SKES algorithm was applied.
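The evaluation above relies on MOS (mean opinion score) tests. As a rough illustration, a MOS is the arithmetic mean of listener ratings for one system. The sketch below assumes the usual 1 to 5 rating scale, which the abstract does not state; the ratings and system labels are invented for illustration and are not results from the paper.

```python
from statistics import mean


def mean_opinion_score(ratings, low=1, high=5):
    """Average listener ratings for one system.

    Assumption: ratings lie on a low..high scale (1..5 by default).
    """
    if not ratings:
        raise ValueError("no ratings given")
    for r in ratings:
        if not low <= r <= high:
            raise ValueError(f"rating {r} is outside {low}..{high}")
    return mean(ratings)


# Hypothetical ratings from a handful of listeners for two systems.
ratings_by_system = {
    "system_a": [3, 3, 4, 2, 3],
    "system_b": [4, 4, 5, 3, 4],
}

for name, ratings in ratings_by_system.items():
    print(name, mean_opinion_score(ratings))
```

With a small listener panel, reporting the spread of ratings alongside the mean helps show whether a difference between systems is meaningful.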
