• Title/Summary/Keyword: Prosodic Characteristics

Effects of Prosodic Strengthening on the Production of English High Front Vowels /i, ɪ/ by Native vs. Non-Native Speakers (원어민과 비원어민의 영어 전설 고모음 /i, ɪ/ 발화에 나타나는 운율 강화 현상)

  • Kim, Sahyang;Hur, Yuna;Cho, Taehong
    • Phonetics and Speech Sciences / v.5 no.4 / pp.129-136 / 2013
  • This study investigated how the acoustic characteristics (i.e., duration, F1, F2) of English high front vowels /i, ɪ/ are modulated by boundary- and prominence-induced strengthening in native vs. non-native (Korean) speech production. The study also examined how the durational difference in vowels due to the voicing of a following consonant (i.e., voiced vs. voiceless) is modified by prosodic strengthening in the two speaker groups (native vs. non-native). Five native speakers of Canadian English and eight Korean learners of English (intermediate-advanced level) produced 8 minimal pairs with the CVC sequence (e.g., 'beat'-'bit') in varying prosodic contexts. Native speakers distinguished the two vowels in terms of duration, F1, and F2, whereas non-native speakers showed only durational differences. The two groups were similar in that they maximally distinguished the two vowels when the vowels were accented (F2, duration), while neither group showed boundary-induced strengthening in any of the three measurements. The durational differences due to the voicing of the following consonant were also maximized when accented. The results are discussed further in terms of the phonetics-prosody interface in L2 production.

Voice Conversion Using Linear Multivariate Regression Model and LP-PSOLA Synthesis Method (선형다변회귀모델과 LP-PSOLA 합성방식을 이용한 음성변환)

  • 권홍석;배건성
    • The Journal of the Acoustical Society of Korea / v.20 no.3 / pp.15-23 / 2001
  • This paper presents a voice conversion technique that modifies the utterance of a source speaker so that it sounds as if it were spoken by a target speaker. Feature parameter conversion methods that transform the vocal tract and prosodic characteristics of the source speaker into those of the target speaker are described. The transformation of vocal tract characteristics is achieved by modifying the LPC cepstral coefficients using Linear Multivariate Regression (LMR). Prosodic transformation is done by changing the average pitch period between speakers, and it is applied to the residual signal using the LP-PSOLA scheme. Experimental results show that speech transformed by the LMR and LP-PSOLA synthesis method retains many characteristics of the target speaker.
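
The two conversion steps named in this abstract can be illustrated with a minimal sketch. It assumes that the LMR step is an affine least-squares map between time-aligned source and target cepstral vectors, and that the prosodic step reduces to a single ratio of average pitch periods. The function names, the synthetic data, and the affine form are illustrative assumptions, not the authors' implementation, and LP-PSOLA resynthesis itself is not shown.

```python
import numpy as np

def fit_lmr(src, tgt):
    """Fit an affine map tgt ~ [src, 1] @ W by ordinary least squares.

    src, tgt: arrays of shape (frames, coeffs) holding time-aligned
    cepstral vectors (the alignment is assumed to be done elsewhere).
    """
    X = np.hstack([src, np.ones((src.shape[0], 1))])  # append bias column
    W, *_ = np.linalg.lstsq(X, tgt, rcond=None)
    return W

def apply_lmr(W, src):
    """Convert source cepstral vectors with a fitted regression matrix."""
    X = np.hstack([src, np.ones((src.shape[0], 1))])
    return X @ W

def pitch_scale_factor(src_periods, tgt_periods):
    """Ratio of average pitch periods (target / source)."""
    return float(np.mean(tgt_periods) / np.mean(src_periods))

# Synthetic stand-in data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
src = rng.normal(size=(200, 4))
tgt = src @ rng.normal(size=(4, 4)) + 0.5
W = fit_lmr(src, tgt)
converted = apply_lmr(W, src)
```

Because the synthetic target is an exact affine function of the source, `converted` reproduces `tgt` here; real speech data would leave a residual error.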

The Prosodic Characteristics of Children with Cochlear Implant with Respect to the Articulation Rate, Pause, and Duration (인공와우이식 아동의 운율 특성 - 조음속도와 쉼, 지속시간을 중심으로 -)

  • Oh, Soonyoung;Seong, Cheoljae
    • Phonetics and Speech Sciences / v.4 no.4 / pp.117-127 / 2012
  • This research reports the prosodic characteristics (articulation rate, pause characteristics, and duration) of children with cochlear implants with reference to those of children with normal hearing. Subjects are 8- to 10-year-old children, with the number of each gender balanced at 24. The dialogue speech data comprise four types of sentence patterns. The results show the following. 1) There is a statistically significant difference in articulation rate between the two groups. 2) Pauses are not observed in the exclamatory and declarative sentences of children with normal hearing. While imperative sentences show no statistical difference in the number of pauses between the two groups, interrogative sentences do. 3) Declarative, exclamatory, and interrogative sentences show a statistical difference between the two groups in the duration of the sentence-final two-syllable word, whereas imperative sentences show no difference. 4) For the RFP (the duration ratio of the sentence-final syllable to the penultimate syllable), no statistically significant difference exists between the two groups in any sentence type. 5) Lastly, the RWS (the ratio of the sentence-final two-syllable word duration to the whole-sentence duration) shows a statistical difference between the two groups in imperative sentences, but not in the other sentence types.
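
The two duration ratios defined in this abstract, RFP and RWS, are simple enough to sketch in code. The sketch below assumes that a sentence is given as a list of syllable durations, that the sentence-final word is exactly the last two syllables, and that the whole-sentence duration is the sum of the syllable durations (pauses excluded). These input conventions and the function names are assumptions made for illustration.

```python
def rfp(syllable_durations):
    """RFP: sentence-final syllable duration / penultimate syllable duration."""
    if len(syllable_durations) < 2:
        raise ValueError("need at least two syllables")
    return syllable_durations[-1] / syllable_durations[-2]

def rws(syllable_durations):
    """RWS: final two-syllable word duration / whole-sentence duration."""
    if len(syllable_durations) < 2:
        raise ValueError("need at least two syllables")
    final_word = sum(syllable_durations[-2:])
    return final_word / sum(syllable_durations)

# Hypothetical syllable durations in milliseconds for one sentence.
durations_ms = [200, 100, 100, 200, 400]
rfp_value = rfp(durations_ms)  # 400 / 200
rws_value = rws(durations_ms)  # (200 + 400) / 1000
```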

Pitch Contour Conversion Using Slanted Gaussian Normalization Based on Accentual Phrases

  • Lee, Ki-Young;Bae, Myung-Jin;Lee, Ho-Young;Kim, Jong-Kuk
    • Speech Sciences / v.11 no.1 / pp.31-42 / 2004
  • This paper presents methods that use Gaussian normalization to convert pitch contours based on prosodic phrases, along with experimental tests on a Korean database of 16 declarative sentences and the first sentences of the story 'The Three Little Pigs'. We propose a new conversion method that applies Gaussian normalization to the pitch deviation that remains after partial declination lines are subtracted from the pitch contour. A Gaussian normalization that uses the average values and standard deviations of an intonational phrase tends to lose individual local variability, and thus cannot modify the individual characteristics of a pitch contour from a source speaker to a target speaker. By using a partial declination line for each accentual phrase of the pitch contour, we avoid this problem. The experimental results show that this slanted Gaussian normalization, which subtracts the declination lines from the pitch contour of each accentual phrase, modifies the pitch contour more accurately than other methods using Gaussian normalization.
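
The contrast between plain and slanted Gaussian normalization can be sketched as follows. The sketch assumes that a declination line is a straight line fitted by least squares to the f0 values of one accentual phrase, and that the target speaker's declination line and the residual standard deviations of both speakers are supplied by the caller. The function names and parameters are hypothetical; the paper's actual estimation procedure may differ.

```python
import numpy as np

def gaussian_normalize(f0_src, mu_s, sd_s, mu_t, sd_t):
    """Plain Gaussian normalization: z-score with source statistics,
    then rescale with target statistics."""
    f0 = np.asarray(f0_src, dtype=float)
    return (f0 - mu_s) / sd_s * sd_t + mu_t

def slanted_normalize(t, f0_src, sd_s, tgt_slope, tgt_intercept, sd_t):
    """Slanted variant for one accentual phrase (illustrative sketch).

    1. Fit a declination line to the source f0 of the phrase.
    2. Scale the residual (pitch deviation) by sd_t / sd_s.
    3. Add the scaled residual to the target declination line.
    """
    t = np.asarray(t, dtype=float)
    f0 = np.asarray(f0_src, dtype=float)
    slope, intercept = np.polyfit(t, f0, 1)      # source declination line
    residual = f0 - (slope * t + intercept)      # local pitch deviation
    return tgt_slope * t + tgt_intercept + residual / sd_s * sd_t

# Hypothetical phrase: f0 = 120 - 4t plus deviations [1, -1, -1, 1] Hz.
t = [0, 1, 2, 3]
f0 = [121, 115, 111, 109]
converted = slanted_normalize(t, f0, sd_s=1.0, tgt_slope=-6.0,
                              tgt_intercept=200.0, sd_t=2.0)
```

Working phrase by phrase keeps the local deviations around each declination line, which is the variability that a single mean and standard deviation over the whole intonational phrase tends to flatten.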

Ambiguity Types of the Homonymic & Heterographic Units for Improving Korean Voice Recognition System - a Preliminary Research (한국어 음성인식 시스템 향상을 위한 동음이철 단위의 중의성 유형 분류)

  • Yoon, Ae-Sun;Kang, Mi-Young
    • Speech Sciences / v.15 no.4 / pp.67-81 / 2008
  • The accuracy rate of P2G (Phoneme-to-Grapheme) conversion is one of the important factors determining the quality of unlimited voice recognition (VR) systems. However, few studies have been conducted to reduce the ambiguities of a phoneme string, which can be segmented into a variety of different linguistic units (i.e., morphemes, words, eo-jeols) and thus transformed into more than one grapheme string. This paper is a preliminary study for building a large knowledge base of those homonymic & heterographic units (HHUs), which will provide unlimited Korean VR systems with more accurate P2G information. This paper analyzes two main factors generating HHUs: (1) boundary determination of the prosodic unit; (2) its segmentation into linguistic units. The linguistic characteristics that determine the variable boundaries of a prosodic unit are investigated, and the ambiguity types of HHUs are classified in accordance with their morphological and syntactic structures as well as with the phonological rules governing them.

Intonational Characteristics of Korean Focus Realization by American Learners of Korean

  • Oh, Mi-Ra;Kang, Sun-Mi;Kim, Kee-Ho
    • Speech Sciences / v.11 no.1 / pp.131-145 / 2004
  • The informative or important entities in utterances are focused, and the focused items are usually accompanied by changes in phonetic manifestation. Phonetic realizations triggered by focus include changes of tonal contours as well as segmental strengthening. Focus in Korean is characterized by new phrase initiation, dephrasing, and an initial tone contour with an enlarged pitch range, in addition to a lengthened initial segment. Focusing on the prosodic cues that play an important role in delivering the speaker's intention, this study aims to find out which intonational characteristics of Korean focus are realized by English-speaking learners of Korean. The learners are divided into two groups according to their fluency in Korean, and the differences in focus realization between the two groups are discussed. Furthermore, the phonological and phonetic realizations of focus by English-speaking learners of Korean are compared to those by native Korean speakers. The results of this study yield two suggestions for teaching Korean intonation to L2 learners. First, the comparison between the two speaker groups can give a better understanding of how and why the Korean intonation of English speakers differs from that of Koreans. Second, each phonological and phonetic characteristic of focus realization can be weighted differently, and its realization provides a criterion for evaluating L2 Korean proficiency.

Pitch Patterns of Interrogative Sentences in relation to the Focus (초점과 관련된 의문문 억양 패턴 실험)

  • Kim, Mi-Ran;Shin, Dong-Hyun;Choe, Jae-Woong;Kim, Kee-Ho
    • Speech Sciences / v.7 no.4 / pp.203-217 / 2000
  • In spoken language, the characteristics of prosodic realization are related to the meaning of the utterance. The pitch pattern of an interrogative sentence, which differs from that of declarative sentences, can be considered in this respect. If we consider the question-answer pair, we find that the most important variation comes from the intended meaning of asking. In this paper, we experiment with four kinds of interrogative sentences and show that the difference in their pitch patterns can be explained in relation to the focus phenomena; that is, the differences of the boundary tones in interrogative sentences are due to differences in the prosodic domain of focus. To relate the explanation to the focus phenomena, we divided focus into two categories: emphatic focus, which plays a role in delivering the speaker's intended meaning for the sentence interpretation, and informational focus, which delivers the central intended meaning of the utterance. The results can be summarized in three points. First, the High boundary tone delivers the meaning of asking. Second, the different boundary tones found in wh-questions and alternative questions are just phonetic variations caused by focusing. Third, the high-rise boundary tone in echo questions is related to the meaning of surprise or incredulity, and this relation agrees with the consensus view that the speaker's attitude of surprise can raise the pitch range. From these results we can distinguish between boundary type and phonetic variation, and we can also give appropriate meaning to the different boundary tones in interrogative sentences, which have been regarded as merely a part of sentence type.

An Acoustic Study on the Voice Imitation (3) - Based on a professional voice imitator's speech - (모방 발화의 음향음성학적 연구(3) -전문 성대 모사자의 자료를 중심으로-)

  • Ahn Byoung-seob;Park Mi-young
    • MALSORI / no.52 / pp.1-14 / 2004
  • In this study, we investigated the acoustic characteristics of imitated utterances by a professional voice imitator, focusing on prosodic properties such as vowel formants and f0 distribution. To see the patterns of voice imitation by a professional voice imitator, we compared the imitator's voice data with the target speakers' voice data. The professional imitator, Mr. Bae, produced utterances imitating the voices of the former President Kim, the comedian Choi, and the singer Bae. Auditorily, the imitator was judged to imitate all the target speakers' voices successfully. However, acoustic examination showed that the imitator was better at imitating the singer Bae's voice, in that the imitator's and the singer Bae's voices are more alike with respect to vowel formants and f0 distribution. We infer that this is because the imitator's normal voice is very similar to the singer Bae's voice. On the other hand, the patterns of vowel formants and f0 distribution found in the imitator's imitations of the other two target speakers were different from those of the target speakers' own voices.

Unit Generation Based on Phrase Break Strength and Pruning for Corpus-Based Text-to-Speech

  • Kim, Sang-Hun;Lee, Young-Jik;Hirose, Keikichi
    • ETRI Journal / v.23 no.4 / pp.168-176 / 2001
  • This paper discusses two important issues of corpus-based synthesis: synthesis unit generation based on phrase break strength information and pruning of redundant synthesis unit instances. First, a new sentence set for recording was designed to make an efficient synthesis database, reflecting the characteristics of the Korean language. To obtain prosodic context-sensitive units, we graded major prosodic phrases into 5 distinctive levels according to pause length and then discriminated intra-word triphones using the levels. Using the synthesis units with phrase break strength information, synthetic speech was generated and evaluated subjectively. Second, a new pruning method based on weighted vector quantization (WVQ) was proposed to eliminate redundant synthesis unit instances from the synthesis database. WVQ takes the relative importance of each instance into account when clustering similar instances using the vector quantization (VQ) technique. The proposed method was compared with two conventional pruning methods through objective and subjective evaluations of synthetic speech quality: one that simply limits the maximum number of instances, and the other based on normal VQ-based clustering. For the same reduction rate of the number of instances, the proposed method showed the best performance. The synthetic speech with a reduction rate of 45% had almost no perceptible degradation compared to the synthetic speech without instance reduction.
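
The grading of phrase breaks into five levels by pause length can be sketched as a simple threshold lookup. The abstract does not give the pause boundaries, so the threshold values below are hypothetical placeholders, as are the function names and the `@B<level>` label format used to tag a triphone with its break level.

```python
from bisect import bisect_right

# Hypothetical pause-length boundaries in milliseconds (not from the paper).
# Four boundaries split pause lengths into five break-strength levels.
PAUSE_THRESHOLDS_MS = (50, 150, 300, 600)

def break_level(pause_ms):
    """Map a pause length to a break-strength level from 1 (weakest) to 5."""
    return bisect_right(PAUSE_THRESHOLDS_MS, pause_ms) + 1

def label_unit(triphone, pause_ms):
    """Tag a triphone name with the break level of its phrase context."""
    return f"{triphone}@B{break_level(pause_ms)}"

levels = [break_level(p) for p in (0, 50, 200, 450, 700)]
tagged = label_unit("a-n+i", 200)
```

Tagging units this way lets otherwise identical triphones be stored and selected separately according to the strength of the neighboring phrase break.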

The effect of word length on f0 intervals: Evidence from North Kyungsang children

  • Kim, Jungsun
    • Phonetics and Speech Sciences / v.7 no.1 / pp.107-116 / 2015
  • The present experiment investigated the effect of word length on the size of f0 intervals for North Kyungsang children. To determine the f0 intervals, the f0 values at the midpoints of vowels in words were measured. F0 estimates were computed as intervals on a logarithmic scale, corresponding to the number of syllables in the words. The results indicated that the mean f0 intervals in words of different lengths differed significantly for the HH in HH vs. HHL and the LH in LH vs. LLH for North Kyungsang children. Adult speakers from the North Kyungsang region differed significantly only for the HH in HH vs. HHL. The adult speakers thus differed noticeably from the children in this characteristic. The result of the adult study was presented to confirm whether the children used a North Kyungsang dialect. With respect to individual speaker differences, the North Kyungsang children showed more or less consistent patterns in quantile-quantile plots for HH vs. HHL, but for HL vs. LHL and LH vs. LLH, there was more variation than for HH vs. HHL. The individual speakers' variation was the largest for HL vs. LHL and the smallest for HH vs. HHL. Considering these results, the effect of word length on f0 intervals tended to show pitch accent-type-specific characteristics in the process of prosodic acquisition.
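
An f0 interval on a logarithmic scale can be illustrated with the semitone, a common log-scale unit in which one octave (a doubling of f0) equals 12 semitones. The abstract states only that intervals were computed on a logarithmic scale, so the choice of semitones, the function names, and the example values below are assumptions.

```python
import math

def f0_interval_semitones(f0_a, f0_b):
    """Interval from f0_a to f0_b in semitones: 12 * log2(f0_b / f0_a)."""
    return 12.0 * math.log2(f0_b / f0_a)

def word_intervals(f0_midpoints):
    """Intervals between consecutive vowel-midpoint f0 values in a word."""
    return [f0_interval_semitones(a, b)
            for a, b in zip(f0_midpoints, f0_midpoints[1:])]

# Hypothetical vowel-midpoint f0 values (Hz) for a three-syllable word.
intervals = word_intervals([200.0, 400.0, 200.0])
```

A word with n syllables yields n - 1 intervals, so two-syllable and three-syllable words can be compared on the interval they share, such as the HH portion of HH vs. HHL.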