• Title/Summary/Keyword: Prosody

Semantic Prosody and Meaning Equivalence: Is Korean pin konggan Equivalent to ‘Empty Space’ or ‘Blank Space’? (의미운률과 의미 등가성: ‘빈 공간’은 ‘empty space’인가 ‘blank space’인가?)

  • 조의연
    • Korean Journal of English Language and Linguistics
    • /
    • v.3 no.4
    • /
    • pp.589-609
    • /
    • 2003
  • The purpose of this paper is to show that lexical equivalence in translation can be achieved when it is based on the semantic prosodies of lexical items. The paper examines the semantic prosodies of two seemingly synonymous English adjectives, ‘empty’ and ‘blank’, using the corpus in Cobuild English Collocations on CD-ROM, and proposes that they differ in terms of spatial dimensions. Thus, when the Korean equivalent pin, derived from the verb pita, is translated into English, the syntagmatic phraseological environment of the Korean adjective must be taken into account to achieve equivalence between the source and target languages. The relevant Korean corpus data were taken from the 21st Century Sejong Plan (2002). Of 12 examples of pin konggan, five appear to be equivalent to ‘blank’ and seven to ‘empty.’ This five-to-seven split in usage indicates that the equivalence problem for the lexical item pin is not a trivial matter in translation.

ToBI Based Prosodic Representation of the Kyungnam Dialect of Korean

  • Cho, Yong-Hyung
    • Speech Sciences
    • /
    • v.2
    • /
    • pp.159-172
    • /
    • 1997
  • This paper proposes a prosodic representation system for the Kyungnam dialect of Korean, based on the ToBI system. In this system, diverse intonation patterns are transcribed on four parallel tiers: a tone tier, a break index tier, an orthographic tier, and a miscellaneous tier. The tone tier employs pitch accents, phrase accents, and boundary tones marked with diacritics to represent various pitch events. The break index tier uses five break indices, numbered from 0 to 4, to represent degrees of connectedness in speech by associating each inter-word position with a break index. In this scheme, each break index marks the boundary of some kind of constituent. The system can contribute not only to a more detailed theory connecting prosody, syntax, and intonation, but also to current text-to-speech synthesis approaches, speech recognition, and other quantitative computational modeling.

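The four-tier layout described in the abstract above can be pictured as a small data structure. The following is only a minimal sketch under stated assumptions, not the representation used in the paper: the class name `ToBIUtterance`, the method `chunks`, the choice of one break index after every word (including the last), and the grouping threshold of 3 are all hypothetical. The tone labels in the usage lines are placeholders in the style of English ToBI, not the Kyungnam tone inventory.

```python
from dataclasses import dataclass, field


@dataclass
class ToBIUtterance:
    """Four parallel tiers for one utterance (hypothetical layout)."""

    words: list[str]            # orthographic tier
    break_indices: list[int]    # break index tier: one value (0-4) after each word
    tones: list[tuple[int, str]] = field(default_factory=list)  # tone tier: (word position, label)
    misc: list[tuple[int, str]] = field(default_factory=list)   # miscellaneous tier: (word position, note)

    def __post_init__(self) -> None:
        # Every word boundary must carry exactly one break index in the range 0-4.
        if len(self.break_indices) != len(self.words):
            raise ValueError("need exactly one break index per word")
        if any(b not in range(5) for b in self.break_indices):
            raise ValueError("break indices must be integers from 0 to 4")

    def chunks(self, min_break: int = 3) -> list[list[str]]:
        """Group words into constituents that end at a break index >= min_break."""
        groups, current = [], []
        for word, b in zip(self.words, self.break_indices):
            current.append(word)
            if b >= min_break:
                groups.append(current)
                current = []
        if current:  # trailing words not closed by a strong break
            groups.append(current)
        return groups


# Placeholder words and labels, for illustration only.
utt = ToBIUtterance(
    words=["w1", "w2", "w3", "w4"],
    break_indices=[1, 3, 1, 4],
    tones=[(1, "L-"), (3, "L%")],
)
print(utt.chunks())  # [['w1', 'w2'], ['w3', 'w4']]
```

Keeping the tiers as parallel lists indexed by word position mirrors the idea that each inter-word position is associated with one break index, while tone and miscellaneous labels attach only where something is marked.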

Listener's Age Estimation by Prosody Manipulation (운율 변조 양상에 따른 청자의 연령 지각)

  • Kim, Jiyoun;Seong, Cheoljae
    • Phonetics and Speech Sciences
    • /
    • v.6 no.2
    • /
    • pp.81-88
    • /
    • 2014
  • The normal aging process affects speech production, and these changes are perceived by listeners. This study examined whether age perception by normal listeners changed under various conditions of prosodic manipulation, comparing the prosodic changes according to age and sex in adulthood. The older and younger voices were resynthesized by manipulating speaking rate and pitch to shift the perceived age of the groups toward each other. Two-way repeated-measures ANOVAs were conducted to determine whether the type of resynthesized prosodic cue resulted in a significant shift in the perceived age of young and old voices. Manipulating the speaking rate resulted in a significant shift in perceived age for both the older and younger groups. A significant shift in age estimates was not observed for the younger male group when pitch was manipulated. There were significant gender-by-age group interactions for prosodic manipulation type. Age-related changes in the prosodic properties of speech may ultimately influence speech perception.

POSTTS : Corpus Based Korean TTS based on Natural Language Analysis (POSTTS : 자연어 분석을 통한 코퍼스 기반 한국어 TTS)

  • Ha Ju-Hong;Zheng Yu;Kim Byeongchang;Lee Geunbae
    • Proceedings of the KSPS conference
    • /
    • 2003.05a
    • /
    • pp.87-90
    • /
    • 2003
  • To produce high-quality synthesized speech, it is very important to obtain accurate grapheme-to-phoneme conversion and an accurate prosody model from text using natural language processing. Robust preprocessing for non-Korean characters is also required. In this paper, we analyzed Korean texts using a morphological analyzer, a part-of-speech tagger, and a syntactic chunker. We present a new grapheme-to-phoneme conversion method, a hybrid of dictionary-based and rule-based conversion, for unlimited-vocabulary Korean TTS. We constructed a prosody model using a probabilistic method and a decision tree-based method.

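The dictionary-plus-rule idea in the abstract above can be illustrated with a toy example: look a token up in an exception dictionary first, and fall back to rules only when there is no entry. This is a sketch under loud assumptions and is not the POSTTS method. The dictionary entry, the function names, and the single rule are hypothetical; the only rule implemented is simple liaison (a single final consonant moves into a following syllable that starts with the silent onset ㅇ), and real Korean grapheme-to-phoneme conversion needs many more rules (neutralization, nasalization, palatalization, and so on).

```python
BASE = 0xAC00  # first precomposed Hangul syllable in Unicode
INITIALS = "ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ"
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")
NULL_ONSET = INITIALS.index("ㅇ")
# Single final consonants that this toy rule moves: final index -> initial index.
MOVABLE = {FINALS.index(j): INITIALS.index(j) for j in "ㄱㄲㄴㄷㄹㅁㅂㅅㅆㅈㅊㅋㅌㅍ"}

# Hypothetical exception dictionary, e.g. a reading for a non-Korean token.
EXCEPTIONS = {"TTS": "티티에스"}


def decompose(ch):
    """Return [initial, medial, final] indices, or None for non-Hangul characters."""
    code = ord(ch) - BASE
    if not 0 <= code < 11172:
        return None
    return [code // 588, (code % 588) // 28, code % 28]


def compose(initial, medial, final):
    """Rebuild a precomposed syllable from its three indices."""
    return chr(BASE + (initial * 21 + medial) * 28 + final)


def apply_liaison(word):
    """Rule fallback: move a single final consonant into a following empty onset."""
    units = [decompose(ch) or ch for ch in word]
    for cur, nxt in zip(units, units[1:]):
        if isinstance(cur, list) and isinstance(nxt, list):
            if cur[2] in MOVABLE and nxt[0] == NULL_ONSET:
                nxt[0], cur[2] = MOVABLE[cur[2]], 0
    return "".join(compose(*u) if isinstance(u, list) else u for u in units)


def g2p(word):
    """Dictionary lookup first, rule-based fallback otherwise."""
    return EXCEPTIONS.get(word) or apply_liaison(word)


print(g2p("음악"))  # 으막
print(g2p("TTS"))   # 티티에스
```

Checking the dictionary before the rules lets irregular or non-Korean tokens bypass rules that would mishandle them, while the rules keep the vocabulary open-ended.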

An Acoustic Phonetic Study about Voice Imitation(2) -Focusing on Prosody Feature- (모방발화에 대한 음향음성학적 연구(2) -운율 특징을 중심으로-)

  • Park Miyoung;Park Jihye;Shin Jiyoung;Kang Sunmee
    • Proceedings of the KSPS conference
    • /
    • 2003.05a
    • /
    • pp.56-60
    • /
    • 2003
  • The purpose of this paper is to investigate voice imitation. Voice imitation changes various phonetic features, and in our experimental results the differences appear primarily in prosody. When imitating a voice, imitators mostly change their fundamental frequency (f0) bandwidth. They change their high f0 values effectively while maintaining their low f0 values. The excellent group is also distinctly better than the common group at imitating prosodic patterns. That is, the change in f0 bandwidth and the prosodic patterns are significant in voice imitation, but the low f0 is maintained by all speakers.

Variational autoencoder for prosody-based speaker recognition

  • Starlet Ben Alex;Leena Mary
    • ETRI Journal
    • /
    • v.45 no.4
    • /
    • pp.678-689
    • /
    • 2023
  • This paper describes a novel end-to-end deep generative model-based speaker recognition system using prosodic features. The usefulness of variational autoencoders (VAE) in learning the speaker-specific prosody representations for the speaker recognition task is examined herein for the first time. The speech signal is first automatically segmented into syllable-like units using vowel onset points (VOP) and energy valleys. Prosodic features, such as the dynamics of duration, energy, and fundamental frequency (F0), are then extracted at the syllable level and used to train/adapt a speaker-dependent VAE from a universal VAE. The initial comparative studies on VAEs and traditional autoencoders (AE) suggest that the former can efficiently learn speaker representations. Investigations on the impact of gender information in speaker recognition also point out that gender-dependent impostor banks lead to higher accuracies. Finally, the evaluation on the NIST SRE 2010 dataset demonstrates the usefulness of the proposed approach for speaker recognition.

Automatic severity classification of dysarthria using voice quality, prosody, and pronunciation features (음질, 운율, 발음 특징을 이용한 마비말장애 중증도 자동 분류)

  • Yeo, Eun Jung;Kim, Sunhee;Chung, Minhwa
    • Phonetics and Speech Sciences
    • /
    • v.13 no.2
    • /
    • pp.57-66
    • /
    • 2021
  • This study focuses on automatic severity classification of dysarthric speakers based on speech intelligibility. Speech intelligibility is a complex measure that is affected by features of multiple speech dimensions. However, most previous studies are restricted to features from a single speech dimension. To capture the characteristics of the speech disorder effectively, we extracted features from multiple speech dimensions: voice quality, prosody, and pronunciation. Voice quality consists of jitter, shimmer, harmonics-to-noise ratio (HNR), number of voice breaks, and degree of voice breaks. Prosody includes speech rate (total duration, speech duration, speaking rate, articulation rate), pitch (F0 mean/std/min/max/median/25th percentile/75th percentile), and rhythm (%V, deltas, Varcos, rPVIs, nPVIs). Pronunciation contains Percentage of Correct Phonemes (Percentage of Correct Consonants/Vowels/Total phonemes) and degree of vowel distortion (Vowel Space Area, Formant Centralization Ratio, Vowel Articulation Index, F2-Ratio). Experiments were conducted using various feature combinations. The experimental results indicate that using features from all three speech dimensions gives the best result, with an F1-score of 80.15, compared with using features from just one or two speech dimensions. The result implies that voice quality, prosody, and pronunciation features should all be considered in automatic severity classification of dysarthria.
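
Three of the rhythm measures named in the abstract above (%V, rPVI, nPVI) have simple closed-form definitions over interval durations. The sketch below uses the commonly cited definitions and is not taken from the paper; the function names and the example durations are hypothetical, and the study's own segmentation and normalization choices may differ.

```python
def percent_v(vocalic, consonantal):
    """%V: share of total duration taken up by vocalic intervals, in percent."""
    total = sum(vocalic) + sum(consonantal)
    if total <= 0:
        raise ValueError("total duration must be positive")
    return 100.0 * sum(vocalic) / total


def rpvi(durations):
    """Raw Pairwise Variability Index: mean absolute difference of successive intervals."""
    if len(durations) < 2:
        raise ValueError("need at least two intervals")
    diffs = [abs(a - b) for a, b in zip(durations, durations[1:])]
    return sum(diffs) / len(diffs)


def npvi(durations):
    """Normalized PVI: each difference is divided by the mean of its pair, times 100."""
    if len(durations) < 2:
        raise ValueError("need at least two intervals")
    terms = [abs(a - b) / ((a + b) / 2) for a, b in zip(durations, durations[1:])]
    return 100.0 * sum(terms) / len(terms)


# Made-up interval durations in seconds, for illustration only.
vowels = [0.10, 0.20, 0.10]
consonants = [0.05, 0.15, 0.20]
print(percent_v(vowels, consonants), rpvi(vowels), npvi(vowels))
```

The raw index keeps the unit of the input durations, while the normalized index divides each difference by the local mean, which makes it less sensitive to overall speaking rate.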

Notes on Descriptions of the Prosodic System in French Grammars in the Age of Enlightenment & the Departure of the International Phonetic Alphabet (계몽주의 시대 프랑스 문법서에서 기술한 운율 현상과 국제음성기호의 출발에 대한 고찰)

  • Park, Moon-Kyou
    • The Journal of the Korea Contents Association
    • /
    • v.21 no.4
    • /
    • pp.658-667
    • /
    • 2021
  • Our study aimed to analyze and reinterpret, from an acoustic perspective, descriptions of 18th-century prosody and to introduce the figurative pronunciation system, a forerunner of the International Phonetic Alphabet. Our methodology compares and analyzes grammars and documents on the transcription system and reconstructs the prosodic structure. It is certain that 18th-century grammarians widely accepted the prosody theories developed by Arnauld & Lancelot in the 17th century. In particular, grammarians accepted the dichotomous classification of accent into prosodic and oratorical accents. The prosodic accent is related to intonation, while the key elements of the oratorical accent are intonation and intensity. Regarding temporal structure, 18th-century grammarians systematically observed lengthening of the final syllable. This temporal structure is similar to that of today. We can therefore conclude that final lengthening, an essential characteristic of the modern French accent, was already embedded in 18th-century prosody. Despite this, the 18th-century grammarians did not assign it the status of accent, because of the prevailing assumption that equated accent with intonation.

Spontaneous Speech and Prosody DB (대화체 음성 및 운율 DB)

  • 이호영
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1995.06a
    • /
    • pp.298-301
    • /
    • 1995
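  • To develop a speech synthesizer that can produce natural conversational utterances, and a speech recognizer that can recognize conversational utterances with an unlimited vocabulary, a large and carefully constructed spontaneous speech and prosody DB is essential. This paper discusses in detail how to collect spontaneous speech data and how to build a spontaneous speech and prosody DB.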

The Song Dynasty Theory of the Poetic Eye in Shiren Yuxie (《시인옥설》에 나타난 송대 시안론)

  • Lee, Gyu-Il
    • 중국학논총
    • /
    • no.68
    • /
    • pp.95-114
    • /
    • 2020
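  • Shiren Yuxie (《詩人玉屑》), compiled by Wei Qingzhi (魏慶之) in the late Southern Song, is regarded as one of the three major collections of poetry talks (詩話) of the Song dynasty. The book gathers important theories and views of Northern and Southern Song literati on poetic form and poetic method. In particular, it preserves a wealth of remarks on the theory of the poetic eye (詩眼), an important part of Song poetics, which makes it valuable for study. The poetic eye refers to the focal point of a poem: the most refined word within a line or a piece, and also the place where the main idea of the whole poem lies. The concept developed from the painting theory of the Wei, Jin, and Northern and Southern Dynasties and from Chan Buddhist theory, and later entered Song poetics. The premise of the poetic eye is originality of diction, which requires discarding stale expressions and seeking new ones. Song writers drew on classical phrases, colloquialisms, dialect, and Chan language as linguistic material for composition. The use of full words (實字) and empty words (虛字) is an important part of the theory. Song writers held that, in crafting the poetic eye, full words outweigh empty words and verbs are valued over nouns. As for position, they shared the view that "in a five-character line the third character is the eye, and in a seven-character line the fifth character is the eye," while also stressing the flexible use of living words (活字), resonant words (響字), and tonally irregular words (拗字). This was not, however, a fixed principle to be rigidly observed; like the concept of the living method (活法), it could be applied flexibly.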