Search | Korea Science

A Computation Study of Prosodic Structures of Korean for Speech Recognition and Synthesis:Predicting Phonological Boundaries (음성인식.합성을 위한 한국어 운율단위 음운론의 계산적 연구:음운단위에 따른 경계의 발견)

Lee, Chan-Do
- The Transactions of the Korea Information Processing Society / v.4 no.1 / pp.280-287 / 1997
Introducing phonological knowledge and prosodic information into speech recognition and synthesis systems is very important for building successful spoken language systems. First, related work in computational phonology is reviewed, and theoretical and experimental studies of prosodic structures and boundaries in Korean are summarized. The main focus of this study is to decide which prosodic phrasing trained on a simple recurrent network. The results show information other than phonetic features. This method can be combined with other useful information to predict the boundaries more accurately and to help segmentation, both of which are vital for successful speech recognition and synthesis systems.

Aspects of Chinese Korean learners' production of Korean aspiration at different prosodic boundaries (운율 층위에 따른 중국인학습자들의 한국어 유기음화 적용 양상)

Yune, Youngsook
- Phonetics and Speech Sciences / v.9 no.4 / pp.9-17 / 2017
The aim of this study is to examine whether Chinese learners of Korean (CKL) can correctly produce aspiration in sequences of a lenis obstruent (/k/, /t/, /p/, /ʧ/) followed by /h/ at the lexical and post-lexical levels. For this purpose, 4 Korean native speakers (KNS), 10 advanced CKL, and 10 intermediate CKL participated in a production test. The material analyzed consisted of 10 Korean sentences in which aspiration can apply at different prosodic boundaries (syllable, word, accentual phrase). The results showed that, for both KNS and CKL, the rate of application of aspiration differed according to the prosodic boundary. Aspiration was applied more frequently at the lexical level than at the post-lexical level, and more frequently at the word boundary than at the accentual phrase boundary. For CKL, pronunciation errors were either non-application of aspiration or omission of the coda obstruent. In the case of non-application of aspiration, CKL produced the target syllable in its underlying form and did not transform it into its surface form. In the case of coda obstruent omission, most of the errors were caused by the inherent complexity of the phonological process.
https://doi.org/10.13064/KSSS.2017.9.4.009

Prosodic Annotation in a Thai Text-to-speech System

Potisuk, Siripong
- Proceedings of the Korean Society for Language and Information Conference / 2007.11a / pp.405-414 / 2007
This paper describes preliminary work on the prosody modeling aspect of a text-to-speech system for Thai. Specifically, the model is designed to predict symbolic markers from text (i.e., prosodic phrase boundaries, accent, and intonation boundaries), and then to use these markers to generate pitch, intensity, and durational patterns for the synthesis module of the system. In this paper, a novel method for annotating the prosodic structure of Thai sentences based on a dependency representation of syntax is presented. The goal of the annotation process is to predict from text the rhythm of the input sentence when spoken according to its intended meaning. The encoding of the prosodic structure is established by minimizing speech disrhythmy while maintaining congruency with syntax. That is, each word in the sentence is assigned a prosodic feature called strength dynamic, which is based on the dependency representation of syntax. The strength dynamics assigned are then used to obtain rhythmic groupings in terms of a phonological unit called the foot. Finally, the foot structure is used to predict the durational pattern of the input sentence. This process has been tested on a set of ambiguous sentences representing various structural ambiguities involving five types of compounds in Thai.

Prediction of Prosodic Boundaries Using Dependency Relation

Kim, Yeon-Jun;Oh, Yung-Hwan
- The Journal of the Acoustical Society of Korea / v.18 no.4E / pp.26-30 / 1999
This paper introduces a prosodic phrasing method for Korean to improve the naturalness of speech synthesis, especially in text-to-speech conversion. In prosodic phrasing, it is necessary to understand the structure of a sentence through language processing procedures such as part-of-speech (POS) tagging and parsing, since syntactic structure correlates better with the prosodic structure of speech than other factors do. In this paper, the prosodic phrasing procedure is treated from two perspectives: dependency parsing and prosodic phrasing using dependency relations. This is appropriate for Ural-Altaic languages, since a prosodic boundary in speech usually coincides with the governor of a dependency relation. In experiments, the proposed method achieved a 12% improvement in prosodic boundary prediction accuracy on a speech corpus consisting of 300 sentences uttered by 3 speakers.

A cross-modal naming study: Effects of prosodic boundaries on the comprehension of relative clauses in Japanese

Kang, Soyoung;Kashiwagi, Akiko;Nakayama, Mineharu;Speer, Shari R.
- Cross-Cultural Studies / v.24 / pp.157-169 / 2011
Compared to studies of prosodic effects on the comprehension of syntactic ambiguity in English, relatively few have investigated prosodic effects in East Asian languages. This study examined the role of prosodic information in processing syntactically ambiguous sentences in Japanese. For syntactically ambiguous sentences containing relative clauses, this paper investigated whether prosodic information is immediately available during the processing of these sentences. Results from an auditory comprehension experiment with an on-line, cross-modal naming task seemingly suggest that, contrary to the findings from an off-line study that examined the same constructions, prosodic information may not be immediately available to Japanese listeners. A possible account of the failure to obtain effects of prosodic information is provided.

Automatic Synthesis Method Using Prosody-Rich Database (대용량 운율 음성데이타를 이용한 자동합성방식)

Kim, Sang-Hun (김상훈)
- Proceedings of the Acoustical Society of Korea Conference / 1998.08a / pp.87-92 / 1998
In general, synthesis unit databases have been constructed by recording isolated words. In that case, each word boundary has a typical prosodic pattern, such as falling intonation or pre-boundary lengthening. To get natural synthetic speech from this kind of database, the original speech must be artificially distorted. However, that artificial process instead resulted in unnatural, unintelligible synthetic speech because of the excessive prosodic modification of the speech signal. To overcome these problems, we gathered thousands of sentences for the synthesis database. To make phone-level synthesis units, we trained a speech recognizer on the recorded speech and then segmented phone boundaries automatically. In addition, we used a laryngograph for epoch detection. From the automatically generated synthesis database, we chose the best phone and concatenated it directly without any prosody processing. To select the best phone among multiple phone candidates, we used prosodic information such as the break strength of word boundaries, phonetic contexts, cepstrum, pitch, energy, and phone duration. In a pilot test, we obtained some positive results.

Acquisition of prosodic phrasing and edge tones by Korean learners of English

Choe, Wook Kyung
- Phonetics and Speech Sciences / v.8 no.4 / pp.31-38 / 2016
The purpose of the current study was to examine the acquisition of second language prosody by Korean learners of English. Specifically, this study investigated Korean learners' patterns of prosodic phrasing and their use of edge tones (i.e., phrase accents and boundary tones) in English, and then compared the patterns with those of native English speakers. Eight Korean learners and 8 native speakers of English read 5 different English passages. Both groups' patterns of tones and prosodic phrasing were analyzed using the Mainstream American English Tones and Break Indices (MAE_ToBI) transcription conventions. The results indicated that the Korean learners chunked their speech into prosodic phrases more frequently than the native speakers did. This frequent prosodic phrasing pattern was especially noticeable in sentence-internal prosodic phrases, often where there was no punctuation mark. Tonal analyses revealed that the Korean learners put significantly more High phrase accents (H-) on their sentence-internal intermediate phrase boundaries than the native speakers of English did. In addition, compared with the native speakers, the Korean learners used significantly more High boundary tones (both H-H% and L-H%) for sentence-internal intonational phrases, while they used a similar proportion of High boundary tones for sentence-final intonational phrases. Overall, the results suggested that Korean learners of English successfully acquired the meanings and functions of prosodic phrasing and edge tones in English, and that they are able to use these prosodic features efficiently to convey their own discourse intentions.
https://doi.org/10.13064/KSSS.2016.8.4.031

A Prosodic Labeling System of Intonation Patterns and Prosodic Structures in Korean

Cho, Yong-Hyung
- Speech Sciences / v.4 no.1 / pp.113-133 / 1998
The system proposed in this paper prosodically transcribes the intonation patterns, prosodic structures, phrasings, and other prosodic aspects of Korean utterances on four parallel tiers: a tone tier, an orthographic tier, a break index tier, and a miscellaneous tier. The tone tier employs two phrase accents (L* and H*), three accentual phrase boundary tones (L-, H-, LH-), and four intonational phrase boundary tones (L%, H%, LH%, LHL%) in order to provide a phonological transcription of pitch events associated with accented syllables and phrase boundaries. The break index tier uses five break indices, numbered from 0 to 4, which mark the prosodic grouping of words and the prosodic structure of an utterance. Among the five indices, break index 3 and break index 4 align with an accentual phrase boundary tone and an intonational phrase boundary tone, respectively, in the tone tier.

Analysis and Prediction of Prosodic Phrage Boundary (운율구 경계현상 분석 및 텍스트에서의 운율구 추출)

Kim, Sang-Hun;Seong, Cheol-Jae;Lee, Jung-Chul
- The Journal of the Acoustical Society of Korea / v.16 no.1 / pp.24-32 / 1997
This study aims, on the one hand, to describe the relationship between syntactic structure and prosodic phrasing and, on the other, to establish a suitable phrasing pattern for producing more natural synthetic speech. To get meaningful results, all the word boundaries in the prosodic database were statistically analyzed and assigned the proper boundary type. The resulting 10 types of prosodic boundaries were classified into 3 types according to the strength of the break: zero, minor, and major. We found that durational information was the main cue for determining the major prosodic boundary. Using bigrams and trigrams of syntactic information, we predicted the major and minor classification of boundary types. With the bigram model, we obtained correct major break prediction rates of 4.60% and 38.2%, and insertion error rates of 22.8% and 8.4%, on the Test-I and Test-II text databases respectively. With the trigram model, we obtained correct major break prediction rates of 58.3% and 42.8%, and insertion error rates of 30.8% and 11.8%, on the Test-I and Test-II text databases respectively.

The Role of Pitch Range Reset in Korean Sentence Processing

Kong, Eun-Jong
- Phonetics and Speech Sciences / v.2 no.1 / pp.33-39 / 2010
This study investigates the effect of pitch range reset on Korean listeners' processing of syntactically ambiguous participle structures. Unlike in Japanese and English, in Korean the downtrend or the reset of pitch range does not consistently differentiate Accentual Phrases (AP), a lower level of phrasing, from Intonational Phrases (IP), a higher level of phrasing. Therefore, we explore Korean listeners' comprehension patterns for syntactically ambiguous speech strings varying in 1) the relative height of F0 peaks across prosodic units, and 2) the types of prosodic phrasing, to see whether pitch range reset informs the recovery of syntactic structure even though it is not reflected in the intonational hierarchy in Korean. The results show that the hierarchical level of prosodic phrasing affects the parsing pattern of syntactic ambiguity. The pitch range reset also cued the location of syntactic boundaries, but this effect was confined to phrases across AP.
