• Title/Abstract/Keywords: linguistic prosody

Search results: 18 items (processing time: 0.021 s)

PROSODY IN SPEECH TECHNOLOGY - National project and some of our related works -

  • Hirose Keikichi
    • 한국음향학회:학술대회논문집
    • /
    • 한국음향학회 2002년도 하계학술발표대회 논문집, Vol. 21, No. 1
    • /
    • pp.15-18
    • /
    • 2002
  • Prosodic features of speech are known to play an important role in the transmission of linguistic information in human conversation. Their role in the transmission of para- and non-linguistic information is even greater. Despite this importance, engineering research has focused mainly on segmental features and much less on prosodic features. To promote research on prosody, a research project, 'Prosody and Speech Processing', is now under way. The paper first gives a rough sketch of the project. It then introduces several prosody-related research efforts under way in our laboratory. They include corpus-based fundamental frequency contour generation, speech rate control for dialogue-like speech synthesis, analysis of prosodic features of emotional speech, reply speech generation in spoken dialogue systems, and language modeling with prosodic boundaries.


'Because of Doing' and 'Because of Happening': A Corpus-based Analysis of Korean Causal Conjunctives, -nula(ko) and -nun palamey

  • Oh, Sang-Suk
    • 한국언어정보학회지:언어와정보
    • /
    • Vol. 8, No. 2
    • /
    • pp.131-147
    • /
    • 2004
  • This paper examines the two Korean causal conjunctive suffixes, -nula(ko) and -nun palamey, based on corpus linguistic analysis. Many of the linguistic accounts available, both in pedagogical references and in the linguistics literature, provide incomplete analyses of these suffixes because they rely on fabricated linguistic data. Using naturally occurring linguistic data, this paper examines the syntactic and semantic structures of the two causal suffixes through three areas of corpus linguistic analysis: token frequencies, collocations, and semantic prosody. An analysis based on concordance data reveals that the two causal connectives, -nula(ko) and -nun palamey, have more differences than similarities in terms of syntactic and semantic constraints. The idiosyncratic structures of the two suffixes are discussed in terms of the same-subject condition, verb selection, the same-agent condition, the synchronicity condition, and negative semantic prosody.


좌반구 손상과 우반구 손상 뇌졸중 환자의 의문문 유형에 따른 운율 특성 비교 (Comparison of prosodic characteristics by question type in left- and right-hemisphere-injured stroke patients)

  • 유영미;성철재
    • 말소리와 음성과학
    • /
    • Vol. 13, No. 3
    • /
    • pp.1-13
    • /
    • 2021
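  • Prosody, which plays an important role in communication, is divided by function into linguistic prosody and affective prosody. From the perspective of cerebral lateralization, it is generally accepted that affective prosody is processed mainly by the right hemisphere, but studies of linguistic prosody report differing results because of methodological differences between studies. To examine linguistic prosody from the perspective of cerebral lateralization, this study compared three groups, 9 healthy speakers and 14 stroke patients (7 with left-hemisphere damage and 7 with right-hemisphere damage), analyzing prosodic features related to speech rate, duration, pitch, and intensity in three question types (wh-questions, yes-no questions, and alternative questions), together with an auditory-perceptual evaluation. The main statistically significant variables showed deficits in the data from patients with left-hemisphere damage, and these deficits were more pronounced in wh-questions than in yes-no and alternative questions. This tendency was especially marked for variables related to pitch and speech rate. The results support the view that, in Korean speakers' use of wh-words, linguistically relevant prosodic processing, such as that involving lexical-semantic and syntactic information, is generally more dominant in the left hemisphere than in the right.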

음성합성을 위한 C-ToBI기반의 중국어 운율 경계와 F0 contour 생성 (Chinese Prosody Generation Based on C-ToBI Representation for Text-to-Speech)

  • 김승원;정옥;이근배;김병창
    • 대한음성학회지:말소리
    • /
    • No. 53
    • /
    • pp.75-92
    • /
    • 2005
  • Prosody Generation Based on C-ToBI Representation for Text-to-Speech. Seungwon Kim, Yu Zheng, Gary Geunbae Lee, Byeongchang Kim. Prosody modeling is critical in developing text-to-speech (TTS) systems, where speech synthesis is used to automatically generate natural speech. In this paper, we present a prosody generation architecture based on the Chinese Tone and Break Index (C-ToBI) representation. ToBI is a multi-tier representation system based on linguistic knowledge for transcribing events in an utterance. A TTS system that adopts ToBI as an intermediate representation is known to exhibit higher flexibility, modularity, and domain/task portability than TTS systems that generate prosody directly. However, corpus preparation is very expensive at a practical level of performance, because a ToBI-labeled corpus is constructed manually by many prosody experts and accurate statistical prosody modeling normally requires a large amount of data. This paper proposes a new method that transcribes C-ToBI labels automatically in Chinese speech. We model Chinese prosody generation as a classification problem and apply conditional Maximum Entropy (ME) classification to it. We empirically verify the usefulness of various natural language and phonology features in order to build well-integrated features for the ME framework.
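
The abstract above frames prosody generation as conditional Maximum Entropy classification. The following is a minimal sketch of that general technique only, not the authors' system: the feature names, the three break labels, and the toy training set are all hypothetical, and training uses plain gradient ascent rather than whatever optimizer the paper used.

```python
import math
from collections import defaultdict

# Hypothetical break labels and toy training data (not taken from the paper).
LABELS = ["no_break", "minor_break", "major_break"]
TRAIN = [
    (["pos=noun", "next=particle"], "no_break"),
    (["pos=particle", "next=verb"], "minor_break"),
    (["pos=verb", "punct=period"], "major_break"),
    (["pos=noun", "next=verb"], "no_break"),
    (["pos=particle", "next=noun"], "minor_break"),
    (["pos=verb", "punct=comma"], "major_break"),
]


def predict_proba(weights, feats, labels):
    """Return p(y | x) as a softmax over summed (feature, label) weights."""
    scores = {y: sum(weights[(f, y)] for f in feats) for y in labels}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


def train_maxent(samples, labels, epochs=200, lr=0.5):
    """Fit the weights by gradient ascent on the conditional log-likelihood."""
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in samples:
            probs = predict_proba(weights, feats, labels)
            for y in labels:
                # Gradient = observed feature count minus expected feature count.
                delta = (1.0 if y == gold else 0.0) - probs[y]
                for f in feats:
                    grad[(f, y)] += delta
        for key, g in grad.items():
            weights[key] += lr * g / len(samples)
    return weights


def predict(weights, feats, labels):
    """Pick the most probable label for one feature list."""
    probs = predict_proba(weights, feats, labels)
    return max(probs, key=probs.get)


model = train_maxent(TRAIN, LABELS)
print(predict(model, ["pos=verb", "punct=period"], LABELS))
```

Each (feature, label) pair gets its own weight, which is what lets heterogeneous linguistic and phonological cues be combined in one model without assuming they are independent.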


CRF를 이용한 운율경계추정 성능개선 (Improvements on Phrase Breaks Prediction Using CRF (Conditional Random Fields))

  • 김승원;이근배;김병창
    • 대한음성학회지:말소리
    • /
    • No. 57
    • /
    • pp.139-152
    • /
    • 2006
  • In this paper, we present a phrase break prediction method using CRF (Conditional Random Fields), which perform well on classification problems. In our research, the phrase break prediction problem was mapped onto a classification problem. We trained the CRF using various linguistic features extracted from POS (Part Of Speech) tags, the lexicon, word length, and the position of each word in the sentence. Combined linguistic features were used in the experiments, and we identified several linguistic features that yield good performance in phrase break prediction. The experimental results show that the proposed method improves on previous methods. Additionally, because the linguistic features are independent of each other in our research, the proposed method has higher flexibility than other methods.
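
As an illustration of how a linear-chain CRF assigns break labels to a whole sentence at once, the sketch below scores each word with feature weights and decodes the best label sequence with the Viterbi algorithm. It is an assumption-laden toy, not the paper's model: the weights are set by hand instead of being learned, and the labels `B`/`NB`, the English example sentence, and the feature names are hypothetical.

```python
LABELS = ["NB", "B"]  # hypothetical labels: no break / break after the word

# Hand-set weights for illustration only; a trained CRF would learn these.
WEIGHTS = {
    ("pos=DET", "NB"): 2.0,
    ("pos=ADJ", "NB"): 1.5,
    ("pos=PREP", "NB"): 2.0,
    ("pos=VERB", "NB"): 0.5,
    ("pos=NOUN", "B"): 1.5,
    ("final=True", "B"): 3.0,
}
TRANSITIONS = {("B", "B"): -2.0}  # discourage two breaks in a row


def word_features(words, tags, i):
    """Features for word i: POS tag, capped word length, sentence-final flag."""
    return [
        f"pos={tags[i]}",
        f"len={min(len(words[i]), 5)}",
        f"final={i == len(words) - 1}",
    ]


def emission_scores(words, tags):
    """Sum the weights of active features; features without a weight add 0."""
    return [
        {y: sum(WEIGHTS.get((f, y), 0.0) for f in word_features(words, tags, i))
         for y in LABELS}
        for i in range(len(words))
    ]


def viterbi(emissions, transitions, labels):
    """Return the highest-scoring label sequence for a linear chain."""
    if not emissions:
        return []
    best = [{y: (emissions[0][y], None) for y in labels}]
    for t in range(1, len(emissions)):
        column = {}
        for y in labels:
            prev, score = max(
                ((p, best[t - 1][p][0] + transitions.get((p, y), 0.0))
                 for p in labels),
                key=lambda item: item[1],
            )
            column[y] = (score + emissions[t][y], prev)
        best.append(column)
    # Backtrack from the best final label.
    last = max(labels, key=lambda y: best[-1][y][0])
    path = [last]
    for t in range(len(emissions) - 1, 0, -1):
        last = best[t][last][1]
        path.append(last)
    return path[::-1]


words = ["the", "old", "man", "walked", "to", "the", "store"]
tags = ["DET", "ADJ", "NOUN", "VERB", "PREP", "DET", "NOUN"]
print(viterbi(emission_scores(words, tags), TRANSITIONS, LABELS))
```

Decoding the sequence jointly is what distinguishes a CRF from classifying each word independently: the transition scores let the decision at one word depend on its neighbour's label.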


한국어 대등적 연결어미 '-고'의 함축 의미와 운율 (The Implicational Meaning and Prosody of Conjunctive Marker '-ko' in Korean)

  • 김미란
    • 음성과학
    • /
    • Vol. 8, No. 4
    • /
    • pp.289-305
    • /
    • 2001
  • The conjunctive marker '-ko' in Korean can be interpreted as meaning either conjunctive 'and' or ordering 'and then'. The interpretation of '-ko' is ambiguous in written texts but not in spoken texts. This is because the meaning of an utterance is determined by the combination of the text with its prosody. The two meanings of '-ko' can be explained by the theory of implicature introduced by Grice (1973, 1981). This paper examines the meaning of the marker '-ko' with respect to the relation between its meaning and prosody. The experimental results show that prosodic phrasing in Korean influences the interpretation of the marker '-ko'. When two constituents combined by '-ko' are realized in the same accentual phrase, the marker can be interpreted as meaning 'exactly be orderly'. This meaning can be classified as a Particularized Conversational Implicature (PCI) in Gricean theory. In the other cases of phrasing, the marker '-ko' can mean either 'conjunctive' or 'be orderly' by Generalized Conversational Implicature (GCI). The fact that phrasing determines the interpretation of the marker '-ko' can be seen as supporting the view that prosody interacts with various levels of linguistic phenomena, from phonology to pragmatics.


Utilizing Prosodic Information on the Sentence Comprehension in Children with High Functioning Autism

  • Chung, Chan-Hee;Lee, Hee-Ran;Kim, Jin-Dong
    • 대한의생명과학회지
    • /
    • Vol. 23, No. 4
    • /
    • pp.362-371
    • /
    • 2017
  • The purpose of this study is to investigate difficulties in using prosodic information to identify the meaning of ambiguous sentences in children with high functioning autism (HFA). Fifteen children with HFA and fifteen children matched for chronological age (CA) participated in this study. We compared the performance of the two groups on syntactically and affectively ambiguous sentence comprehension (SASC and AASC) tasks. In both tasks, the difference between the two groups was statistically significant in every condition, with the children with HFA performing significantly worse. In a correlation analysis of the major variables, the CA-matched children showed a correlation between prosody-only (PO) and AASC, while the children with HFA showed a correlation between PO and MO (morpheme-only). Children with HFA used grammatical morpheme information to understand general sentences. We found that the ability of children with HFA to use prosodic information is significantly lower than that of typically developing children. Considering the relevance of prosody to the linguistic, non-linguistic, and emotional aspects of communication, improving prosodic perception is thought to be a way to mediate deficits in the comprehension of ambiguous sentences in children with HFA.

인공 신경망의 한국어 운율 발생에 관한 연구 (The Study on Korean Prosody Generation using Artificial Neural Networks)

  • 민경중;임운천
    • 한국음향학회:학술대회논문집
    • /
    • 한국음향학회 2004년도 춘계학술발표대회 논문집, Vol. 23, No. 1
    • /
    • pp.337-340
    • /
    • 2004
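  • Korean text-to-speech (TTS) systems apply prosody generation algorithms to increase the naturalness of synthesized speech. Prosodic rules are applied to speech synthesis systems based on linguistic information about each language or on knowledge of prosody obtained from natural speech. However, rules obtained in this way cannot cover all the prosodic rules present in natural speech, and if an extracted rule is wrong, the naturalness and intelligibility of the synthesized speech will suffer, which can hinder the practical use of TTS. With this in mind, this paper proposes a prosody generation network based on an artificial neural network that can learn the prosody inherent in natural speech. In the training phase, the phoneme string of a Korean sentence is shifted through the input layer of the network one phoneme at a time, and the network is trained by supervised learning with target patterns to output the prosodic information of the phoneme at the center of the input layer, so that it learns the prosody inherent in natural speech. In the evaluation phase, phoneme strings of sentences were input and the estimation rate was measured, showing that the artificial neural network can learn and generate the prosody inherent in Korean sentences.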


Prosodic Contour Generation for Korean Text-To-Speech System Using Artificial Neural Networks

  • Lim, Un-Cheon
    • The Journal of the Acoustical Society of Korea
    • /
    • Vol. 28, No. 2E
    • /
    • pp.43-50
    • /
    • 2009
  • To obtain more natural synthetic speech from a Korean TTS (Text-To-Speech) system, we have to know all the possible prosodic rules of spoken Korean. These rules should be derived from linguistic and phonetic information or from real speech. In general, all of these rules should be integrated into the prosody-generation algorithm of a TTS system. However, such an algorithm cannot cover all the possible prosodic rules of a language and is not perfect, so the naturalness of synthesized speech cannot be as good as we expect. ANNs (Artificial Neural Networks) can be trained to learn the prosodic rules of spoken Korean. To train and test ANNs, we need to prepare the prosodic patterns of all the phonemic segments in a prosodic corpus. A prosodic corpus should include meaningful sentences that represent all the possible prosodic rules. Sentences in the corpus were made by picking a series of words from a list of PB (Phonetically Balanced) isolated words. These sentences were read by speakers, recorded, and collected as a speech database. By analyzing the recorded speech, we can extract the prosodic pattern of each phoneme and assign these patterns as target and test patterns for the ANNs. The ANNs can learn prosody from natural speech and, when the phoneme strings of a sentence are given as input stimuli, generate as their output response the prosodic patterns of the central phonemic segment in each phoneme string.
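
The input scheme described above, in which the network sees a string of phonemes and outputs the prosody of the central one, can be sketched as a sliding window over the phoneme sequence. This is only an illustration of the windowing and encoding step: the window width, the padding symbol, the one-hot encoding, and the romanized example phonemes are assumptions, and the network itself is not shown.

```python
PAD = "#"  # hypothetical padding symbol for positions outside the sentence


def phoneme_windows(phonemes, width=5):
    """Yield (window, central phoneme) pairs, one per phoneme in the sentence."""
    if width % 2 == 0:
        raise ValueError("width must be odd so that a central phoneme exists")
    half = width // 2
    padded = [PAD] * half + list(phonemes) + [PAD] * half
    return [
        (tuple(padded[i:i + width]), padded[i + half])
        for i in range(len(phonemes))
    ]


def one_hot(window, inventory):
    """Encode a window as concatenated one-hot vectors, one slot per phoneme."""
    index = {p: k for k, p in enumerate(inventory)}
    vector = [0] * (len(window) * len(inventory))
    for slot, phoneme in enumerate(window):
        vector[slot * len(inventory) + index[phoneme]] = 1
    return vector


# Hypothetical romanized phoneme string, used only to show the shapes involved.
phonemes = ["a", "n", "n", "y", "eo", "ng"]
inventory = sorted(set(phonemes) | {PAD})

for window, center in phoneme_windows(phonemes, width=3):
    print(window, "->", center, one_hot(window, inventory))
```

During training, each encoded window would be paired with the measured prosodic values (for example pitch or duration) of its central phoneme as the target pattern.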

Modelling Duration In Text-to-Speech Systems

  • 정현성
    • 대한음성학회지:말소리
    • /
    • No. 49
    • /
    • pp.159-174
    • /
    • 2004
  • This paper reviews and discusses the development of the durational component of prosody modelling in text-to-speech conversion of spoken English and Korean, showing the strengths and weaknesses of each approach. It also investigates the possibility of integrating linguistic feature effects into the duration modelling of TTS systems. The paper claims that current approaches to language timing synthesis still require an understanding of how segmental duration is affected by context. Three modelling approaches are discussed: sequential rule systems, Classification and Regression Tree (CART) models, and Sums-of-Products (SoP) models. The CART and SoP models perform well in predicting segment duration in English, whereas this is not the case for SoP modelling of spoken Korean.
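
For readers unfamiliar with Sums-of-Products models, the sketch below shows the general form: a segment's duration is a sum of terms, and each term is a product of parameters looked up from the segment's contextual factors. The factor names, levels, and millisecond values are invented for illustration and do not come from the paper.

```python
from math import prod

# Each term maps a factor name to a table of per-level parameters.
# All values here are hypothetical.
TERMS = [
    {   # multiplicative term: intrinsic phone duration scaled by stress
        "phone": {"a": 80.0, "i": 60.0},
        "stress": {"stressed": 1.5, "unstressed": 1.0},
    },
    {   # additive term: phrase-final lengthening in milliseconds
        "position": {"phrase_final": 40.0, "medial": 0.0},
    },
]


def sop_duration(context, terms):
    """Sum, over terms, the product of the parameters selected by the context."""
    return sum(
        prod(table[context[factor]] for factor, table in term.items())
        for term in terms
    )


context = {"phone": "a", "stress": "stressed", "position": "phrase_final"}
print(sop_duration(context, TERMS))  # 80.0 * 1.5 + 40.0
```

A purely multiplicative model is the special case with a single term, and a purely additive model is the case where every term contains exactly one factor.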
