• Title/Abstract/Keywords: prosodic boundaries

PROSODY IN SPEECH TECHNOLOGY - National project and some of our related works -

  • Hirose Keikichi
    • Acoustical Society of Korea: Conference Proceedings / Proceedings of the Acoustical Society of Korea 2002 Summer Conference, Vol. 21, No. 1 / pp.15-18 / 2002
  • Prosodic features of speech are known to play an important role in the transmission of linguistic information in human conversation. Their role in the transmission of para- and non-linguistic information is even greater. Despite their importance in human conversation, engineering research has focused mainly on segmental features and much less on prosodic features. With the aim of promoting research on prosody, a research project, 'Prosody and Speech Processing', is now under way. The paper first gives a rough sketch of the project. It then introduces several prosody-related research efforts under way in our laboratory. They include corpus-based fundamental frequency contour generation, speech rate control for dialogue-like speech synthesis, analysis of prosodic features of emotional speech, reply speech generation in spoken dialogue systems, and language modeling with prosodic boundaries.

가변 Break를 이용한 코퍼스 기반 일본어 음성 합성기의 성능 향상 방법 (A Performance Improvement Method using Variable Break in Corpus Based Japanese Text-to-Speech System)

  • 나덕수;민소연;이종석;배명진
    • The Journal of the Acoustical Society of Korea / Vol. 28, No. 2 / pp.155-163 / 2009
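  • In a text-to-speech system, generating prosodic information from input text requires three basic modules: prosodic phrase boundary prediction, phoneme duration modeling, and fundamental frequency contour generation. The break index (BI) marks prosodic phrase boundaries in the synthesizer, and it must be predicted accurately to produce natural synthetic speech. However, a BI is often determined arbitrarily by the meaning of the sentence or the speaker's reading style, which makes accurate prediction very difficult. In Japanese synthesizers in particular, accentual phrase boundaries (APB) and major phrase boundaries (MPB) are hard to predict accurately. This paper therefore proposes a method that compensates for APB and MPB prediction errors. BIs are classified into fixed breaks (FB) and variable breaks (VB), and unit selection is performed accordingly. Normally a BI does not change once it has been generated, so an incorrectly generated BI prevents the synthesizer from producing optimal speech. With a VB, unit selection uses both the generated BI and BIs similar to it, so the BI of the synthesized speech may differ from the generated BI. A CART (Classification and Regression Tree) predicts whether each BI corresponding to an APB or MPB is a VB or an FB; for VBs, multiple prosodic models for fundamental frequency and phoneme duration are generated and used in unit selection. In a MOS test, the original speech scored 4.99, the proposed method 4.25, and the conventional method 4.01, showing that the proposed method improves the naturalness of the synthetic speech.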
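
  The fixed/variable break idea in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the `SIMILAR_BI` table (treating APB and MPB as similar to each other), the `Break` type, and the `(unit_id, break_index)` candidate format are all hypothetical, since the abstract does not say how "similar" BIs are chosen.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical similarity table: which break indices may stand in for one
# another. The abstract does not define "similar" BIs; this is an assumption.
SIMILAR_BI = {"APB": {"MPB"}, "MPB": {"APB"}}


@dataclass(frozen=True)
class Break:
    index: str      # predicted break index label, e.g. "APB" or "MPB"
    variable: bool  # True = variable break (VB), False = fixed break (FB)


def allowed_break_indices(brk: Break) -> set[str]:
    """Return the break indices a candidate unit may carry at this boundary."""
    allowed = {brk.index}
    if brk.variable:
        # A VB also admits break indices similar to the predicted one.
        allowed |= SIMILAR_BI.get(brk.index, set())
    return allowed


def filter_candidates(brk: Break, candidates: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Keep the (unit_id, break_index) candidates compatible with the boundary."""
    allowed = allowed_break_indices(brk)
    return [cand for cand in candidates if cand[1] in allowed]
```

  With an FB, only units whose break index matches the prediction survive the filter; with a VB, the candidate pool widens, which is what lets the synthesized BI differ from the generated one.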

Accentual Effects on Lateralization

  • Kim, Soo-Jung
    • Speech Sciences / Vol. 8, No. 3 / pp.15-30 / 2001
  • Lateralization, the change of a coronal nasal into a lateral in an l-n sequence, has been considered to be prosodically unrestricted, e.g. an utterance-span rule, in Korean (Han 1993, Park 1990). However, aerodynamic data of the nasal do not corroborate these claims. In the paper, I look at how lateralization can best be characterized. Specifically, I ask whether its domain is best treated via a syntax-based (Nespor & Vogel 1986, Selkirk 1984) or an intonation-based approach (Pierrehumbert 1980, Jun 1993) to prosodic structure. Based on nasal airflow data as a means of monitoring velum activity coincident with a nasal stop in an l-n sequence, combined with pitch tracks to define an accentual phrase, I argue that lateralization is neither an utterance-span rule nor a syntax-based rule. Sentences recorded with a potential environment for lateralization show that lateralization occurs within an accentual phrase but is blocked across accentual phrase boundaries. When intonation-based and syntax-based models disagree about phrase boundaries, lateralization only occurs where the intonation-based model predicts it will. This indicates that lateralization is best defined as an accentual phenomenon, being sensitive to the accentual phrase. This finding lends further support to an intonation-based model for Korean prosodic structure (Jun 1993).

모음 상승 현상의 음성적 고찰: 어미 {-고}의 실현을 중심으로 (A Phonetic Study of Vowel Raising: A Closer Look at the Realization of the Suffix {-go})

  • 이향원;신지영
    • Korean Linguistics / Vol. 81 / pp.267-297 / 2018
  • Vowel raising in Korean has been primarily treated as a phonological, categorical change. This study aims to show how the Korean connective suffix {-go} is realized in various environments, and propose a principle of vowel raising based on both acoustic and perceptual data. To that end, we used a corpus of spoken Korean to analyze the types of syntactic constructions, the realization of prosodic boundaries (IP and PP), and the types of boundary tone associated with {-go}. It was found that the vowel tends to be raised most frequently in utterance-final position, while in utterance-medial position the vowel was raised more when the syntactic and prosodic distance between {-go} and the following constituent was smaller. The results for boundary tone also showed a correlation between vowel raising and the discourse function of the boundary tone. In conclusion, we propose that vowel raising is not simply an optional phenomenon, but rather a type of phonetic reduction related to the comprehension of the following constituent.

효율적인 기계학습 자질 선별을 통한 한국어 운율구 경계 예측 모델의 성능 향상 (Performance Improvement of a Korean Prosodic Phrase Boundary Prediction Model using Efficient Feature Selection)

  • 김민호;권혁철
    • Journal of KIISE: Software and Applications / Vol. 37, No. 11 / pp.837-844 / 2010
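  • Prosodic phrase boundary prediction is one of the key natural language processing technologies for realizing conversational speech synthesis. To achieve natural Korean prosodic phrase boundary prediction, this paper proposes new learning features to replace the existing ones. These new features reflect the various factors that influence where prosodic phrase boundaries occur in real language use better than the existing features do. In particular, features extracted using hand-crafted prosodic phrase boundary prediction rules contribute substantially to the gain in accuracy. Based on the proposed features, a prosodic phrase boundary prediction model was built using CRFs (Conditional Random Fields). The model achieved 86.63% accuracy in predicting three-level prosodic phrase boundaries (strong boundary, weak boundary, and phrase-internal non-boundary) and 81.14% accuracy in predicting six-level prosodic phrase boundaries (rising/falling strong boundaries, rising/falling/level weak boundaries, and phrase-internal non-boundary).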
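
  To make the two label schemes in the abstract above concrete, the following minimal sketch shows how six-level boundary labels relate to three-level ones and how a per-boundary accuracy figure is computed. The label strings and the idea of collapsing six-level output into three levels are assumptions made for illustration; the abstract does not say whether the two settings were trained separately or derived from each other, and no CRF is trained here.

```python
# Hypothetical label names for the six-level scheme described in the abstract,
# mapped onto the three-level scheme (strong / weak / no boundary).
SIX_TO_THREE = {
    "strong_rising": "strong",
    "strong_falling": "strong",
    "weak_rising": "weak",
    "weak_falling": "weak",
    "weak_level": "weak",
    "none": "none",
}


def collapse(labels):
    """Map a sequence of six-level labels onto the three-level scheme."""
    return [SIX_TO_THREE[label] for label in labels]


def accuracy(gold, predicted):
    """Fraction of boundary positions whose predicted label matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty sequence")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)
```

  Because several six-level labels fold into one three-level label, a prediction that gets the boundary strength right but the tone wrong counts as an error only in the finer scheme, which is consistent with the six-level accuracy being the lower of the two reported figures.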

Prosodic Features at "Sentence Boundaries" in Oral Presentations

  • Umesaki, Atsuko-Furuta
    • Journal of the Phonetic Society of Korea: Malsori / No. 41 / pp.83-96 / 2001
  • It is generally said that falling intonation is used at the end of a declarative sentence. However, this is not the case with all stretches of spontaneous speech which are marked in transcription as sentences. The present paper examines intonation patterns appearing at the end of declarative sentences in oral presentations, and discusses instances where falling intonation does not appear. The texts used for analysis are eight oral presentations collected at international conferences in the field of physics. Quantitative and qualitative analyses are carried out. Three major factors related to discourse structure have been found for non-occurrence of falling intonation at sentence boundaries.

Prosodic Features at "Sentence Boundaries" in Oral Presentations

  • Umesaki, Atsuko-Furuta
    • Phonetic Society of Korea: Conference Proceedings / Proceedings of the Phonetic Society of Korea July 2000 Conference / pp.149-164 / 2000
  • It is generally said that falling intonation is used at the end of a declarative sentence. However, this is not the case with all stretches of spontaneous speech which are marked in transcription as sentences. The present paper examines intonation patterns appearing at the end of declarative sentences in oral presentations, and discusses instances where falling intonation does not appear. The texts used for analysis are eight oral presentations collected at international conferences in the field of physics. Quantitative and qualitative analyses are carried out. Three major factors related to discourse structure have been found for nonoccurrence of falling intonation at sentence boundaries.

국어 파열연자음 유성음화에 관한 음향음성학적 고찰 -운율구조와 관련하여- (An acoustic study of Korean lenis stop voicing - in relation to prosodic structure -)

  • 김효숙;김선주;김선미
    • Journal of the Phonetic Society of Korea: Malsori / No. 39 / pp.15-24 / 2000
  • This study aims to reexamine Korean Lenis Stop Voicing (henceforth, LSV) and to specify in phonetic terms the conditions under which it occurs. LSV optionally occurs within certain prosodic domains, which are called 'Malthomak' (Lee, 1996), 'phonological phrase' (Kang, 1992), or 'accentual phrase' (Jun, 1993). On the basis of Jun's phrasing, this study focuses on the more specific phonetic conditions of LSV in accentual-phrase-medial position, sub-classifying voicing as complete or partial. The results show that whether the stops become completely or partially voiced is determined by various phonetic environments, such as adjacent segments and following intonational phrase boundaries. It is shown that the conditions of LSV should be described in terms of more detailed phonetic environments and that these could be used in predicting the class of voicing.

한국어 음성인식 시스템 향상을 위한 동음이철 단위의 중의성 유형 분류 (Ambiguity Types of the Homonymic & Heterographic Units for Improving Korean Voice Recognition System - a Preliminary Research)

  • 윤애선;강미영
    • Speech Sciences / Vol. 15, No. 4 / pp.67-81 / 2008
  • The accuracy of P2G (Phoneme-to-Grapheme) conversion is one of the important factors determining the quality of unlimited voice recognition (VR) systems. However, few studies have been conducted to reduce the ambiguities of a phoneme string that can be segmented into a variety of different linguistic units (i.e. morphemes, words, eo-jeols) and thus be transformed into more than one grapheme string. This paper is a preliminary study for building a large knowledge base of such homonymic & heterographic units (HHUs), which will provide unlimited Korean VR systems with more accurate P2G information. It analyzes the two main factors generating HHUs: (1) boundary determination of the prosodic unit; (2) its segmentation into linguistic units. The linguistic characteristics determining variable boundaries of a prosodic unit are investigated, and the ambiguity types of HHUs are classified in accordance with their morphological and syntactic structures as well as with the phonological rules governing them.

An Acoustic Analysis of the Aspiration Merger in Korean

  • Mi, Jang
    • Phonetics and Speech Sciences / Vol. 3, No. 1 / pp.67-75 / 2011
  • In Korean, 'Aspiration Merger' is the process by which a heteromorphemic sequence of a lenis stop and /h/ becomes a single aspirated stop word-medially. However, the contrast between lenis stop-plus-/h/ and an underlying aspirated stop is maintained when the sequence spans a Phonological Phrase boundary. By varying the position in the prosodic domain between APP (Across Phonological Phrase) and PPM (Phonological Phrase Medial) positions, the phonetic properties of the two categories are compared. In the results for noise duration and change of intensity, lenis stop-plus-/h/ shows a large difference between the APP and PPM positions. The noise duration comparison shows that the two categories are completely neutralized into an aspirated stop in the PPM position and that this complete neutralization is sensitive to prosodic phrasing.
