• Title/Abstract/Keywords: Utterance

382 search results (processing time: 0.025 s)

The Study of Breath Competence Depending on Utterance Condition by Healthy Speakers: a Preliminary Study

  • 이인애;이혜은;황영진
    • 말소리와 음성과학 / Vol. 4, No. 2 / pp. 115-120 / 2012
  • This study compared breath competence under three utterance conditions: reading a passage aloud, speaking spontaneously, and singing. We tested 15 healthy females (mean age $24{\pm}4.4$) and measured breath competence with an objective aero-mechanical instrument, the PAS (Phonatory Aerodynamic System, model 6600, KAY Electronics, Inc.). Breathing cycles of inspiration and expiration were measured in terms of breath group number, breath group duration, and the ratio of inspiration to expiration. Breath group number and breath group duration showed no significant difference across conditions; the only significant difference was in the ratio of inspiration to expiration. Singing produced the most varied ratio, followed by reading aloud and then spontaneous speech. Average frequency and maximum intensity also varied with utterance condition. These results suggest that breath competence and phonation competence are closely interrelated.

Initial-syllable lengthening of an utterance-internal phrase in Korean

  • Yun, Ilsung
    • 말소리와 음성과학 / Vol. 6, No. 2 / pp. 141-151 / 2014
  • This study reports anti-hierarchical initial-syllable lengthening of an utterance-internal phrase in Korean. That is, a phrase-initial syllable (e.g., /a/ of "apa-do" or /ma/ of "mapa-do") starting with a voiced phoneme (i.e., a vowel or a voiced consonant) is significantly longer when it is preceded by another phrase without a pause than when it begins an utterance or follows an utterance-internal pause. The phenomenon was examined with regard to two other factors: (1) tempo and (2) tenseness of the consonant (/p, $p^{\prime}$, $p^h$/) following the target syllable /a/. First, the effect of tempo on initial lengthening was not statistically significant. A tendency was nevertheless observed: the slower the tempo, the greater the lengthening, whereas the faster the tempo, the higher the ratio (%) of lengthening. Second, contrary to our expectations, initial-syllable lengthening was even greater before the tense stops /$p^{\prime}$, $p^h$/ than before the lax stop /p/ regardless of tempo, and the difference was especially marked in the ratio (%), which means that initial lengthening is free of the pre-consonantal vowel shortening effect. Whereas final-syllable lengthening is a pre-boundary marker, initial-syllable lengthening is regarded as a post-boundary marker of a phrase.
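
The two tendencies reported for tempo (more lengthening at slow tempo, but a higher ratio at fast tempo) are compatible because one is measured in milliseconds and the other relative to the baseline duration. The sketch below illustrates this with made-up durations; the numbers and the function name `lengthening` are hypothetical and are not taken from the paper.

```python
def lengthening(baseline_ms, lengthened_ms):
    """Return (absolute lengthening in ms, lengthening as % of the baseline)."""
    diff = lengthened_ms - baseline_ms
    return diff, 100 * diff / baseline_ms

# Hypothetical syllable durations in ms: (utterance-initial, after another phrase).
slow = lengthening(100, 120)   # slow tempo: long baseline
fast = lengthening(50, 65)     # fast tempo: short baseline

print("slow tempo:", slow)
print("fast tempo:", fast)
```

With these illustrative values the slow tempo shows the larger absolute difference while the fast tempo shows the larger percentage, which is the pattern the abstract describes.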

Effective Korean Speech-act Classification Using the Classification Priority Application and a Post-correction Rules

  • 송남훈;배경만;고영중
    • 정보과학회 논문지 / Vol. 43, No. 1 / pp. 80-86 / 2016
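  • A speech act is the linguistic action that a speaker intends and that is contained in an utterance. In a dialogue system, it is important to classify the appropriate speech act for each input utterance. Previous work on speech-act classification has mostly used rule-based and machine-learning-based methods. This paper proposes a speech-act classification method that combines two representative machine-learning methods, the support vector machine (SVM) and transformation-based learning (TBL). The classification bias of the SVM is addressed by adjusting classification priorities according to the number of training utterances per speech act, and classification performance is improved by applying correction rules, generated through transformation-based learning, to classification results that are likely to be wrong. The method, which uses priority changes that reflect differences in the number of training utterances per speech act together with post-correction rules, was evaluated experimentally and outperformed existing speech-act classification that does not consider the priority of speech acts with few training utterances.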
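
The two-stage idea in this abstract (re-prioritize classes with few training utterances, then post-correct doubtful decisions with rules) can be sketched roughly as follows. This is only an illustration under stated assumptions: the bias formula, the margin test for "likely wrong", the rule format keyed on the previous speech act, and all names (`adjust_scores`, `classify`, `weight`, `margin_threshold`) are hypothetical and are not the paper's actual method. The per-class scores stand in for SVM decision values.

```python
def adjust_scores(scores, train_counts, weight=0.5):
    """Boost classes with few training utterances (assumed bias formula)."""
    total = sum(train_counts.values())
    return {act: s + weight * (1 - train_counts[act] / total)
            for act, s in scores.items()}

def classify(scores, train_counts, rules, prev_act, margin_threshold=0.2):
    """Pick the top class after adjustment, then post-correct if the margin is small."""
    adjusted = adjust_scores(scores, train_counts)
    ranked = sorted(adjusted, key=adjusted.get, reverse=True)
    best, second = ranked[0], ranked[1]
    # A small margin between the top two classes is treated as "likely wrong".
    if adjusted[best] - adjusted[second] < margin_threshold:
        # Hypothetical correction rule: (previous act, predicted act) -> corrected act.
        best = rules.get((prev_act, best), best)
    return best

rules = {("ask-ref", "inform"): "response"}

# A rare class overtakes a frequent one after priority adjustment.
rare_wins = classify({"inform": 0.50, "request": 0.45},
                     {"inform": 900, "request": 100}, rules, prev_act="opening")

# A low-margin decision is rewritten by a correction rule.
corrected = classify({"inform": 0.60, "response": 0.55},
                     {"inform": 500, "response": 500}, rules, prev_act="ask-ref")
print(rare_wins, corrected)
```

In a real system the correction rules would be learned by TBL from training errors rather than written by hand as they are here.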

The Effect of Preceding Utterance on the User Experience in the Voice Agent Interactions - Focus on the Conversational Types in the Smart Home Context -

  • 강예슬;나경화;최준호
    • 문화기술의 융합 / Vol. 7, No. 1 / pp. 620-631 / 2021
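  • This study examined how a voice agent's preceding-utterance style affects user experience depending on the type of conversation topic in a smart home environment. Based on two conversation types, task-oriented and relationship-oriented, four scenarios were created by dividing the smart speaker's utterance style into preceding utterance and following utterance. In an online experiment, 62 participants were divided into two groups by utterance style and went through the two conversation-type scenarios, after which the user experience factors of likability, psychological reactance, and perceived intelligence were measured. The results showed a main effect on likability for task-oriented conversation among the conversation types, and a main effect on psychological reactance for the preceding utterance among the utterance styles. The preceding-utterance style increased likability and perceived intelligence in task-oriented conversation.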

On-Line Blind Channel Normalization for Noise-Robust Speech Recognition

  • Jung, Ho-Young
    • IEIE Transactions on Smart Processing and Computing / Vol. 1, No. 3 / pp. 143-151 / 2012
  • A new data-driven method is proposed for designing a blind modulation frequency filter that suppresses slow-varying noise components. The method is based on temporal local decorrelation of the feature vector sequence and is applied on an utterance-by-utterance basis. Whereas conventional modulation frequency filtering approaches use the same form regardless of the task and environment conditions, the proposed method provides an adaptive modulation frequency filter for each utterance and outperforms the conventional methods. In addition, the method ultimately performs channel normalization in the feature domain when applied to log-spectral parameters. Performance was evaluated in speaker-independent isolated-word recognition experiments under additive noise. The proposed method achieved outstanding improvement in speech recognition in environments with significant noise and was also effective across a range of feature representations.
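
For context on why per-utterance processing of log-spectral features normalizes the channel: a fixed channel becomes a constant additive offset in the log-spectral domain, so removing the slowest-varying component of each utterance removes it. The sketch below shows the simplest form of this, per-utterance mean subtraction. It is a well-known baseline and not the adaptive filter proposed in the paper; the function name and the toy feature values are assumptions.

```python
def normalize_utterance(frames):
    """Subtract the per-utterance mean from every feature dimension."""
    n = len(frames)
    dims = len(frames[0])
    means = [sum(frame[d] for frame in frames) / n for d in range(dims)]
    return [[frame[d] - means[d] for d in range(dims)] for frame in frames]

# Toy log-spectral features: 3 frames x 2 dimensions (hypothetical values).
clean = [[1.0, 2.0], [3.0, 4.0], [5.0, 0.0]]
# A fixed channel adds a constant offset to each dimension in the log domain.
channel = [0.5, -1.0]
distorted = [[x + c for x, c in zip(frame, channel)] for frame in clean]

norm_clean = normalize_utterance(clean)
norm_distorted = normalize_utterance(distorted)
print(norm_clean)
print(norm_distorted)
```

After normalization the clean and channel-distorted utterances yield the same features, which is the effect channel normalization aims for.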

Adaptive Channel Normalization Based on Infomax Algorithm for Robust Speech Recognition

  • Jung, Ho-Young
    • ETRI Journal / Vol. 29, No. 3 / pp. 300-304 / 2007
  • This paper proposes a new data-driven method for high-pass approaches, which suppress slow-varying noise components. Conventional high-pass approaches are based on the idea of decorrelating the feature vector sequence and aim for adaptability to various conditions. The proposed method is based on temporal local decorrelation using information-maximization theory. It is performed on an utterance-by-utterance basis, which provides an adaptive channel normalization filter for each condition. The performance of the proposed method is evaluated in isolated-word recognition experiments with channel distortion. Experimental results show that the proposed method yields outstanding improvement in channel-distorted speech recognition.
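
As a point of reference for what a high-pass approach does to a feature trajectory, the sketch below applies a first-order DC-blocking filter along time to one feature dimension. The coefficient `alpha` is fixed in advance here, whereas the paper's contribution is to adapt the filter per utterance; this code does not implement the Infomax-based method, and the function name and parameter are assumptions.

```python
def highpass(track, alpha=0.95):
    """First-order DC-blocking filter: y[t] = x[t] - x[t-1] + alpha * y[t-1]."""
    out = []
    prev_x = track[0] if track else 0.0  # start from the first sample to avoid a jump
    prev_y = 0.0
    for x in track:
        y = x - prev_x + alpha * prev_y
        out.append(y)
        prev_x, prev_y = x, y
    return out

# A constant (channel-like) component is removed entirely.
constant = highpass([5.0, 5.0, 5.0, 5.0])
# A step change passes through and then decays at a rate set by alpha.
step = highpass([0.0, 0.0, 1.0, 1.0, 1.0], alpha=0.5)
print(constant, step)
```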

Statistical Approaches to Convert Pitch Contour Based on Korean Prosodic Phrases

  • Lee, Ki-Young
    • The Journal of the Acoustical Society of Korea / Vol. 23, No. 1E / pp. 10-15 / 2004
  • In speech conversion from a source speaker to a target speaker, it is important that the pitch contour of the source speaker's utterance be converted into that of the target speaker, because the pitch contour of an utterance plays an important role in expressing the speaker's individuality and the meaning of the utterance. This paper describes statistical algorithms for pitch contour conversion in Korean. Pitch contour conversion is investigated at two levels of prosodic phrases: the intonational phrase and the accentual phrase. The basic algorithm is Gaussian normalization [7] within an intonational phrase. The first presented algorithm combines it with a declination line of the pitch contour in an intonational phrase. The second applies Gaussian normalization within accentual phrases to compensate for local pitch variations. Experimental results show that Gaussian normalization within accentual phrases is significantly more accurate than the other two algorithms, which operate on intonational phrases.
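
Gaussian normalization, as the term is commonly used in pitch conversion, maps each source F0 value so that the converted contour takes on the target speaker's mean and standard deviation: f' = mu_t + (sigma_t / sigma_s) * (f - mu_s). The sketch below applies this per phrase, so passing accentual phrases instead of a whole intonational phrase gives the local variant. It is a minimal illustration, assuming F0 in Hz on a linear scale and precomputed target statistics; the function names and example values are hypothetical and the declination-line variant is not shown.

```python
from statistics import mean, pstdev

def gaussian_normalize(source_f0, target_mean, target_std):
    """Map source F0 values so the result has the target mean and std."""
    src_mean = mean(source_f0)
    src_std = pstdev(source_f0)
    if src_std == 0:
        # A flat contour carries no variation to rescale.
        return [target_mean for _ in source_f0]
    scale = target_std / src_std
    return [target_mean + scale * (f0 - src_mean) for f0 in source_f0]

def convert_by_phrase(phrases, target_stats):
    """Apply the mapping separately to each phrase with its own target stats."""
    return [gaussian_normalize(p, m, s) for p, (m, s) in zip(phrases, target_stats)]

# Hypothetical source F0 (Hz) for two accentual phrases and target (mean, std).
source_phrases = [[100.0, 120.0, 140.0], [110.0, 90.0]]
target_stats = [(200.0, 10.0), (180.0, 5.0)]
converted = convert_by_phrase(source_phrases, target_stats)
print(converted)
```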

The Restructuring in English Utterance and Words and a Use of Textsetting

  • 김기섭
    • 대한음성학회지:말소리 / No. 40 / pp. 29-49 / 2000
  • This study has two aims: one is to clarify the restructuring of English in utterances, and the other is to use textsetting to help learners become accustomed to English rhythm and pronunciation. Clitics prove to play a crucial role in English restructuring and are found to attach either to the preceding or to the following head or host, forming, respectively, an en-cliticized rhythm (trochee) and a pro-cliticized rhythm (iambus). En-cliticization proves to be preferred to pro-cliticization in most types of English rhythm. Accordingly, restructuring turns out to occur at all levels of the Prosodic Hierarchy. That is, syllables, words, and clitic groups are restructured in poetry as well as in song lyrics, which indicates the necessity of restructuring throughout the levels of the Prosodic Hierarchy from the syllable to the utterance. The study suggests that rhythmic textsetting is a good way for learners of English to become accustomed to stress-timed rhythm as well as to such changes in pronunciation as reductions, deletions, resolutions, contractions, and rhythms in English.

Prosodic Characteristics of Politeness in Korean

  • 고현주;김상훈;김종진
    • 대한음성학회지:말소리 / No. 45 / pp. 15-22 / 2003
  • This is a preliminary study aimed at improving the naturalness of dialog TTS systems. As major characteristics of politeness in Korean, temporal features (total duration of utterances, speech rate, and duration of utterance-final syllables) and F0 features (mean F0, boundary tone pattern, and F0 range) were examined through acoustic analysis of recordings of semantically neutral sentences spoken by ten professional voice actors under two utterance types, normal and polite. The results show that the temporal characteristics differed significantly according to utterance type, but the F0 characteristics did not.

SWAPPING NATIVE AND NON-NATIVE SPEAKERS' PROSODY USING THE PSOLA ALGORITHM

  • Yoon Kyu-Chul
    • 대한음성학회:학술대회논문집 / 대한음성학회 2006 Spring Conference Proceedings / pp. 77-81 / 2006
  • This paper presents a technique for imposing the prosodic features of a native speaker's utterance onto the same sentence uttered by a non-native speaker. Three acoustic aspects of the prosodic features were considered: the fundamental frequency (F0) contour, segmental durations, and the intensity contour. The F0 contour and the segmental durations of the native speaker's utterance were imposed on the non-native speaker's utterance using the PSOLA (pitch-synchronous overlap and add) algorithm [1] implemented in Praat [2]. The intensity contour was also transferred in Praat. The technique for transferring one or more of these prosodic features is described in detail, and its implications for language education are discussed.
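
The duration part of such a transfer reduces to a per-segment scaling factor: each non-native segment is stretched or compressed by the ratio of the native duration to its own duration, and a PSOLA-based tool then resynthesizes the signal with those factors. The sketch below computes only the factors and the resulting segment boundaries. It assumes the two utterances are already segmented and aligned one-to-one; the function names and durations are hypothetical, and the resynthesis step in Praat is not shown.

```python
def duration_ratios(native_durs, nonnative_durs):
    """Scaling factor per segment: native duration / non-native duration."""
    if len(native_durs) != len(nonnative_durs):
        raise ValueError("segments must be aligned one-to-one")
    return [nat / non for nat, non in zip(native_durs, nonnative_durs)]

def stretched_boundaries(nonnative_durs, ratios):
    """Cumulative segment end times after applying the scaling factors."""
    t, ends = 0.0, []
    for dur, ratio in zip(nonnative_durs, ratios):
        t += dur * ratio
        ends.append(t)
    return ends

# Hypothetical aligned segment durations in seconds.
native = [0.10, 0.30]
nonnative = [0.20, 0.15]
ratios = duration_ratios(native, nonnative)
ends = stretched_boundaries(nonnative, ratios)
print(ratios, ends)
```

After scaling, the non-native segment boundaries coincide with the native ones, which is what allows the native F0 contour to be laid over the same time axis.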
