• Title/Summary/Keyword: 발화길이 (utterance length)

Production and perception of Korean word-initial stops from a sound change perspective (음 변화 관점에서 바라본 한국어 어두 폐쇄음의 발화 및 지각)

  • Kim, Jin-Woo
    • Phonetics and Speech Sciences / v.13 no.3 / pp.39-51 / 2021
  • Based on spontaneous speech data collected in 2020, this study examined the production and perception of Korean lenis, aspirated, and fortis stops. Unlike the controlled experiments of previous studies, lenis and aspirated stops of males in their 30s were not distinguished by voice onset time (VOT) in spontaneous speech. Perceptual experiments were conducted on young females, the leaders of language change. F0 was found to serve as the primary cue for the perception of lenis stops, and then VOT distinguished the aspirated and fortis stops. The fact that the sounds were always perceived as lenis stops when F0 was low, irrespective of whether VOT was short or long, showed that F0 plays an absolute role in the perception of lenis stops. However, in some cases the aspirated and lenis stops were distinguished only by VOT, which does not happen in production. In terms of sound change, disagreement between production and perception systems occurs when sound change is in progress. In particular, when production change precedes perception change, it indicates that the sound change is in its latter stages. Young females still maintain the previous system in perception because the distinction of lenis and aspirated stops by VOT was valid in their parents' generation. In other words, VOT is still used for perception to communicate with other groups.

Analysis and Use of Intonation Features for Emotional States (감정 상태에 따른 발화문의 억양 특성 분석 및 활용)

  • Lee, Ho-Joon;Park, Jong C.
    • Annual Conference on Human and Language Technology / 2008.10a / pp.145-150 / 2008
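  • This paper uses as an emotional speech corpus a total of 240 sentences, produced by 6 speakers uttering 8 sentences in 5 emotional states, to analyze the intonation patterns characteristic of each emotional state and to discuss how to apply these patterns to a speech synthesis system. To this end, we analyze the characteristic intonation patterns of each emotional state with a focus on intonational phrase length, phrase-final boundary tones, and declination, and show the process of applying intonation features that can distinguish joy, sadness, anger, and fear to a speech synthesis system. Through this study, we confirmed the rise in intonation that appears in the emotion of anger and found characteristic intonation patterns for each emotion.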

The Trade-off Effects between MLU and Fluency in Normal Preschool-age Children (발화길이와 유창성 간의 교환효과: 언어 발달시기에 있는 36-48 개월의 정상아동을 대상으로)

  • Lee, Su-Jin;Hwang, Mi-Na
    • Speech Sciences / v.8 no.4 / pp.157-168 / 2001
  • The limited capacity model has been used to explain linguistic interactions and trade-offs that occur in children's speech. The purpose of the present investigation is to explore the interrelationship of MLU (as an index of syntactic development) and fluency in the spontaneous speech of normal children. Spontaneous speech samples from twenty normal children (ten girls and ten boys, aged 36-48 months) were obtained during free-play interactions with their mothers or other adults. The results indicated that the MLU of disfluent utterances was significantly longer than that of fluent utterances. Also, disfluencies occurred more frequently in longer utterances than in shorter utterances. In addition, utterances in which disfluencies occurred more than 2 times were longer than those in which disfluencies occurred once. These results imply that an increase in MLU appears to affect not only the occurrence of disfluent utterances, but also the number of disfluencies within the utterances. In other words, these findings show that there are trade-off effects between MLU and fluency. This is discussed within a limited capacity framework.
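
The comparison of MLU between fluent and disfluent utterances can be illustrated with a minimal sketch. The sample utterances, the disfluency counts, and the choice to count length in whitespace-separated tokens are all assumptions made for illustration; the abstract does not state the counting unit or give any data.

```python
from statistics import mean

# Hypothetical sample: each utterance is (tokens, number_of_disfluencies).
# Tokens stand in for whatever unit MLU is counted in (an assumption here).
utterances = [
    (["mommy", "go"], 0),
    (["I", "want", "that", "big", "car"], 1),
    (["no"], 0),
    (["the", "dog", "is", "eating", "my", "cookie", "now"], 2),
]


def mean_length(utts):
    """Mean length of utterance over a list of (tokens, disfluencies) pairs."""
    return mean(len(tokens) for tokens, _ in utts)


# Split the sample by whether any disfluency occurred in the utterance.
fluent = [u for u in utterances if u[1] == 0]
disfluent = [u for u in utterances if u[1] > 0]

mlu_fluent = mean_length(fluent)
mlu_disfluent = mean_length(disfluent)
print(f"MLU fluent: {mlu_fluent}, MLU disfluent: {mlu_disfluent}")
```

In this toy sample the disfluent utterances are longer on average, which is the direction of the effect the abstract reports; a real analysis would also need the significance test, which this sketch omits.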

Age classification of emergency callers based on behavioral speech utterance characteristics (발화행태 특징을 활용한 응급상황 신고자 연령분류)

  • Son, Guiyoung;Kwon, Soonil;Baik, Sungwook
    • The Journal of Korean Institute of Next Generation Computing / v.13 no.6 / pp.96-105 / 2017
  • In this paper, we investigated speaker age classification by analyzing voice calls to an emergency center. We classified callers as adult or elderly using behavioral speech utterances and an SVM (Support Vector Machine), a machine learning classifier. Through analysis of the emergency center call data, we selected two behavioral speech utterances: silent pause and turn-taking latency. First, the criteria for age classification were selected through analysis of the behavioral speech utterances in the emergency calls, and they were found to be statistically significant (p < 0.05). We then analyzed 200 datasets (adult: 100, elderly: 100) by 5-fold cross-validation using the SVM classifier. As a result, we achieved 70% accuracy using the two behavioral speech utterances, which is higher than the accuracy obtained with one. These results suggest a new method of age classification based on behavioral speech utterances; in further work, classification will combine acoustic information (MFCC) with new behavioral speech utterances from real voice data. Furthermore, the method will contribute to the development of an emergency situation judgment system related to age classification.

A study on the clinical utility of voiced sentences in acoustic analysis for pathological voice evaluation (장애음성의 음향학적 분석에서 유성음 문장의 임상적 유용성에 관한 연구)

  • Ji-sung Kim
    • The Journal of the Acoustical Society of Korea / v.42 no.4 / pp.298-303 / 2023
  • This study aimed to investigate the clinical utility of voiced sentence tasks for voice evaluation. To this end, we analyzed the correlation between perturbation-based acoustic measurements [jitter percent (jitter), shimmer percent (shimmer), Noise to Harmonic Ratio (NHR)] using sustained vowel phonation, and cepstrum-based acoustic measurements [Cepstral Peak Prominence (CPP), Low/High spectral ratio (L/H ratio)] using voiced sentences. Analysis of data collected from 65 patients with voice disorders showed a significant correlation between the CPP and jitter (r = -.624, p = .000), shimmer (r = -.530, p = .000), and NHR (r = -.469, p = .000). This suggests that cepstrum measurement of voiced sentences can be used as an alternative that addresses the limitations of pathological voice analysis, such as cases where perturbation-based acoustic measurement is not possible and results that differ according to the analysis section.

Pause Predictor for Korean Text-to-Speech conversion (한국어 음성합성기용 끊어읽기 추정기)

  • 이정철;김상훈;성굉모
    • The Journal of the Acoustical Society of Korea / v.17 no.5 / pp.51-56 / 1998
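  • The location and duration of pauses within a sentence are among the main prosodic parameters that determine the naturalness of synthesized speech. This study proposes a method for estimating the location and duration of pauses within a sentence in order to improve the naturalness of the speech generated by a Korean speech synthesizer. We first examined the factors that cause pauses in actual utterances. We then secured a large amount of data by annotating text with four levels of pauses in accordance with these factors, and compared the results of neural network (NN) training on this data with the performance of an HMM estimator. Results so far show that the NN predicts well for the no-pause and long-pause cases, whereas the HMM performs better for short and medium pauses. In overall performance the HMM is superior, yielding stable results with estimation errors of 10-25% depending on the pause type and showing potential for use in TTS.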

Segmental and prosodic environments and vowel devoicing in Korean (분절음적, 운율적 환경과 무성모음의 실현)

  • Shin Ji-Young;Chae Eun-Ae
    • Proceedings of the Acoustical Society of Korea Conference / spring / pp.309-312 / 2002
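  • To determine in which segmental and prosodic environments vowel devoicing mainly occurs, we conducted an experiment with four variables: the segmental environment of the preceding consonant, the segmental environment of the following consonant, the number of syllables in the accentual phrase, and the position in the prosodic structure. The analysis was carried out by measuring the duration of the target vowel in 1,140 tokens produced by 10 speakers (5 male, 5 female). The results showed that vowel devoicing occurs readily when the preceding consonant is [+aspirated] and [+continuant] and when the following consonant is [-continuant] and [aspirated]. An increase in the number of syllables did not appear to have a large effect, and vowels generally tended to be shorter or devoiced when located at the beginning of a word in the second accentual phrase.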

Improvements on Speech Recognition for Fast Speech (고속 발화음에 대한 음성 인식 향상)

  • Lee Ki-Seung
    • The Journal of the Acoustical Society of Korea / v.25 no.2 / pp.88-95 / 2006
  • In this paper, a method for improving the performance of automatic speech recognition (ASR) systems for conversational speech is proposed, which mainly focuses on increasing robustness against rapidly spoken utterances. The proposed method does not require an additional speech recognition task to represent speaking rate quantitatively. The energy distribution in specific bands is employed to detect vowel regions, and the number of vowels per second is then computed as the speaking rate. To improve performance for fast speech, previous methods expand a sequence of feature vectors by a given scaling factor, which is computed as the ratio between the standard phoneme duration and the measured one. In the method proposed herein, however, utterances are classified by their speaking rates, and the scaling factor is determined individually for each class. In this procedure, a maximum likelihood criterion is employed. ASR experiments on 10-digit mobile phone numbers confirmed that the overall error rate was reduced by 17.8% when the proposed method was employed.
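
The two quantities named in the abstract, speaking rate as vowels per second and a scaling factor as the ratio of standard to measured phoneme duration, can be sketched as follows. This is only an illustration under assumptions: the function names are hypothetical, the expansion uses plain linear interpolation (the abstract does not say how feature vectors are expanded), and the per-class maximum likelihood selection of the scaling factor is not reproduced.

```python
def speaking_rate(vowel_count: int, duration_sec: float) -> float:
    """Speaking rate as the number of detected vowels per second."""
    if duration_sec <= 0:
        raise ValueError("duration_sec must be positive")
    return vowel_count / duration_sec


def scaling_factor(standard_duration: float, measured_duration: float) -> float:
    """Ratio of the standard phoneme duration to the measured one."""
    if measured_duration <= 0:
        raise ValueError("measured_duration must be positive")
    return standard_duration / measured_duration


def expand_features(frames, factor):
    """Stretch a sequence of feature vectors by `factor`.

    Linear interpolation between neighbouring frames is an assumption
    made for this sketch, not a detail taken from the abstract.
    """
    n_in = len(frames)
    if n_in == 0:
        return []
    n_out = max(1, round(n_in * factor))
    if n_in == 1 or n_out == 1:
        return [list(frames[0]) for _ in range(n_out)]
    out = []
    for i in range(n_out):
        pos = i * (n_in - 1) / (n_out - 1)  # fractional index into the input
        lo = int(pos)
        hi = min(lo + 1, n_in - 1)
        w = pos - lo
        out.append([(1 - w) * a + w * b for a, b in zip(frames[lo], frames[hi])])
    return out


# Hypothetical numbers: 12 vowels detected in a 2-second utterance.
rate = speaking_rate(12, 2.0)
factor = scaling_factor(standard_duration=0.08, measured_duration=0.05)
stretched = expand_features([[0.0], [1.0], [2.0]], factor)
print(rate, factor, len(stretched))
```

A factor above 1 lengthens the feature sequence, which is the direction needed for fast speech whose phonemes are shorter than the standard duration.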

Attentive Knowledge Selection Model for Knowledge-Grounded Multi-turn Dialogue System (지식 기반 다중 대화 시스템을 위한 주의 집중 지식 선택 모델)

  • Lee, Dohaeng;Jang, Youngjin;Huang, Jin-Xia;Kwon, Oh-Woog;Kim, Harksoo
    • Annual Conference on Human and Language Technology / 2021.10a / pp.361-364 / 2021
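  • A knowledge-grounded multi-turn dialogue system is a dialogue system that generates responses containing knowledge. The system consists of a knowledge selection task, which finds the knowledge needed for response generation, and a response generation task, which generates context-aware responses based on the knowledge found. This paper proposes solving the knowledge selection task by applying it to a machine reading comprehension framework. Knowledge selection is the task of finding knowledge in a knowledge document based on a dialogue history consisting of multiple utterances. We propose using a dialogue history modeling layer to find the dialogue history relevant to the last utterance, and an attentive pooling layer to effectively extract long spans of knowledge. In experiments on the knowledge selection task of Doc2dial, a goal-oriented document-grounded dialogue dataset, the model achieved 76.52% in F1 score and 66.21% in EM score, outperforming the comparison models.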
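
For readers unfamiliar with the two reported metrics, the sketch below shows the token-overlap F1 and exact match (EM) scores as they are commonly computed for span extraction in machine reading comprehension. It assumes whitespace tokenization and no text normalization; the abstract does not describe the exact evaluation script, so the details here are illustrative only.

```python
from collections import Counter


def exact_match(pred: str, gold: str) -> int:
    """1 if the predicted span equals the gold span token for token, else 0."""
    return int(pred.split() == gold.split())


def token_f1(pred: str, gold: str) -> float:
    """Harmonic mean of token-level precision and recall between two spans."""
    pred_tokens = pred.split()
    gold_tokens = gold.split()
    if not pred_tokens or not gold_tokens:
        # Two empty spans count as a match; one empty span is a miss.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often
    # as it appears in both spans.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


# Hypothetical predicted and gold knowledge spans.
print(exact_match("renew your license online", "renew your license online"))
print(token_f1("renew your license online today", "renew your license online"))
```

EM rewards only a perfect span, while F1 gives partial credit for overlapping tokens, which is why the reported F1 is higher than the reported EM.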

Nonfluency Characteristics of Children in Multicultural Families (다문화가정 아동의 비유창성 특성)

  • Shin, Myung-Sun
    • The Journal of the Korea Contents Association / v.11 no.3 / pp.254-261 / 2011
  • The purpose of the present study was to investigate the characteristics of disfluency in 3- to 5-year-old children from multicultural families (MFC). Twenty-four children participated in this study: 12 MFC and 12 Korean monolingual children (KMC) of the same chronological age and language age. The experimental tasks consisted of story retelling tasks (SRT) and picture description tasks (PDT). In all tasks, the total disfluency scores of the MFC were significantly higher than those of the KMC. In all tasks, the frequency of abnormal disfluency was also significantly higher in the MFC than in the KMC, and the speech rates of the MFC were significantly lower than those of the KMC. The disfluency observed in the MFC indicates that language ability influences their disfluencies and that fluency support for MFC is an important factor in general language support.