• 제목/요약/키워드: phonetic data

검색결과 200건 처리시간 0.02초

제주방언 이중모음의 음향분석 (The Acoustic Analysis of the Diphthongs in Jeju Dialect)

  • 김원보
    • 음성과학
    • /
    • 제12권2호
    • /
    • pp.29-41
    • /
    • 2005
  • This paper is to show the diphthong system of Jeju dialect speakers in their 70s or more on the basis of the acoustic analysis of their phonetic data. It is revealed through the analysis of their phonetic data that they clearly distinguish such diphthongs as [we], [w$\epsilon$], [yc] and [yo]. However, this paper shows that they are phonetically insensitive to the separation between [ye] and [y$\epsilon$] and they seldom make a precise pronunciation of diphthong [iy], which male speakers tend to pronounce to be [i] and female speakers to be [i].

  • PDF

음소 질의어 집합 생성 알고리즘 (Phonetic Question Set Generation Algorithm)

  • 김성아;육동석;권오일
    • 한국음향학회지
    • /
    • 제23권2호
    • /
    • pp.173-179
    • /
    • 2004
  • 음소 질의어 집합은 문맥 속에서 비슷한 조음 효과를 보이는 음소들을 분류해 놓은 것으로서, 음성 인식 시스템 학습 시 결정트리를 기반으로 HMM (hidden Markov model)의 상태들을 클러스터링할 때 사용된다. 현재까지의 음소 질의어 집합은 대부분 음성학자나 언어학자들에 의해 수작업으로 제시되어 왔는데, 이러한 지식 기반음소 질의어들은 언어 또는 유사음소 단위 (PLU: phone like unit)에 종속될 뿐 아니라 생성된 클러스터 내의 동질성을 저하시킬 수 있다는 단점이 있다. 본 논문에서는 이와 같은 문제점들을 해결하기 위해 음성 데이터를 사용하여 측정한 음소들 사이의 유사도를 기반으로 언어나 유사음소단위에 상관없이 자동으로 음소 질의어 집합을 생성하는 알고리즘을 제안한다. 실험결과, 제안한 방법으로 생성된 음소 질의어들을 사용한 인식기의 에러율이 약 14.3%감소하여 데이터 기반의 음소 질의어 집합이 상태 클러스터링에 효율적임을 관측하였다.

K-ToBI (Korean ToBI) Labelling Conventions (Version 3.0)

  • Juo, Suo-Ah
    • 음성과학
    • /
    • 제7권1호
    • /
    • pp.143-169
    • /
    • 2000
  • This chapter presents an overview of Korean intonational structure and proposes a revised version of K -ToBI (Korean TOnes and Break Indices), a prosodic transcription convention for Seoul Korean. In the new version of K-ToBI, a tone tier is separated into two tiers: a phonological tone tier and a phonetic tone tier. A phonological tone tier labels tones marking the prosodic structure of an utterance, and a phonetic tone tier labels individual tones of an AP and an IP conforming to the surface pitch contour. Labelling surface tonal patterns will provide us data to test the underlying tonal patterns and to build phonetic implementation rules.

  • PDF

Acoustic correlates of prosodic prominence in conversational speech of American English, as perceived by ordinary listeners

  • Mo, Yoon-Sook
    • 말소리와 음성과학
    • /
    • 제3권3호
    • /
    • pp.19-26
    • /
    • 2011
  • Previous laboratory studies have shown that prosodic structures are encoded in the modulations of phonetic patterns of speech including suprasegmental as well as segmental features. Drawing on a prosodically annotated large-scale speech data from the Buckeye corpus of conversational speech of American English, the current study first evaluated the reliability of prosody annotation by a large number of ordinary listeners and later examined whether and how prosodic prominence influences the phonetic realization of multiple acoustic parameters in everyday conversational speech. The results showed that all the measures of acoustic parameters including pitch, loudness, duration, and spectral balance are increased when heard as prominent. These findings suggest that prosodic prominence enhances the phonetic characteristics of the acoustic parameters. The results also showed that the degree of phonetic enhancement vary depending on the types of the acoustic parameters. With respect to the formant structure, the findings from the present study more consistently support Sonority Expansion Hypothesis than Hyperarticulation Hypothesis, showing that the lexically stressed vowels are hyperarticulated only when hyperarticulation does not interfere with sonority expansion. Taken all into account, the present study showed that prosodic prominence modulates the phonetic realization of the acoustic parameters to the direction of the phonetic strengthening in everyday conversational speech and ordinary listeners are attentive to such phonetic variation associated with prosody in speech perception. However, the present study also showed that in everyday conversational speech there is no single dominant acoustic measure signaling prosodic prominence and listeners must attend to such small acoustic variation or integrate acoustic information from multiple acoustic parameters in prosody perception.

  • PDF

음성학적으로 본 사상체질 (A Phonetic Study of 'Sasang Constitution')

  • 문승재;탁지현;황혜정
    • 대한음성학회지:말소리
    • /
    • 제55권
    • /
    • pp.1-14
    • /
    • 2005
  • Sasang Constitution, one branch of oriental medicine, claims that people can be classified into four different 'constitutions:' Taeyang, Taeum, Soyang, and Soeum. This study investigates whether the classification of the constitutions could be accurately made solely based on people's voice by analyzing the data from 46 different voices whose constitutions were already determined. Seven source-related parameters and four filter-related parameters were phonetically analyzed and the GMM(Gaussian mixture model) was tried on the data. Both the results from phonetic analyses and GMM showed that all the parameters (except one) failed to distinguish the constitutions of the people successfully. And even the single exception, B2 (the bandwidth of the second formant) did not provide us with sufficient reasons to be the source of distinction. This result seems to suggest one of the two conclusions: either the Sasang Constitutions cannot be substantiated with phonetic characteristics of peoples' voices with reliable accuracy, or we need to find yet some other parameters which haven't been conventionally proposed.

  • PDF

음성학적으로 본 사상체질 (A Phonetic Study of 'Sasang Constitution')

  • 문승재;탁지현;황혜정
    • 대한음성학회:학술대회논문집
    • /
    • 대한음성학회 2005년도 춘계 학술대회 발표논문집
    • /
    • pp.63-66
    • /
    • 2005
  • Sasang Constitution, one branch of oriental medicine, claims that people can be classified into four different 'constitutions:' Taeyang, Taeum, Soyang, and Soeum. This study investigates whether the classification of the 'constitutions' could be accurately made solely based on people's voice by analyzing the data from 46 different voices whose constitutions were already determined. Seven source-related parameters and four filter-related parameters were phonetically analyzed and the GMM(gaussian mixture model) was tried with the data. Both the results from phonetic analyses and GMM showed that all the parameters (except one)failed to distinguish the constitutions of the people successfully. And even the single exception, the bandwidth of F2, did not provide us with sufficient reasons to be the source of distinction. This result seems to suggest one of the two conclusions: either the Sasang Constitutions cannot be substantiated with phonetic characteristics of peoples' voices with reliable accuracy, or we need to find yet some other parameters which haven't been conventionally proposed.

  • PDF

한국어 발음 교육을 위한 음성 DB 구축 방안 (Designing of Speech DB for Korean Pronunciation Education)

  • 정명숙
    • 대한음성학회지:말소리
    • /
    • 제47호
    • /
    • pp.51-72
    • /
    • 2003
  • The purpose of this paper is to design Speech Database for Korean pronunciation education. For this purpose, I investigated types of speech errors of Korean-learners, made texts for recording, which involves all types of speech errors, and showed how to gather speech data and how to tag their informations. It's natural that speech data should include Korean-learners' speech and Korean people's speech, because Speech DB that I try to develop is for teaching Korean pronunciation to foreigners. So this DB should have informations about speakers and phonetic informations, which are about phonetic value of segments and intonation of sentences. The intonation of sentence varies with the type of sentence, the structure of prosodic units, the length of a prosodic unit and so on. For this reason, Speech DB must involve tags about these informations.

  • PDF

분산 음성인식 시스템의 성능향상을 위한 음소 빈도 비율에 기반한 VQ 코드북 설계 (A VQ Codebook Design Based on Phonetic Distribution for Distributed Speech Recognition)

  • 오유리;윤재삼;이길호;김홍국;류창선;구명완
    • 대한음성학회:학술대회논문집
    • /
    • 대한음성학회 2006년도 춘계 학술대회 발표논문집
    • /
    • pp.37-40
    • /
    • 2006
  • In this paper, we propose a VQ codebook design of speech recognition feature parameters in order to improve the performance of a distributed speech recognition system. For the context-dependent HMMs, a VQ codebook should be correlated with phonetic distributions in the training data for HMMs. Thus, we focus on a selection method of training data based on phonetic distribution instead of using all the training data for an efficient VQ codebook design. From the speech recognition experiments using the Aurora 4 database, the distributed speech recognition system employing a VQ codebook designed by the proposed method reduced the word error rate (WER) by 10% when compared with that using a VQ codebook trained with the whole training data.

  • PDF

한국어 음성인식을 위한 음성학 기반의 유사음소단위 집합 설계 (A Phonetics Based Design of PLU Sets for Korean Speech Recognition)

  • 홍혜진;김선희;정민화
    • 대한음성학회지:말소리
    • /
    • 제65호
    • /
    • pp.105-124
    • /
    • 2008
  • This paper presents the effects of different phone-like-unit (PLU) sets in order to propose an optimal PLU set for the performance improvement of Korean automatic speech recognition (ASR) systems. The examination of 9 currently used PLU sets indicates that most of them include a selection of allophones without any sufficient phonetic base. In this paper, a total of 34 PLU sets are designed based on Korean phonetic characteristics arid the effects of each PLU set are evaluated through experiments. The results show that the accuracy rate of each phone is influenced by different phonetic constraint(s) which determine(s) the PLU sets, and that an optimal PLU set can be anticipated through the phonetic analysis of the given speech data.

  • PDF

LPC 벡터 양자화를 이용한 가변률 CELP 음성코딩에 관한 연구 (Variable Rate CELP Coding with Phonetic Segmentation using LPC Vector Quantization)

  • 정영호
    • 한국음향학회:학술대회논문집
    • /
    • 한국음향학회 1994년도 제11회 음성통신 및 신호처리 워크샵 논문집 (SCAS 11권 1호)
    • /
    • pp.205-209
    • /
    • 1994
  • This paper presents a variable rate speech coding method with phonetic segmentation, called for PSVXC. Multiple access techniques that require efficient encoding of speech to achieve capacity improvements are currently emerging in the cellular telephone system. The variable rate speech coder have the reduced average data rate required to transmit conversational speech. Each frame of active speech is classified into one of four phonetic classes. A distinct coding configuration and bit-rate is applied to each category. And also a split vector quantization is used to accurately quantize the LPC information using LSP parameters.

  • PDF