• Title/Abstract/Keywords: phonetic data

200 search results (processing time: 0.019 seconds)

A Longitudinal Case Study of Late Babble and Early Speech in Southern Mandarin

  • Chen, Xiaoxiang
    • 비교문화연구
    • /
    • Vol. 20
    • /
    • pp.5-27
    • /
    • 2010
  • This paper studies the relation between canonical/variegated babble (CB/VB) and early speech in an infant acquiring Mandarin Chinese from 9 to 17 months. The infant was audio- and video-taped in her home almost every week. The data analyzed here come from 1,621 utterances extracted from 23 sessions ranging from 30 minutes to one hour, from age 00:09;07 to 01:05;27. The data were digitized, and segments from the 23 sessions were transcribed in narrow IPA and coded for analysis. Babble was coded from age 00:09;07 to 01:00;00, and words were coded from 01:00;00 to 01:05;27; proto-words appeared at 11 months, and some babble was still present after 01:10;00. A total of 3,821 segments were counted in CB/VB utterances, plus the segments found in 899 word tokens. The transcription was completed and checked by the author and rechecked by two other researchers who majored in Chinese phonetics to ensure reliability; agreement reached 95.65%. Mandarin Chinese is phonetically very rich in consonants, especially affricates: it has aspirated and unaspirated stops in labial, alveolar, and velar places of articulation; affricates and fricatives in alveolar, retroflex, and palatal places; /f/; labial, alveolar, and velar nasals; a lateral; [h]; and labiovelar and palatal glides. In the child's pre-speech phonetic repertoire, 7 different consonants and 10 vowels were transcribed at 00:09;07. By 00:10;16, the number of phones had more than doubled (17 consonants, 25 vowels), but the rate of increase slowed after 11 months of age. The phones from babbling remained active throughout the child's early and subsequent speech. The rank order of occurrence of the major class types for both CB and early speech was: stops, approximants, nasals, affricates, fricatives, and lateral. As expected, unaspirated stops outnumbered aspirated stops, and front stops and nasals were more frequent than back sounds in both types of utterances. The fact that affricates outnumbered fricatives in the child's late babble indicates the pre-speech influence of the ambient language. The analysis of the data also showed that: 1) the phonetic characteristics of CB/VB and early meaningful speech are extremely similar, and these similarities prove that the two are deeply related; 2) the infant demonstrated similar preferences for certain types of sounds in the two stages; and 3) the infant's babbling was patterned at the segmental level, and this regularity was similarly evident in the child's early speech. The three types (coronal plus front vowel, labial plus central vowel, and dorsal plus back vowel) exhibited much overlap in the phonetic forms of CB/VB and early speech. The child's CB/VB at this stage thus already shared the basic architecture, composition, and representation of early speech. The evidence of similarity between CB/VB and early speech leaves no doubt that phones present in CB/VB are indeed precursors to early speech.

음향 및 음소 정보를 이용한 연속제의 자동 음소 분할에 대한 연구 (A Study on Automatic Phoneme Segmentation of Continuous Speech Using Acoustic and Phonetic Information)

  • 박은영;김상훈;정재호
    • 한국음향학회지
    • /
    • Vol. 19, No. 1
    • /
    • pp.4-10
    • /
    • 2000
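  • This paper studies postprocessing that compensates for the phoneme boundary errors of an automatic phoneme segmenter. We propose a postprocessor that narrows the error range of automatically segmented boundaries, and we describe its usefulness for building large prosodic databases for synthesis in which the automatic segmentation results can be used directly as synthesis units. The proposed postprocessor trains a multi-layer perceptron (MLP) on feature vectors from manually corrected data and then extracts new phoneme boundaries by applying the MLP-based postprocessing to the automatic segmentation results. First, the feature vector set was selected to reflect phonetic knowledge as fully as possible. Second, an MLP, which performs very well at nonlinear pattern separation, is used to extract the boundaries, because an MLP can quickly apply the widely varying phonetic characteristics found across phoneme boundaries. Finally, the proposed postprocessing algorithm, which applies feature vectors according to phonological environment, compensates for the boundary errors of automatic segmentation. On a synthesis database of sentence-level utterances, the segmentation results corrected by the postprocessor performed about 19.9% better than the segmentation rate of the spoken language translation system, and the absolute error (|Hand label position - Auto label position|) decreased by about 28.6%.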

  • PDF
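
The absolute error named in the abstract above, |Hand label position - Auto label position|, can be illustrated with a minimal Python sketch. The function name, the millisecond unit, and all boundary values are hypothetical and are not taken from the paper; the sketch only shows how a mean absolute boundary error and its relative reduction after postprocessing could be computed.

```python
def mean_absolute_boundary_error(hand, auto):
    """Mean of |hand label position - auto label position| over paired boundaries."""
    if len(hand) != len(auto):
        raise ValueError("boundary lists must have the same length")
    if not hand:
        raise ValueError("need at least one boundary")
    return sum(abs(h - a) for h, a in zip(hand, auto)) / len(hand)


# Hypothetical boundary positions in milliseconds (not data from the paper).
hand_labels = [120.0, 245.0, 390.0, 510.0]   # manually corrected boundaries
auto_labels = [128.0, 240.0, 402.0, 507.0]   # raw automatic segmentation
refined_labels = [123.0, 243.0, 395.0, 509.0]  # after an assumed postprocessor

error_before = mean_absolute_boundary_error(hand_labels, auto_labels)
error_after = mean_absolute_boundary_error(hand_labels, refined_labels)

# Relative reduction of the absolute error, in percent.
reduction_percent = (error_before - error_after) / error_before * 100
print(error_before, error_after, round(reduction_percent, 1))
```

With real data, `hand_labels` would hold the manually corrected boundaries and the other two lists the segmenter output before and after postprocessing.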

실제 발화 상황에서 프랑스어와 한국어의 음절구조 비교 (A Comparative Study of Syllable Structures between French and Korean in Real Utterances)

  • 이은영
    • 음성과학
    • /
    • Vol. 10, No. 2
    • /
    • pp.237-248
    • /
    • 2003
  • This paper compares the syllable structures of French and Korean by analyzing speech data from the two languages recorded in real utterances. The syllable structure of French is taken from F. Wioland's research data. For Korean, the primary data are drawn from a 30-minute radio interview in which two male TV anchors in their early 60s talk to each other. The secondary data were collected by having the primary data replicated by two male announcers in their early 20s who broadcast at the university radio station of KAIST. Based on the French and Korean data, this paper provides the statistical frequency of each type of syllable structure in each language through acoustic analysis of spectrograms and gives a phonetic account of the characteristics of each syllable type in the two languages. The paper also discusses the distributional conditions under which each syllable structure occurs in the speech context.

  • PDF

산업용 음성 DB를 위한 XML 기반 메타데이터 (XML Based Meta-data Specification for Industrial Speech Databases)

  • 주영희;홍기형
    • 대한음성학회지:말소리
    • /
    • Vol. 55
    • /
    • pp.77-91
    • /
    • 2005
  • In this paper, we propose an XML-based meta-data specification for industrial speech databases. Building speech databases is very time-consuming and expensive. Recently, with government support, a huge amount of speech corpus data has been collected into speech databases. However, the formats and meta-data of these databases differ depending on the institution that constructed them. To improve the reusability and portability of speech databases, a standard representation scheme should be adopted by all institutions that construct them. ETRI proposed an XML-based annotation scheme [5] for speech databases, but the scheme has an overly simple, flat modeling structure and may cause duplicated information. To overcome these disadvantages of the previous scheme, we first define the speech database more formally and then identify the objects that appear in speech databases. We then design the data model for speech databases in an object-oriented way. Based on this data model, we develop the meta-data specification for industrial speech databases.

  • PDF

사상체질 진단요소들 간의 일치도 분석연구 (The research on agreement statistics analysis between factors of diagnosis)

  • 장은수;김호석;이시우;김종열
    • 한국한의학연구원논문집
    • /
    • Vol. 12, No. 2 (Serial No. 17)
    • /
    • pp.103-113
    • /
    • 2006
  • Objectives: We examined how closely the results agree among three instruments commonly used in clinics to diagnose Sasang constitution: QSCCII, PSSC (Phonetic System for Sasang Constitution)-2004, and body measurement. Methods: We diagnosed Sasang constitution with QSCCII, PSSC-2004, and body measurement as diagnostic factors, used the Kappa coefficient to estimate the agreement between diagnostic factors, and analyzed the data with SPSS 12.0K. Results and conclusions: 1. The frequency order of the diagnosed constitutions differed by instrument: Soeum-in was highest and Taeum-in lowest with QSCCII; Soeum-in was highest and Soyang-in lowest with PSSC; and Taeum-in was highest and Soyang-in lowest with body measurement. From this we inferred that the Sasang constitution diagnoses contain errors. 2. Among 443 subjects, 156 (35.3%) received the same diagnosis from all three factors. Agreement among the diagnostic factors is therefore very low, and research on the connections among them is necessary, especially for Soyang-in. 3. Overall, applying these factors to Sasang constitution diagnosis is not robust; in particular, the agreement statistics between two kinds of diagnosis were only $0.358 \sim 0.380$. However, the agreement statistics among three kinds of diagnosis rose to $0.526 \sim 0.592$, which suggests that agreement may increase as more diagnostic factors are used. To evaluate the accuracy of Sasang constitution diagnosis, it is necessary to collect data from subjects who have been diagnosed on the basis of evidence such as herbal medicine, disease, and observation of ordinary symptoms. Using these data, the accuracy of each separate diagnostic method should be evaluated and the methods connected.

  • PDF
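
As an illustration of the Kappa coefficient mentioned in the abstract above, the following minimal Python sketch computes Cohen's kappa for two diagnostic methods. The abstract does not state which kappa variant was used, so two-rater Cohen's kappa is an assumption here, and the ten diagnoses are invented for illustration only; they are not data from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: chance-corrected agreement between two label sequences."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of subjects given the same label by both methods.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement by chance, from each method's label frequencies.
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical diagnoses from two methods for ten subjects (illustrative only).
method_1 = ["Soeum", "Soeum", "Taeum", "Soyang", "Taeum",
            "Soeum", "Soyang", "Taeum", "Soeum", "Taeum"]
method_2 = ["Soeum", "Taeum", "Taeum", "Soyang", "Soeum",
            "Soeum", "Taeum", "Taeum", "Soyang", "Taeum"]

kappa = cohens_kappa(method_1, method_2)
print(round(kappa, 3))
```

In this toy example the two methods agree on 6 of 10 subjects, but chance alone would produce an agreement of 0.36, so kappa is lower than the raw agreement rate.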

20세기 초 한국어 단모음의 음향음성학적 연구 (A Phonetic Investigation of Korean Monophthongs in the Early Twentieth Century)

  • 한정임;김주연
    • 말소리와 음성과학
    • /
    • Vol. 6, No. 1
    • /
    • pp.31-38
    • /
    • 2014
  • The current study presents an instrumental phonetic analysis of Korean monophthongs in early twentieth-century Seoul Korean, based on audio recordings of the elementary school textbook Botonghakgyo Joseoneodokbon (Korean Reading Textbook for Elementary School). The data examined in this study were a list of Korean monosyllables (Banjeol) and a short passage, recorded by one 41-year-old male speaker in 1935, as well as a short passage recorded by one 11-year-old male speaker in 1935. The Korean monophthongs were examined through acoustic analysis of the vowel formants (F1, F2) and compared to those recorded by 18 male speakers of Seoul Korean in 2013. The results show that in 1935, 1) /e/ and /ɛ/ were clearly separated in the vowel space; 2) /o/ and /u/ were also clearly separated without any overlapping values; and 3) some tokens of /y/ and /ø/ were produced as monophthongs, not as diphthongs. Based on these results, we can observe the historical change of the Korean vowels over 80 to 90 years: 1) /e/ and /ɛ/ have merged; and 2) /o/ has been raised and now overlaps with /u/.

한국어 장애음 지각에서의 VOT와 F0의 상관 관계 (The Correlation of VOT and f0 In the Perception of Korean Obstruents)

  • 김미담
    • 대한음성학회:학술대회논문집
    • /
    • 대한음성학회 October 2003 Conference Proceedings
    • /
    • pp.163-167
    • /
    • 2003
  • The present thesis examines the correlation of VOT and F0 in the three-way distinction of Korean obstruents through production and perception tests. In the production test, one female native speaker of Seoul Korean (the author) recorded 15 repetitions of a monosyllabic word list including /ka, kha, k*a, pa, pha, p*a, ta, tha, t*a, ca, cha, c*a/ in random order, and the VOT and the F0 of the following vowels were measured. The result was significant for the three-way distinction, with a strong correlation between VOT and F0, and no overlap among the domains was observed in the VOT-F0 plot. For the perception test, I manipulated the data recorded in the production test by raising or lowering their F0 values. In all, 14 subjects (seven males and seven females) participated in the identification test. The results were as follows: the fortis stimuli were not influenced by F0 changes, and the VOT and F0 values at the lenis-aspirated boundary were negatively correlated. From these results I concluded the following: 1) VOT and F0 can distinguish the three domains of Korean obstruents without overlap; 2) fortis perception does not need F0 as an acoustic cue; and 3) VOT and F0 are in a phonetic trading relation [2] in the distinction between lenis and aspirated obstruents.

  • PDF

음성 신호를 사용한 감정인식의 특징 파라메터 비교 (Comparison of feature parameters for emotion recognition using speech signal)

  • 김원구
    • 대한전자공학회논문지SP
    • /
    • Vol. 40, No. 5
    • /
    • pp.371-377
    • /
    • 2003
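  • This paper compares feature parameters for recognizing human emotion from speech signals. Using a Korean speech database classified by emotional state, we used parameters that represent statistical information, such as the mean, standard deviation, and maximum of the pitch and energy of the speech signal, together with MFCC parameters that represent phonetic characteristics. To evaluate the parameters, we implemented a sentence- and speaker-independent emotion recognition system and ran recognition experiments. In the evaluation experiments, pitch, energy, and their respective derivatives were used as prosodic features, and MFCC and its derivatives were used as features representing phonetic characteristics. In experiments with a speaker- and sentence-independent recognition system based on vector quantization, MFCC with delta MFCC outperformed the method using pitch and energy.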

Phoneme distribution and syllable structure of entry words in the CMU English Pronouncing Dictionary

  • Yang, Byunggon
    • 말소리와 음성과학
    • /
    • Vol. 8, No. 2
    • /
    • pp.11-16
    • /
    • 2016
  • This study explores the phoneme distribution and syllable structure of entry words in the CMU English Pronouncing Dictionary to provide phoneticians and linguists with fundamental phonetic data on English word components. Entry words in the dictionary file were syllabified using an R script and examined to obtain the following results: First, English words preferred consonants to vowels in their word components. In addition, monophthongs occurred much more frequently than diphthongs. When all consonants were categorized by manner and place, the distribution indicated the frequency order of stops, fricatives, and nasals according to manner and that of alveolars, bilabials and velars according to place. These results were comparable to the results obtained from the Buckeye Corpus (Yang, 2012). Second, from the analysis of syllable structure, two-syllable words were most favored, followed by three- and one-syllable words. Of the words in the dictionary, 92.7% consisted of one, two or three syllables. This result may be related to human memory or decoding time. Third, the English words tended to exhibit discord between onset and coda consonants and between adjacent vowels. Dissimilarity between the last onset and the first coda was found in 93.3% of the syllables, while 91.6% of the adjacent vowels were different. From the results above, the author concludes that an analysis of the phonetic symbols in a dictionary may lead to a deeper understanding of English word structures and components.
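
The kind of counting described in the abstract above can be sketched in Python. This is not the author's R script and does not perform full syllabification; it only relies on the CMU dictionary convention that vowel phones carry a stress digit (0, 1, or 2), so the number of such phones gives the syllable count of an entry. The function names are hypothetical and the four sample lines are illustrative entries written in the dictionary's line format.

```python
from collections import Counter


def parse_entry(line):
    """Split a dictionary line into the entry word and its list of phones."""
    word, *phones = line.split()
    return word, phones


def is_vowel(phone):
    """Vowel phones in the CMU dictionary end in a stress digit (0, 1, or 2)."""
    return phone[-1].isdigit()


def syllable_count(phones):
    """Each vowel phone is the nucleus of one syllable."""
    return sum(is_vowel(p) for p in phones)


# Illustrative entries in the CMU dictionary line format.
sample_lines = [
    "CAT  K AE1 T",
    "WATER  W AO1 T ER0",
    "BANANA  B AH0 N AE1 N AH0",
    "STRENGTHS  S T R EH1 NG K TH S",
]

syllables_per_word = {}
vowel_total = 0
consonant_total = 0
for line in sample_lines:
    word, phones = parse_entry(line)
    syllables_per_word[word] = syllable_count(phones)
    vowel_total += sum(is_vowel(p) for p in phones)
    consonant_total += sum(not is_vowel(p) for p in phones)

# Distribution of words by number of syllables.
syllable_distribution = Counter(syllables_per_word.values())
print(syllables_per_word, consonant_total, vowel_total)
```

Run over the full dictionary file, the same loop would yield the consonant-to-vowel totals and the distribution of words by syllable count that the abstract reports.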

CROSS-LANGUAGE SPEECH PERCEPTION BY KOREAN AND POLISH.

  • Paradowska, Anna
    • 대한음성학회:학술대회논문집
    • /
    • 대한음성학회 July 2000 Conference Proceedings
    • /
    • pp.178-178
    • /
    • 2000
  • This paper is concerned with adults' foreign language acquisition and investigates the relationship between the phonetic system of the mother tongue (L1) and the perception of the foreign language (L2), here Polish and Korean. The questions used to define this relationship are 1) how Poles perceive Korean vowels, 2) how Koreans perceive Polish vowels, and 3) how Koreans perceive Korean vowels pronounced by Poles. To identify L2 vowels, listeners try to fit them into the categories of their own language (L1). On the one hand, vowels that are the same in both languages, and those articulated where no other vowel is articulated, have the best rate of recognition. For example, /i/ is a front close vowel in both languages, and neither language has any other front close vowel. Therefore, the vowels /i/ (and /a/) have the best rate of recognition in all three experiments. On the other hand, vowels that are unfamiliar to the listeners do not seem to have the worst rate of recognition. The vowels with the worst rate of recognition are those that are similar to, but not quite the same as, those of L1. This research proves that "equivalence classification prevents L2 learners from producing similar L2 phones, but not new L2 phones, authentically" (Flege, 1987). Polish speakers can pronounce unfamiliar L2 vowels "more authentically" than those similar to L1 vowels. However, the difference is not significant, and this subject requires further research (different data, more informants).

  • PDF