• Title/Summary/Keyword: Formant Analysis


Change Measurement of Voice Analysis Parameter by an Increase of Intake the Caffeine (카페인 섭취량 증가에 따른 음성 분석 요소의 변화 측정)

  • Seo, Kyoung-Won;Jang, Yong-Jo;Kang, Deok-Hyun;Bae, Jung-Su;Yean, Yong-Hem;Lim, Soon-Yong;Min, Ji-Seon;Kim, Bong-Hyun;Ka, Min-Kyoung;Cho, Dong-Uk
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2010.11a
    • /
    • pp.656-659
    • /
    • 2010
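  • In modern society, leisure time has increased, people take frequent coffee breaks, and coffee consumption keeps rising as a result. Along with coffee consumption, the intake of caffeine, the main component of coffee, is also increasing. In this paper, we therefore examined the composition and effects of caffeine, the degree to which it affects the human body, and the degree to which it is involved in the voice, and analyzed the voice parameters that caffeine actually affects. To this end, we used the speech analysis program Praat and applied the amount of change in the vocal cords and the formants, the resonances within the body, as experimental parameters. We verified the usefulness of the data, analyzed how to resolve the problems encountered, and demonstrated the applicability of the proposed method by experiment.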

The Effects of Secondhand Smoking on Articulators Based on Phonetic Analysis (음성학적 분석 기반의 간접흡연이 조음기관에 미치는 영향)

  • Seo, Kyoung-Won;Kang, Deok-Hyun;Bae, Jung-Su;Jang, Yong-Jo;Yean, Yong-Hem;Lim, Soon-Yong;Min, Ji-Seon;Kim, Bong-Hyun;Ka, Min-Kyoung;Cho, Dong-Uk
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2010.11a
    • /
    • pp.648-651
    • /
    • 2010
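  • Riding the well-being trend, more people now manage their own health, and as negative perceptions of smoking grow, the movement to quit smoking is gaining strength. Even for those who quit, however, ambient cigarette smoke still harms health, so it is very difficult to be free of cigarette smoke. In fact, people whose spouse smokes have a 40% higher incidence of heart disease and a 30% higher incidence of lung cancer than people whose spouse does not. In this paper, to analyze the effect of secondhand smoking on the articulatory organs, we therefore conducted an experiment that measures, compares, and analyzes the changes in voice caused by secondhand smoking. To this end, we collected voice samples before and after exposure to secondhand smoke, applied vocal fold vibration parameters such as pitch, jitter, and shimmer, and applied formants, which characterize the resonating organs of the body, to study the effect of secondhand smoking on the voice.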
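
The vocal fold vibration parameters named in this abstract can be illustrated with a minimal sketch. It assumes the common "local" definitions (mean absolute difference between consecutive cycles divided by the mean value), which the abstract does not specify. The function names and the sample values are hypothetical, and the cycle periods and peak amplitudes are taken as already extracted from the recording.

```python
def local_perturbation(values):
    """Mean absolute difference of consecutive values, relative to their mean."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def jitter_local(periods_s):
    """Cycle-to-cycle variation of the glottal period (assumed 'local' jitter)."""
    return local_perturbation(periods_s)


def shimmer_local(amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (assumed 'local' shimmer)."""
    return local_perturbation(amplitudes)


def mean_pitch_hz(periods_s):
    """Mean pitch taken as the reciprocal of the mean period (an assumption)."""
    return len(periods_s) / sum(periods_s)


# Hypothetical cycle measurements, not data from the paper.
periods = [0.0100, 0.0102, 0.0098, 0.0100]   # seconds per glottal cycle
amps = [1.0, 0.9, 1.1, 1.0]                  # peak amplitude per cycle

print(jitter_local(periods), shimmer_local(amps), mean_pitch_hz(periods))
```

Comparing these values for recordings made before and after exposure would be one way to quantify the kind of change the study describes.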

PATTERNS OF ASSIMILATION OF IGBO VOWELS : AN ACOUSTIC ACCOUNT

  • Clara I. Ikekeonwu
    • Proceedings of the KSPS conference
    • /
    • 1996.10a
    • /
    • pp.514-514
    • /
    • 1996
  • Igbo, a New Benue-Congo language, has a vowel harmony system which, like that of Akan, is based on pharynx size or tongue root position. In this study we examine Igbo vowel harmony with particular reference to assimilatory patterns of vowels in different harmony sets. The aim is to gain some insight into the factors involved in Igbo vowel assimilation, and to establish to what extent reports on Akan vowel assimilation are validated in Igbo. Tokens of the eight phonemic vowels of Standard Igbo were recorded from three native speakers of Igbo. The vowels were acoustically investigated (using the LPC analysis of CSL) in individual lexical items and within carefully designed carrier phrases. The F1 and F2 values of the vowels were obtained, as these formant values are generally useful in establishing the salient characteristics of vowels. Vowels from the harmony sets were juxtaposed in the carrier phrases to ascertain the extent of assimilation. Results of the investigation show that the F1 values, to a large extent, are enough to characterize these vowels. The (-Expanded) vowels have higher F1 values than their (+Expanded) counterparts. Where there is an overlap in F1 values for some vowels, the F1 bandwidth values serve to distinguish between the vowels. The overlap often reported in Akan for /ɪ/ and /e/ on the one hand and /ʊ/ and /o/ on the other is not validated in Igbo. While the F1 values for these pairs of vowels are quite similar for one of our speakers, there is an appreciable difference between the F1 values of these vowels for the other two speakers. There is, however, an overlap for /e/ and /o/ for one of the speakers. Assimilations are generally regressive across word boundaries. It is, however, necessary to point out that the general perceptual impression that one of the vowels completely assimilates to the other is not borne out by our investigation. Most of our F1 and F2 values for the vowels in individual lexical items are altered in assimilations. This suggests that assimilation involving these vowels is partial rather than complete. The emerging 'allophones' are acoustically similar to the (+Expanded) vowel involved in the assimilation, that is, when vowels from different harmony sets are involved. We conclude that while assimilation of Igbo vowels involves some phonological considerations, phonetic factors appear to be permanent in deciding the final form of the vowels.
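
As an illustration of how formant values can be read from an LPC analysis, the sketch below fits an all-pole model with the autocorrelation method and the Levinson-Durbin recursion, then picks the peaks of the resulting spectral envelope. This is not the CSL implementation, whose details the abstract does not give. The function names, the model order, and the synthetic one-resonance test signal are all assumptions made for the example.

```python
import cmath
import math


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of the whole signal."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Prediction-error filter A(z) = 1 + a1 z^-1 + ... from autocorrelation r."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def envelope_peaks(a, fs, step_hz=5.0):
    """Frequencies (Hz) where the LPC envelope 1/|A| has a local maximum."""
    freqs = [step_hz * i for i in range(1, int(fs / 2 / step_hz))]
    mags = []
    for f in freqs:
        w = 2.0 * math.pi * f / fs
        mags.append(abs(sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(a))))
    # A minimum of |A| is a maximum of the envelope.
    return [freqs[i] for i in range(1, len(freqs) - 1)
            if mags[i] < mags[i - 1] and mags[i] < mags[i + 1]]


# Synthetic test signal: impulse response of one resonance at 500 Hz.
fs = 8000
pole_radius, resonance_hz = 0.97, 500.0
theta = 2.0 * math.pi * resonance_hz / fs
c1, c2 = 2.0 * pole_radius * math.cos(theta), -pole_radius ** 2
y = []
for n in range(512):
    y1 = y[n - 1] if n >= 1 else 0.0
    y2 = y[n - 2] if n >= 2 else 0.0
    y.append(c1 * y1 + c2 * y2 + (1.0 if n == 0 else 0.0))

coeffs = levinson_durbin(autocorrelation(y, 2), 2)
peaks = envelope_peaks(coeffs, fs)
print(peaks)
```

For real vowels a higher model order, windowing, and pre-emphasis would normally be needed, and the lowest two envelope peaks would be taken as candidates for F1 and F2.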


THE STUDY OF PHONETIC CHANGE AFTER THE ORTHOGNATHIC SURGERY FOR THE PATIENT OF MANDIBULAR PROGNATHISM (하악전돌증환자(下顎前突症患者)의 악교정수술후(顎矯正手術後) 음성변화(音聲變化)에 관(關)한 연구(硏究))

  • Kim, Byung Ju;Kim, Yeo Gab
    • Maxillofacial Plastic and Reconstructive Surgery
    • /
    • v.15 no.4
    • /
    • pp.239-252
    • /
    • 1993
  • This study analyzed phonetic dysfunction and the effect of orthognathic surgery on phonation in patients with mandibular prognathism. Twenty persons were chosen as the normal group and 20 patients with mandibular prognathism as the abnormal group. Five vowel sounds, 'ㅏ(a)', 'ㅔ(e)', 'ㅣ(i)', 'ㅗ(o)', 'ㅜ(u)', and 14 consonant sounds, 'ㄱ(g)', 'ㄴ(n)', 'ㄷ(d)', 'ㄹ(l)', 'ㅁ(m)', 'ㅂ(b)', 'ㅅ(s)', 'ㅇ(ng)', 'ㅈ(j)', 'ㅊ(ch)', 'ㅋ(k)', 'ㅌ(t)', 'ㅍ(p)', 'ㅎ(h)', were checked. We recorded these sounds preoperatively and at 12 and 24 months postoperatively. The formant ratio and the duration of the consonants were studied with discriminant analysis. The following conclusions were drawn. 1. In the analysis of vowel dysfunction in patients with mandibular prognathism, more than 80% of male patients showed dysfunction in the prelingual sounds 'ㅔ(e)' and 'ㅣ(i)'. More than 70% of female patients showed dysfunction in all vowels. 2. One year after orthognathic surgery, male patients showed a marked improvement in 'ㅏ(a)', followed by 'ㅗ(o)', 'ㅜ(u)', and 'ㅣ(i)'. Female patients showed a marked improvement in 'ㅜ(u)'. 3. Two years after orthognathic surgery, male patients showed a marked improvement in the prelingual sound 'ㅔ(e)' and the postlingual sound 'ㅗ(o)'. Female patients showed a marked improvement in 'ㅏ(a)'. More than 20% of patients showed phonetic improvement compared with their condition at 12 months postoperatively. 4. In the analysis of consonant dysfunction in patients with mandibular prognathism, more than 80% of male patients showed dysfunction in the lingual sound 'ㅅ(s)'. Most female patients showed dysfunction in the labial sound 'ㅁ(m)' and the lingual sound 'ㄴ(n)'. More than 50% of patients showed dysfunction in labial and lingual sounds. 5. One year after orthognathic surgery, male patients showed a complete improvement in the hard palatal sound 'ㅈ(j)', followed by the labial sound 'ㅂ(b)', the lingual sound 'ㅅ(s)', and the soft palatal sounds 'ㄱ(g)' and 'ㅋ(k)'. Female patients showed a marked improvement in the soft palatal sounds 'ㅇ(ng)' and 'ㄱ(g)'. 6. Two years after orthognathic surgery, all patients showed remarkable improvement in consonant sounds, except for the labial sounds 'ㅁ(m)' and 'ㅍ(p)' and the lingual sound 'ㄴ(n)'. The improvement ratio increased over time compared with the condition at 12 months postoperatively.


Effective Feature Vector for Isolated-Word Recognizer using Vocal Cord Signal (성대신호 기반의 명령어인식기를 위한 특징벡터 연구)

  • Jung, Young-Giu;Han, Mun-Sung;Lee, Sang-Jo
    • Journal of KIISE:Software and Applications
    • /
    • v.34 no.3
    • /
    • pp.226-234
    • /
    • 2007
  • In this paper, we develop a speech recognition system using a throat microphone. This kind of microphone minimizes the impact of environmental noise. However, because high frequencies are absent and formant frequencies are partially lost, previous systems developed with such devices have shown lower recognition rates than systems that use standard microphone signals. This problem has led researchers to use throat microphone signals as supplementary data sources supporting standard microphone signals. In this paper, we present a high-performance ASR system that we developed using only a throat microphone, taking advantage of Korean Phonological Feature Theory and a detailed throat signal analysis. Analyzing the spectrum and the FFT result of the throat microphone signal, we find that the conventional MFCC feature vector that uses a critical pass filter does not characterize throat microphone signals well. We also describe the conditions that make a feature extraction algorithm best suited for throat microphone signal analysis. The conditions involve (1) a sensitive band-pass filter and (2) a feature vector suitable for voice/non-voice classification. We experimentally show that the ZCPA algorithm designed to meet these conditions improves the recognizer's performance by approximately 16%. We also find that an additional noise-canceling algorithm such as RASTA yields a further 2% improvement.

A Study on Fuzziness Parameter Selection in Fuzzy Vector Quantization for High Quality Speech Synthesis (고음질의 음성합성을 위한 퍼지벡터양자화의 퍼지니스 파라메타선정에 관한 연구)

  • 이진이
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.8 no.2
    • /
    • pp.60-69
    • /
    • 1998
  • This paper proposes a speech synthesis method using fuzzy VQ (FVQ) and studies how to choose the fuzziness value that optimizes (controls) the performance of FVQ so that the synthesized speech is closer to the original speech. When FVQ is used to synthesize speech, the analysis stage generates membership function values that represent the degree to which an input speech pattern matches each speech pattern in the codebook, and the synthesis stage reproduces the speech using the membership function values obtained in the analysis stage, the fuzziness value, and the fuzzy c-means operation. By comparing the performance of the FVQ and VQ synthesizers in simulation, we show that, although the FVQ codebook is half the size of the VQ codebook, the performance of FVQ is almost equal to that of VQ. These results imply that, when FVQ is used to obtain the same performance as VQ in speech synthesis, the memory needed for codebook storage can be halved. We also found that, for the optimized FVQ with maximum SQNR in the synthesized speech, the fuzziness value should be small when the variance of the analysis frame is relatively large, and large when the variance is small. Comparing spectrograms of the speech synthesized by VQ and FVQ, we found that the spectral bands (formant frequency and pitch frequency) of FVQ-synthesized speech are closer to the original speech than those of VQ-synthesized speech.
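
The analysis and synthesis stages described in this abstract can be sketched as follows. The sketch assumes the standard fuzzy c-means membership formula and a reconstruction weighted by memberships raised to the fuzziness value m; the paper's exact formulation may differ. The function names, the two-entry codebook, and the input vector are hypothetical.

```python
def memberships(x, codebook, m=2.0):
    """Fuzzy c-means membership of vector x in each codeword (m > 1)."""
    d2 = [sum((xi - ci) ** 2 for xi, ci in zip(x, c)) for c in codebook]
    # An exact match gets full membership, avoiding a division by zero.
    for i, d in enumerate(d2):
        if d == 0.0:
            return [1.0 if j == i else 0.0 for j in range(len(codebook))]
    exponent = 1.0 / (m - 1.0)
    inv = [(1.0 / d) ** exponent for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


def reconstruct(u, codebook, m=2.0):
    """Synthesis: codewords averaged with weights u_i ** m."""
    w = [ui ** m for ui in u]
    total = sum(w)
    dim = len(codebook[0])
    return [sum(wi * c[k] for wi, c in zip(w, codebook)) / total
            for k in range(dim)]


# Hypothetical two-entry codebook and input frame.
codebook = [[0.0, 0.0], [4.0, 0.0]]
x = [1.0, 0.0]

u = memberships(x, codebook, m=2.0)
x_hat = reconstruct(u, codebook, m=2.0)
print(u, x_hat)
```

Larger values of m spread the membership more evenly over the codewords, while values close to 1 approach ordinary hard VQ; this is the parameter whose selection the paper studies.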


Analysis of Voice Color Similarity for the development of HMM Based Emotional Text to Speech Synthesis (HMM 기반 감정 음성 합성기 개발을 위한 감정 음성 데이터의 음색 유사도 분석)

  • Min, So-Yeon;Na, Deok-Su
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.15 no.9
    • /
    • pp.5763-5768
    • /
    • 2014
  • Maintaining a consistent voice color is important when a single synthesizer combines a normal voice with various emotional voices. When a synthesizer is developed using recordings in which emotion is expressed too strongly, the voice color cannot be maintained and each synthetic utterance may sound like the voice of a different speaker. In this paper, speech data were recorded and the change in voice color was analyzed in order to develop an emotional HMM-based speech synthesizer. To realize a speech synthesizer, a voice is recorded and a database is built. The recording process is very important, particularly for an emotional speech synthesizer. Monitoring is needed because it is quite difficult to define an emotion and to maintain it at a particular level. In the realized synthesizer, a normal voice and three emotional voices (Happiness, Sadness, Anger) were used, and each emotional voice has two levels, High and Low. To analyze the voice color of the normal and emotional voices, the average spectrum, measured as the accumulated spectrum of vowels, was used, and the F1 (first formant) values calculated from the average spectra were compared. The voice similarity of the Low-level emotional data was higher than that of the High-level emotional data, and the proposed method can be used to monitor recordings through the change in voice similarity.

The Comparative Study of Effect on Speech before and after Orthognathic Surgery of Patients (악교정 환자의 악교정 수술전후 발음양상에 대한 비교연구)

  • Kwon, Kyung-Hwan;Kim, Soo-Nam;Lee, Dong-Keun;Cho, Yong-Min;Lee, Suk-Hyang
    • Maxillofacial Plastic and Reconstructive Surgery
    • /
    • v.22 no.2
    • /
    • pp.191-205
    • /
    • 2000
  • The purpose of this study was to determine the effects of orthognathic surgery on speech. The hypothesis stated herein is that functional behaviors of the dentofacial complex, such as speech production, may be adversely affected by deviations of a structural nature (especially Class III malocclusion). Twenty adults with Class III malocclusion (13 female and 7 male) were studied using preoperative, immediate postoperative, and either 6- or 12-month postoperative lateral cephalograms. They had mandibular prognathism and had undergone a mandibular setback operation. The position of the tongue, soft palate (uvula), and hyoid bone, the respiratory tract width, and the pharyngeal depth were assessed on lateral cephalograms with 23 cephalometric variables. ANOVA, paired t-tests, and Pearson's product-moment correlation coefficient tests were used to evaluate the operative changes in all cephalometric parameters. Experienced speech and language pathologists performed narrow phonetic transcriptions of tape-recorded words and sentences produced by each of the ninth patients, and the recordings were analyzed with a phonetic computer program (Computerized Speech Lab (CSL) Model 4300B, U.S.A.). These judges also recorded their ratings of each patient's overall consonants, hypernasality, hyponasality, and articulation proficiency. The results obtained are as follows. 1. There were significant changes in the distance from the posterior pharyngeal wall to the tongue (TI-TW2, TS-TW3) at 6 months postoperatively (p<0.01 and p<0.05, respectively). 2. The posterior tongue points (TI, TS, PPT) moved posteriorly after surgery and remained in their changed position at 6 months postoperatively (p<0.05). The displacement of the tongue was correlated with the amount of mandibular setback (p<0.05). The hyoid bone moved posteriorly and superiorly in the immediate postoperative period. This change in hyoid bone position was significant in the immediate postoperative period (p<0.05), but the hyoid bone returned to its original position during the follow-up period (p>0.05). 3. The soft palate was displaced posteriorly and superiorly in the immediate postoperative period and remained in its changed position at 6 months postoperatively (p<0.05). An increase in the ANS-PNS-SPT angle and a narrowing of the PPU-PPPo distance were observed after surgery and persisted at 6 months postoperatively (p<0.05). 4. There were significant changes in the formant values and the vowel square diagram after orthognathic surgery and the follow-up period. There were significant changes in the /ㅅ/ sound and the posterior tongue sounds. 5. The posterior movement of the tongue and the posterosuperior movement of the soft palate were correlated with the amount of mandibular setback after orthognathic surgery. On the vowel square diagram, the author found that the place of articulation after the operation moved downward, backward, upward. 6. In assessing speech abnormalities, dental occlusion should be considered as a contributing factor. The vast majority of subjects with preoperative misarticulations eliminated or reduced their errors following orthognathic surgery. There was a significant difference in speech improvement between the pre- and postoperative conditions.


A Study on Acoustical Properties of Soprano's Singing (소프라노의 성악 발성에 대한 음향학적 특징 연구)

  • 임동철;문소연;이행세
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.5
    • /
    • pp.60-64
    • /
    • 2000
  • This paper studies the relation between the fundamental frequency (F0) and the formants of simple vowels in the Korean language sung by sopranos. It is known that, in soprano singing, the F0 of a vowel affects its formants. For this reason, the formants of simple vowels sung by sopranos must be considered across the entire soprano singing range. We recorded the five simple vowel sounds /a/, /e/, /i/, /o/, and /u/ sung by five professional sopranos from A3 (220.0 Hz) to A5 (880.0 Hz) in the major scale and compared the formants of the sung vowels with those of spoken vowels. We observed that F1 and F2 of sung vowels were stable at low F0 (lower than B4), but at high F0 (higher than B4), F1 and F2 lost their stability. In the case of /a/, /o/, and /u/, the slope of the F1-F2 graph was about 2.6, and those of the F0-F2 and F0-F1 graphs were 2.2-2.5 and 0.7-1.0, respectively. As F0 increased, the F1 and F2 of the sung vowels /a/, /e/, /i/, /o/, and /u/ became almost the same. At A5, the F1 and F2 of the five sung vowels had the same values. These results suggest that the relation between F0 and the formants can be used to synthesize vowels sung by sopranos.
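
For orientation, the F0 values covered by the recordings can be listed with a short sketch. It assumes twelve-tone equal temperament with A4 = 440 Hz, which is consistent with the 220.0 Hz and 880.0 Hz endpoints given in the abstract, and it assumes that "the major scale" means the A major scale starting on A3. The MIDI note numbers and function names are illustrative.

```python
MAJOR_STEPS = [2, 2, 1, 2, 2, 2, 1]   # semitone steps of a major scale


def midi_to_hz(note, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note number (A4 = note 69)."""
    return a4_hz * 2.0 ** ((note - 69) / 12.0)


def major_scale(start_midi, end_midi):
    """MIDI notes of the major scale from start_midi up to end_midi."""
    notes, note, i = [], start_midi, 0
    while note <= end_midi:
        notes.append(note)
        note += MAJOR_STEPS[i % len(MAJOR_STEPS)]
        i += 1
    return notes


A3, B4, A5 = 57, 71, 81
scale = major_scale(A3, A5)
freqs = [midi_to_hz(n) for n in scale]

# Notes above B4, the region where the abstract reports unstable F1 and F2.
high_region = [f for n, f in zip(scale, freqs) if n > B4]

for n, f in zip(scale, freqs):
    print(n, round(f, 1))
```

Under these assumptions B4 lies near 493.9 Hz, so the region where the formants lose stability spans roughly the upper octave of the recorded range.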


Perceptual cues for /o/ and /u/ in Seoul Korean (서울말 /o/와 /u/의 지각특성)

  • Byun, Hi-Gyung
    • Phonetics and Speech Sciences
    • /
    • v.12 no.3
    • /
    • pp.1-14
    • /
    • 2020
  • Previous studies have confirmed that /o/ and /u/ in Seoul Korean are undergoing a merger in the F1/F2 space, especially for female speakers. As a substitute parameter for formants, it is reported that female speakers use phonation (H1-H2) differences to distinguish /o/ from /u/. This study aimed to explore whether H1-H2 values are being used as perceptual cues for /o/-/u/. A perception test was conducted with 35 college students using /o/ and /u/ spoken by 41 females, which overlap considerably in the vowel space. An acoustic analysis of 182 stimuli was also conducted to see if there is any correspondence between production and perception. The identification rate was 89% on average, 86% for /o/, and 91% for /u/. The results confirmed that when /o/ and /u/ cannot be distinguished in the F1/F2 space because they are too close, H1-H2 differences contribute significantly to the separation of the two vowels. However, in perception, this was not the case. H1-H2 values were not significantly involved in the identification process, and the formants (especially F2) were still dominant cues. The study also showed that even though H1-H2 differences are apparent in females' production, males do not use H1-H2 in their production, and both females and males do not use H1-H2 in their perception. It is presumed that H1-H2 has not yet been developed as a perceptual cue for /o/ and /u/.
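
The H1-H2 measure discussed in this abstract is the level difference, in dB, between the first and second harmonics of the voice source spectrum. The sketch below computes it for a synthetic signal, assuming F0 is already known and the analysis window spans a whole number of periods. The function names and the test signal are hypothetical, and the formant-related corrections often applied in practice are omitted.

```python
import math


def harmonic_amplitude(x, fs, freq):
    """Amplitude of the sinusoidal component of x at freq (single-bin DFT)."""
    re = sum(s * math.cos(2.0 * math.pi * freq * n / fs) for n, s in enumerate(x))
    im = sum(s * math.sin(2.0 * math.pi * freq * n / fs) for n, s in enumerate(x))
    return 2.0 * math.hypot(re, im) / len(x)


def h1_minus_h2_db(x, fs, f0):
    """Level difference between the first and second harmonics, in dB."""
    h1 = harmonic_amplitude(x, fs, f0)
    h2 = harmonic_amplitude(x, fs, 2.0 * f0)
    return 20.0 * math.log10(h1 / h2)


# Synthetic signal: second harmonic at half the amplitude of the first.
fs, f0 = 8000, 200.0
signal = [math.sin(2.0 * math.pi * f0 * n / fs)
          + 0.5 * math.sin(2.0 * math.pi * 2.0 * f0 * n / fs)
          for n in range(800)]            # 0.1 s, exactly 20 periods of F0

value = h1_minus_h2_db(signal, fs, f0)
print(round(value, 2))
```

A larger H1-H2 is usually associated with breathier phonation, which is why it can separate two vowels in production even when their F1 and F2 values overlap.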