• Title/Summary/Keyword: Vocal tract

검색결과 173건 처리시간 0.019초

Harmonics(배음)와 Formant Bandwidth(포먼트 폭)를 이용한 음성특성(音聲特性)과 사상체질간(四象體質間)의 상관성(相關性) 연구(硏究) (A Study on the Correlation Between Sasang Constitution and Sound Characteristics Used Harmonics and Formant Bandwidth)

  • 박성진;김달래
    • 사상체질의학회지
    • /
    • 제16권1호
    • /
    • pp.61-73
    • /
    • 2004
  • This study was prepared to investigate the correlation between Sasang constitutional groups and voice characteristics using voice analysis system(in this study, CSL). I focused on the voice characteristics in terms of harmonics, Formant frequency and Formant Bandwidth. The subjects were 71 males. I classified them into three groups, that is Soeumin group, Soyangin group and Taeumin group. The classification method of Constitution used two ways, QSCCII(Questionnarie for the Sasang Constitution Classification II) and Interview with a specialist in Sasang Constitution. So 71 people were categorized into 31 Soeumin(people), 18 Soyangin(people) and 22 Taeumin(people). Pitch is approximately similar to the fundamental frequency(F0) in voices. Shimmer in dB gives an evaluation of the period-to-period variability of the peak-to-peak amplitude within the analyzed voice sample. FFT(Fast Fourier Transform) method in CSL can display sampled voices into harmonics. H1 is the first peak and h2 is the second peak in the harmonics. The amplitude difference of h1 and h2(h1-h2) can be explained as the speaker's phonation type, And Formant frequency and bandwidth can be explained as the speaker's vocal tract. So I checked the harmonics and Formant frequency and Bandwidth as the voice parameters. First I have captured /e/ voices from all subjects using microphone. And then I analyzed /e/ voices with CSL. Power Spectrum and Formant History is the menu in the CSL which can display harmonics and Formant frequency and bandwidth. The results about the correlation between Sasang Constitutional Groups and voice parameters are as follows; 1. There is no significant amplitude difference of harmonics(h1-h2) among three groups. 2. There is the significant difference between Soeumin Group and Soyangin Group in Formant Frequency 1 and Formant Bandwidth 1(p<0.05). Any other parameters have no significance. I assume that Soyangin Group has clearer and brighter voice than Soeumin Group according to the Formant Bandwidth difference. And I think its result has coincidence with the context of "Dongyi Suse Bowon" and "Sasangimhejinam".

  • PDF

식도발성의 숙련 정도에 따른 모음의 음향학적 특징과 자음 산출에 대한 연구 (Analysis of Acoustic Characteristics of Vowel and Consonants Production Study on Speech Proficiency in Esophageal Speech)

  • 최성희;최홍식;김한수;임성은;이성은;표화영
    • 음성과학
    • /
    • 제10권3호
    • /
    • pp.7-27
    • /
    • 2003
  • Esophageal Speech uses the esophageal air during phonation. Fluent esophageal speakers frequently intake air in oral communication, but unskilled esophageal speakers are difficult with swallowing lots of air. The purpose of this study was to investigate the difference of acoustic characteristics of vowel and consonants production according to the speech proficiency level in esophageal speech. 13 normal male speakers and 13 male esophageal speakers (5 unskilled esophageal speakers, 8 skilled esophageal speakers) with age ranging from 50 to 70 years old. The stimuli were sustained /a/ vowel and 36 meaningless two syllable words. Used vowel is /a/ and consonants were 18 : /k, n, t, m, p, s, c, $C^{h},\;k^{h},\;t^{h},\;p^{h}$, h, I, k', t', p', s', c'/. Fundermental frequency (Fx), Jitter, shimmer, HNR, MPT were measured with by electroglottography using Lx speech studio (Laryngograph Ltd, London, UK). 36 meaningless words produced by esophageal speakers were presented to 3 speech-language pathologists who phonetically transcribed their responses. Fx, Jitter, HNR parameters is significant different between skilled esophageal speakers and unskilled esophageal speakers (P<.05). Considering manner of articulation, ANOVA showed that differences in two esophageal speech groups on speech proficiency were significant; Glide had the highest number of confusion with the other phoneme class, affricates are the most intelligible in the unskilled esophageal speech group, whereas in the skilled esophageal speech group fricatives resulted highest number of confusions, nasals are the most intelligible. In the place of articulation, glottal /h/ is the highest confusion consonant in both groups. Bilabials are the most intelligible in the skilled esophageal speech, velars are the most intelligible in the unskilled esophageal speech. In the structure of syllable, 'CV+V' is more confusion in the skilled esophageal group, unskilled esophageal speech group has similar confusion in both structures. In unskilled esophageal speech, significantly different Fx, Jitter, HNR acoustic parameters of vowel and the highest confusions of Liquid, Nasals consonants could be attributed to unstable, improper contact of neoglottis as vibratory source and insufficiency in the phonatory air supply, and higher motoric demand of remaining articulation due to morphological characteristics of vocal tract after laryngectomy.

  • PDF

켑스트럼으로부터 변환된 로그 스펙트럼을 이용한 포먼트 평활화 켑스트럴 평균 차감법 (Formant-broadened CMS Using the Log-spectrum Transformed from the Cepstrum)

  • 김유진;정혜경;정재호
    • 한국음향학회지
    • /
    • 제21권4호
    • /
    • pp.361-373
    • /
    • 2002
  • 본 논문에서는 음성 인식과 화자 인식에서 채널 변이 정규화를 위해 널리 사용되는 전통적인 켑스트럴 평균차감법 (CMS: Cepstral Mean Subtraction)의 성능을 향상시키기 위한 정규화 방법을 제안한다. 기존의 켑스트럴 평균 차감법은 장구간 켑스트럼의 평균으로 채널 성분을 추정하므로 유성음의 포먼트에 의해 채널 성분이 편향되는 단점을 가진다. 제안된 포먼트 평활화 켑스트럴 평균 차감법 (FBCMS; Formant-broadened CMS)은 켑스트럼으로부터 변환된 로그 스펙트럼에서 포먼트 위치를 쉽게 찾을 수 있고, 포먼트는 전극점 모델로 표현되는 성도 전달 함수의 우세 극점에 대응된다는 사실에 근거한다. 따라서 제안된 방법은 켑스트럼으로부터 음성의 포먼트를 구하고, 이로부터 포먼트의 대역폭을 확장한 켑스트럼을 구한 후 평균함으로써 채널 켑스트럼 성분으로부터 우세 극점들의 영향을 제거한다. 전극점 모델의 우세 극점을 얻기 위해 다항식 인수분해 과정을 거치지 않으므로 연산량을 줄일 수 있으며 포먼트에 해당하는 우세 극점만으로 선택적으로 처리할 수 있다. 본 연구에서는 4가지의 모의 채널을 이용하여 전통적인 켑스트럴 평균 차감법, 극점 필터화 켑스트럴 평균 차감법 (Pole-filtered CMS) 그리고 제안된 방법의 비교실험을 수행하였다. 실제 채널 켑스트럼과 추정된 채널 켑스트럼과의 거리를 측정하는 실험에서 음성에 의한 편향을 완화시켜 실제 채널에 보다 가까운 평균 켑스트럼을 얻을 수 있음을 확인하였다. 또한 문장독립 화자 식별에서 제안된 방법은 전통적인 켑스트럴 평균 차감법보다 우세하고 극점 필터화 켑스트럴 평균 차감법 (Pole-filtered CU)과는 비슷한 결과를 보였다. 결과적으로 제안된 방법은 전통적인 켑스트럴 평균 차감법에 기반하여 효과적인 채널 정규화가 가능하다는 것을 보였다.