• Title/Summary/Keyword: 포먼트 (formant)

The implementation of Korean adult's optimal formant setting by Praat scripting (성인 포먼트 측정에서의 최적 세팅 구현: Praat software와 관련하여)

  • Park, Jiyeon;Seong, Cheoljae
    • Phonetics and Speech Sciences / v.11 no.4 / pp.97-108 / 2019
  • An automated Praat script was implemented to find optimal formant measurement settings for adults. A setting was considered optimal when the deviation in formant frequency across combinations of the two analysis parameters (maximum formant and number of formants) was minimal. To increase the reliability of formant analysis, the LPC order should be set differently depending on gender and vowel type. Praat recommends a maximum formant of 5,000 Hz for males and 5,500 Hz for females, with 5 as the number of formants for both. However, whether these recommended settings are valid for Korean vowels needs to be verified. Statistical analysis showed that formant frequencies varied significantly across the adapted scripts, especially for the female data. Formant plots and statistical results showed that, of the four scripts, linear_script and qtone_script gave the most stable and reliable formant measurements. The linear_script increases the formant step linearly in a for-loop, whereas the qtone_script increases it along a quarter-tone scale (base frequency × common ratio $\sqrt[24]{2}$). For the front vowels [i, e], the settings selected by these two algorithms tended to have a higher maximum formant and a lower number of formants than Praat recommends. For the back vowels [o, u], by contrast, they had a lower maximum formant and a higher number of formants than the standard setting.
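The difference between a linear step and a quarter-tone step can be sketched in a few lines of Python. This is only an illustration of the two progressions described in the abstract, not the authors' Praat scripts: the function names, the 4,000-6,000 Hz search range, and the 250 Hz linear step are assumptions chosen for the example.

```python
# Illustrative only: the range and step values below are assumed,
# not the settings used in the paper's scripts.

def linear_steps(start_hz, stop_hz, step_hz):
    """Candidate values that grow by a fixed number of Hz per step."""
    values = []
    value = start_hz
    while value <= stop_hz:
        values.append(value)
        value += step_hz
    return values


def quarter_tone_steps(base_hz, stop_hz):
    """Candidate values that grow by a quarter tone (ratio 2**(1/24)) per step."""
    values = []
    k = 0
    while base_hz * 2 ** (k / 24) <= stop_hz:
        values.append(base_hz * 2 ** (k / 24))
        k += 1
    return values


linear = linear_steps(4000, 6000, 250)
qtone = quarter_tone_steps(4000, 6000)
print([round(v) for v in linear])
print([round(v) for v in qtone])
```

In the quarter-tone version the spacing in Hz widens as the frequency rises, because each step multiplies the previous value by the same ratio; 24 such steps double the base frequency.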

Formant-broadened CMS Using the Log-spectrum Transformed from the Cepstrum (켑스트럼으로부터 변환된 로그 스펙트럼을 이용한 포먼트 평활화 켑스트럴 평균 차감법)

  • 김유진;정혜경;정재호
    • The Journal of the Acoustical Society of Korea / v.21 no.4 / pp.361-373 / 2002
  • In this paper, we propose a channel normalization method that improves on CMS (cepstral mean subtraction), which is widely used to normalize channel variation in speech and speaker recognition. CMS estimates the channel effect by averaging the long-term cepstrum, but it has the weakness that the estimated channel is biased by the formants of voiced speech, which carry useful speech information. The proposed Formant-broadened Cepstral Mean Subtraction (FBCMS) is based on two facts: formants can be found easily in the log spectrum obtained from the cepstrum by a Fourier transform, and formants correspond to the dominant poles of the all-pole model commonly used to model the vocal tract. FBCMS identifies only the poles to be broadened from the log spectrum, without polynomial factorization, and produces a formant-broadened cepstrum by broadening the bandwidths of the formant poles. The channel cepstrum can then be estimated effectively by averaging the formant-broadened cepstral coefficients. We performed experiments comparing FBCMS with CMS and PFCMS (pole-filtered CMS) using 4 simulated telephone channels. In the channel estimation experiment, we evaluated the cepstral distance between the real channel and the estimated channel and found that the mean cepstrum came closer to the channel cepstrum because the bias of the mean cepstrum toward speech was softened. In the text-independent speaker identification experiment, the proposed method was superior to conventional CMS and comparable to pole-filtered CMS. Consequently, we showed that the proposed method, building on conventional CMS, can efficiently normalize channel variation.
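For orientation, the conventional CMS baseline that FBCMS builds on can be sketched as follows. This is plain CMS only, not the formant-broadening step proposed in the paper; the function name and the toy cepstral values are assumptions made for the example. A fixed channel appears as a constant offset added to every cepstral frame, so subtracting the long-term mean removes it.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over all frames (plain CMS).

    frames: list of cepstral vectors, one list of floats per frame.
    Returns (normalized_frames, mean_vector).
    """
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    mean = [sum(frame[i] for frame in frames) / n_frames for i in range(n_coeffs)]
    normalized = [[frame[i] - mean[i] for i in range(n_coeffs)] for frame in frames]
    return normalized, mean


# Toy data (hypothetical values): three frames of two cepstral coefficients.
clean = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0]]
channel = [0.5, -1.0]  # constant channel offset in the cepstral domain
observed = [[c + h for c, h in zip(frame, channel)] for frame in clean]

norm_clean, _ = cepstral_mean_subtraction(clean)
norm_observed, _ = cepstral_mean_subtraction(observed)
print(norm_clean)
print(norm_observed)  # same as norm_clean: the channel offset is removed
```

The returned mean also contains the average of the speech itself, which is the bias toward formants that the abstract identifies as the weak point of CMS.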

Characteristics of Vowel Formants, Voice Intensity, and Fundamental Frequency of Female with Amyotrophic Lateral Sclerosis using Spectrograms (스펙트로그램을 이용한 근위축성측삭경화증 여성 화자의 모음 포먼트, 음성강도, 기본주파수의 변화)

  • Byeon, Haewon
    • Journal of the Korea Convergence Society / v.10 no.9 / pp.193-198 / 2019
  • This study analyzed changes in vowel formants, voice intensity, and fundamental frequency over 11 months in women diagnosed with amyotrophic lateral sclerosis (ALS), using acoustic spectrogram analysis. The test items were the vowels /a, i, u/ and the diphthong words /h + ja + da/, /h + wi + da/, and /h + ɰi + da/. Speech data were collected through a word-reading task presented on a monitor using the 'Alvin' program, and the recording was set to a sampling rate of 11,000 Hz (Nyquist frequency of 5,500 Hz). The recordings were analyzed with spectrograms to measure vowel formants, voice intensity, and fundamental frequency. The analysis showed that fundamental frequency and intensity decreased as ALS progressed, and that the formant slope of the diphthongs decreased, rather than the formants of the vowels changing. This result suggests that the vowel distortion that accompanies disease progression in ALS is due to a decrease in tongue and jaw co morbidity.

A Study on the Speech Signal Processing for Cochlear Implant using the PLP Analysis (청각보철을 위한 PLP방식의 음성신호처리에 관한 연구)

  • Kim, Young-Sun;Choi, Doo-Il;Park, Sang-Hui;Beack, Seung-Hwa
    • Proceedings of the KOSOMBE Conference / v.1992 no.05 / pp.167-170 / 1992
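  • In this paper, an auditory prosthesis was designed so that people with sensorineural hearing loss can perceive speech similarly to people with normal hearing. PLP (Perceptual Linear Prediction) was used to extract the speech formants, and an autocorrelation method with a three-level clipping function was used for pitch extraction. In addition, using a multichannel, multi-electrode scheme, a simulation was performed in which 17 electrodes are inserted at the hair cells of the inner ear to deliver the signal. The experimental data were the vowels /a/, /e/, /i/, /o/, and /u/; the difference between front and back vowels was distinguished, and the change in the second formant and the formant integration theory were verified.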
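The pitch extraction step mentioned above, autocorrelation on a three-level clipped signal, can be sketched as below. This is a generic textbook-style version, not the paper's implementation: the function names, the clipping ratio of 0.68, the 60-400 Hz search range, and the synthetic 100 Hz test tone are all assumptions for the example.

```python
import math


def three_level_clip(signal, ratio=0.68):
    """Map each sample to +1, 0, or -1 using a threshold at ratio * peak."""
    threshold = ratio * max(abs(s) for s in signal)
    return [1 if s > threshold else -1 if s < -threshold else 0 for s in signal]


def estimate_pitch(signal, sample_rate, f_min=60.0, f_max=400.0):
    """Return the pitch in Hz from the strongest autocorrelation lag, or 0.0."""
    clipped = three_level_clip(signal)
    lag_min = int(sample_rate / f_max)
    lag_max = int(sample_rate / f_min)
    best_lag, best_value = 0, 0
    for lag in range(lag_min, lag_max + 1):
        # Products are only ever -1, 0, or +1, so this is cheap to compute.
        value = sum(clipped[n] * clipped[n + lag] for n in range(len(clipped) - lag))
        if value > best_value:
            best_lag, best_value = lag, value
    return sample_rate / best_lag if best_lag else 0.0


# Synthetic test tone (assumed values): 100 Hz sine, 8,000 Hz sampling, 50 ms.
sample_rate = 8000
signal = [math.sin(2 * math.pi * 100 * n / sample_rate) for n in range(400)]
pitch = estimate_pitch(signal, sample_rate)
print(pitch)
```

Clipping to three levels discards most of the formant structure in the waveform while keeping the periodic peaks, which is why it is commonly paired with autocorrelation for pitch estimation.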

On the Frequency Domain Pitch Detection of Noise Corrupted Speech Signals -Minimizing the Effects of the F1 by the Spectral AMDF- (배경잡음하에서 주파수영역 피치검출에 관한 연구 -스펙트럼 AMDF에 의한 제 1포먼트 영향 제거법-)

  • Bae, Myung-Jin;Park, Chan-Sou;Ann, Sou-Guil
    • The Journal of the Acoustical Society of Korea / v.10 no.4 / pp.12-18 / 1991
  • Detecting the fundamental frequency (F0) of the speech signal is a problem in many speech applications. Frequency-domain pitch detection methods suffer from errors caused by the first formant and by background noise. In this paper, we therefore propose a frequency-domain pitch detection algorithm that reduces the effects of the first formant and the background noise by applying the AMDF to the spectrum (spectral AMDF). Several computer simulations showed that the proposed algorithm is very effective for fundamental frequency detection.
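The general idea of applying an average magnitude difference function (AMDF) along the frequency axis can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm from the paper: the function names, the 10 Hz bin width, the 60-400 Hz search range, and the synthetic harmonic spectrum with a formant-like envelope peaking near 720 Hz are all invented for the example.

```python
import math


def spectral_amdf(magnitudes, max_shift):
    """AMDF over frequency bins; element k-1 holds the value for shift k."""
    n = len(magnitudes)
    values = []
    for shift in range(1, max_shift + 1):
        count = n - shift
        total = sum(abs(magnitudes[i] - magnitudes[i + shift]) for i in range(count))
        values.append(total / count)
    return values


def estimate_f0(magnitudes, bin_hz, f_min=60.0, f_max=400.0):
    """Pick the bin shift with the smallest AMDF and convert it to Hz."""
    k_min = max(1, int(f_min / bin_hz))
    k_max = int(f_max / bin_hz)
    amdf = spectral_amdf(magnitudes, k_max)
    best_shift = min(range(k_min, k_max + 1), key=lambda k: amdf[k - 1])
    return best_shift * bin_hz


# Synthetic magnitude spectrum (assumed): harmonics of 120 Hz on a 10 Hz grid,
# weighted by a formant-like bump centred at 720 Hz.
bin_hz = 10.0
spectrum = [0.0] * 400
for harmonic in range(1, 34):
    freq = 120.0 * harmonic
    spectrum[int(freq / bin_hz)] = math.exp(-((freq - 720.0) / 400.0) ** 2) + 0.1

f0 = estimate_f0(spectrum, bin_hz)
print(f0)
```

In this toy spectrum the strongest peak sits near 720 Hz, yet the AMDF minimum falls at the harmonic spacing, because shifting the spectrum by one harmonic interval aligns every peak with its neighbour.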

Influence of Temporo-mandibular Joint Training Using Physical Therapy on the Vowel Acoustic Characteristics (TM Joint의 물리치료를 통한 훈련이 모음의 음향학적 특성에 미치는 영향)

  • Min, Dong-Gi;Lee, Jae-Hong
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.5 / pp.2203-2208 / 2011
  • This study examined changes in the vowel acoustic characteristics of patients with temporomandibular joint disorder after temporomandibular joint training using physical therapy. The training aimed to maintain a normal vocalization pattern by increasing the joint's range of motion, that is, the oral resonance cavity involved in vowel articulation. The subjects were 3 male adults in their 20s and 30s who had been diagnosed with temporomandibular joint disorder. After the training program, the first formant frequency (F1), second formant frequency (F2), and fundamental frequency (F0) of the patients increased compared with before training. This indicates changes in F1, which is related to the degree of mouth opening of a vowel, and in F2 and F0, which are related to the front-back position of a vowel, and it shows the relationship between the temporomandibular joint, vowels, and voice calculation.

Perceptual cues for /o/ and /u/ in Seoul Korean (서울말 /ㅗ/와 /ㅜ/의 지각특성)

  • Byun, Hi-Gyung
    • Phonetics and Speech Sciences / v.12 no.3 / pp.1-14 / 2020
  • Previous studies have confirmed that /o/ and /u/ in Seoul Korean are undergoing a merger in the F1/F2 space, especially for female speakers. It has been reported that female speakers use phonation (H1-H2) differences, as a substitute for formants, to distinguish /o/ from /u/. This study explored whether H1-H2 values are used as perceptual cues for /o/-/u/. A perception test was conducted with 35 college students using /o/ and /u/ tokens spoken by 41 females, which overlap considerably in the vowel space. An acoustic analysis of the 182 stimuli was also conducted to see whether production and perception correspond. The identification rate was 89% on average, 86% for /o/ and 91% for /u/. The results confirmed that, in production, when /o/ and /u/ are too close to be distinguished in the F1/F2 space, H1-H2 differences contribute significantly to the separation of the two vowels. In perception, however, this was not the case: H1-H2 values were not significantly involved in the identification process, and the formants (especially F2) were still the dominant cues. The study also showed that although H1-H2 differences are apparent in females' production, males do not use H1-H2 in their production, and neither females nor males use H1-H2 in perception. It is presumed that H1-H2 has not yet developed as a perceptual cue for /o/ and /u/.

Correlation Analysis of Between Paranasal Sinuses and Formant Frequency According to External Stimulation (외부 자극에 따른 부비동과 포먼트주파수와의 상관성 분석)

  • Kim, Bong-Hyun
    • Journal of the Korea Institute of Information and Communication Engineering / v.17 no.8 / pp.1955-1961 / 2013
  • The paranasal sinuses are air-filled spaces within the bones of the face. When they become inflamed and fill with pus, sinusitis develops, bringing changes in the voice along with complaints of headache and lethargy. In this paper, with the aim of predicting paranasal sinus-related diseases, changes in the paranasal sinuses under external stimulation were measured with voice analysis parameters, and the function of the frontal, ethmoid, maxillary, and sphenoid sinuses was analyzed. To this end, cold pack stimulation was applied to the paranasal sinus area, formant frequencies were measured from the voice before and after stimulation, and the effect of the external stimulus on the paranasal sinuses was examined through correlation analysis.

Acoustic Analysis for Thermal Environment-related Vocalizations in Laying Hens (산란계의 열환경별 특이음에 대한 음성학적 분석)

  • Jeon, J.H.;Yeon, S.C.;Ha, J.K.;Lee, S.J.;Chang, H.H.
    • Journal of Animal Science and Technology / v.47 no.4 / pp.697-702 / 2005
  • The aim of this study was to divide the vocalizations of laying hens (Hy-Line Brown) into general vocalizations (GVs), heat stress-related vocalizations (HSV), and cold stress-related vocalizations (CSVs), and to determine whether they can be classified by discriminant function analysis. Thirty 65-wk-old laying hens were recorded with digital video recorders 2 times, from 10:00 to 14:00 h, in each thermal environment (thermoneutral: $22.0{\pm}1.8^{\circ}C$, too hot: $32.0{\pm}2.0^{\circ}C$, too cold: $8.0{\pm}1.9^{\circ}C$) after a 7-day acclimation period. When the hens were not being recorded, they were kept in thermoneutral conditions. The GVs, HSV, and CSVs were separated based on the shapes of their spectra and spectrograms, and were identified as 5, 1, and 3 types, respectively. Pitch, intensity, duration, formant 1, formant 2, formant 3, and formant 4 differed significantly among the thermal environment-related vocalizations (P<0.001). The discrimination rate determined by discriminant function analysis was 86.2%. These results suggest that HSV and CSVs exist and may be used as indicators of the thermal environment.

The effect of palatal height on the Korean vowels (구개의 높이가 한국어 모음 발음에 미치는 효과에 관한 연구)

  • Chung, Bo-Yoon;Lim, Young-Jun;Kim, Myung-Joo;Nam, Shin-Eun;Lee, Seung-Pyo;Kwon, Ho-Beom
    • The Journal of Korean Academy of Prosthodontics / v.48 no.1 / pp.69-74 / 2010
  • Purpose: The purpose of this study was to analyze the influence of palatal height on Korean vowels and speech intelligibility in Korean adults and to produce baseline data for future prosthodontic treatment. Material and methods: Forty-one healthy Korean men and women who had no problems with pronunciation, hearing, or communication and no history of airway disease participated in this study. Subjects were classified into H, M, and L groups after clinical determination of palatal height with study casts. Seven Korean vowels were used as sample vowels, and the subjects' clear speech sounds were recorded on a computer using the Multispeech software program. The F1 and F2 of the 3 groups were obtained and compared. In addition, the vowel working spaces of the 3 groups, defined by the corner vowels /a/, /i/, and /u/, were obtained and their areas were compared. The Kruskal-Wallis test and the Mann-Whitney U test were used as statistical methods, and P < .05 was considered statistically significant. Results: There were no significant differences in formant frequencies among the 3 groups except for the F2 formant frequency between the H and L groups (P = .003). The vowel working spaces of the 3 groups were similar in shape, and no significant differences in their areas were found. Conclusion: Palatal height did not affect vowel frequencies in most of the vowels or speech intelligibility. The dynamics of tongue activity seem to compensate for the morphological difference.
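A vowel working space bounded by the three corner vowels is a triangle in the F1/F2 plane, so its area can be computed with the shoelace formula. The sketch below shows one common way to do this; the function name and the formant values are hypothetical and are not taken from the study.

```python
def vowel_space_area(corner_a, corner_i, corner_u):
    """Area of the triangle spanned by three (F1, F2) points, in Hz squared."""
    (x1, y1), (x2, y2), (x3, y3) = corner_a, corner_i, corner_u
    # Shoelace formula for a triangle.
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0


# Hypothetical (F1, F2) values in Hz for one speaker group.
area = vowel_space_area((800.0, 1300.0), (300.0, 2300.0), (350.0, 800.0))
print(area)
```

Comparing such areas across groups is one way to summarize whether the overall articulatory range differs, independent of where each individual vowel sits.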