• Title/Summary/Keyword: Vocal tract characteristics

Effect of Music Training on Categorical Perception of Speech and Music

  • L., Yashaswini;Maruthy, Sandeep
    • Korean Journal of Audiology / v.24 no.3 / pp.140-148 / 2020
  • Background and Objectives: The aim of this study is to evaluate the effect of music training on the characteristics of auditory perception of speech and music. The perception of speech and music stimuli was assessed across their respective stimulus continua, and the resultant plots were compared between musicians and non-musicians. Subjects and Methods: Thirty musicians with formal music training and twenty-seven non-musicians participated in the study (age: 20 to 30 years). They were assessed for identification of consonant-vowel syllables (/da/ to /ga/), vowels (/u/ to /a/), vocal music notes (/ri/ to /ga/), and instrumental music notes (/ri/ to /ga/) across their respective stimulus continua. Each continuum contained 15 tokens with equal step size between adjacent tokens. The resultant identification scores were plotted against each token and analyzed for the presence of a categorical boundary. If a categorical boundary was found, the plots were analyzed on six parameters of categorical perception: the point of 50% crossover, the lower edge of the categorical boundary, the upper edge of the categorical boundary, phoneme boundary width, slope, and intercepts. Results: Overall, the results showed that both speech and music are perceived differently by musicians and non-musicians. In musicians, both speech and music are perceived categorically, while in non-musicians, only speech is perceived categorically. Conclusions: The findings of the present study indicate that music is perceived categorically by musicians, even if the stimulus is devoid of vocal tract features. The findings support the view that categorical perception is strongly influenced by training, and the results are discussed in light of the motor theory of speech perception.
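
The boundary parameters named in this abstract can be illustrated with a small sketch. The abstract does not say how the parameters were computed, so everything below is an assumption: the 25% and 75% crossings stand in for the lower and upper edges, the slope is that of the straight line through those two edges, and the function names and the score list are hypothetical.

```python
def crossing(scores, level):
    """Token position (1-based, fractional) where the curve first crosses
    `level`, found by linear interpolation; None if it never crosses."""
    for i in range(len(scores) - 1):
        lo, hi = scores[i], scores[i + 1]
        if lo != hi and (lo - level) * (hi - level) <= 0:
            return (i + 1) + (level - lo) / (hi - lo)
    return None


def boundary_parameters(scores, lower=25.0, upper=75.0):
    """Summarize an identification curve given as percent responses per token.

    The 25%/75% edge criteria are assumptions, not taken from the abstract.
    """
    x50 = crossing(scores, 50.0)
    x_lo = crossing(scores, lower)
    x_hi = crossing(scores, upper)
    if x50 is None or x_lo is None or x_hi is None or x_lo == x_hi:
        return None  # no categorical boundary found
    slope = (upper - lower) / (x_hi - x_lo)  # percent per token step
    return {
        "crossover": x50,
        "lower_edge": x_lo,
        "upper_edge": x_hi,
        "width": abs(x_hi - x_lo),
        "slope": slope,
        "intercept": 50.0 - slope * x50,  # line through the crossover point
    }


# Hypothetical identification scores (%) for a 15-token continuum.
example_scores = [0, 0, 0, 0, 0, 5, 20, 50, 80, 95, 100, 100, 100, 100, 100]
params = boundary_parameters(example_scores)
```

A flat identification curve, as might be expected when a continuum is not perceived categorically, yields `None` in this sketch because no crossing is found.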

Tube phonation in water for patients with hyperfunctional voice disorders: The effect of tube diameter and water immersion depth on bubble height and maximum phonation time (과기능적 음성장애 환자의 물저항발성: 튜브 직경과 물 깊이가 물거품 높이 및 최대발성지속시간에 미치는 영향)

  • Min Gyeong Kim;Seong Hee Choi;Jong-In Youn
    • Phonetics and Speech Sciences / v.15 no.2 / pp.31-40 / 2023
  • Tube phonation in water is a widely used semi-occluded vocal tract (SOVT) exercise for voice training, in which the patient phonates through a tube kept submerged in water, producing bubbles. This study investigates the effect of tube diameter and water depth on bubble height and maximum phonation time (MPT) in patients with hyperfunctional voice disorders. Seventeen patients with hyperfunctional voice disorders were asked to bubble while sustaining /u/ through tubes of different inner diameters (5, 7, and 10 mm) at different water depths (4, 7, and 10 cm). A water resistance phonation biofeedback system with a water height sensor was used to record bubble height and MPT. Bubble height changed significantly with tube diameter, while MPT changed significantly with both tube diameter and water depth. Although the wider tube produced a significantly lower bubble height at a given depth, bubble height remained relatively consistent. For a given tube diameter, bubble height did not differ significantly across water depths. In addition, MPT decreased significantly with water depth, and a wider tube led to a significantly shorter MPT. A water level-driven water resistance biofeedback system provided useful information on bubble characteristics and vocal fold vibration depending on tube diameter and water depth. It can be useful for monitoring breath support during water resistance phonation in patients with hyperfunctional voice disorders.

Voice Similarities between Sisters

  • Ko, Do-Heung
    • Speech Sciences / v.8 no.3 / pp.43-50 / 2001
  • This paper examines voice similarities between sisters, who are presumed to share physiological characteristics inherited from the same biological mother. Nine pairs of sisters believed to have similar voices participated in the experiment. The speech samples from one pair were excluded from the analysis because their perceptual score was relatively low. Words were measured both in isolation and in context, and the subjects were asked to read the text five times with an interval of about three seconds between readings. Recordings were made at natural speed in a quiet room. Pitch and formant frequencies were analyzed using CSL (Computerized Speech Lab) and PCQuirer. The data for initial vowels were found to be much more similar and homogeneous than those for vowels in other positions. The acoustic data showed that voice similarities are strikingly high in both pitch and formant frequencies. The statistical data obtained from this experiment may serve as a guideline for modelling speaker identification and speaker verification.

A Study on Korean, English and Japanese Speaker Recognitions Using the Peak and Valley Pitch Detection and the Fuzzy Theory (PVPF방법과 퍼지 이론을 이용한 한국어, 영어 및 일본어 화자 인식에 관한 연구)

  • Kim, Yeon-Suk
    • The Transactions of the Korea Information Processing Society / v.6 no.2 / pp.522-533 / 1999
  • This paper proposes a speaker recognition algorithm that combines a pitch parameter with fuzzy inference. It proposes a pitch detection method, PVPF (peak and valley pitch detection function), which compares spectra by exploiting the transform characteristics between the time and frequency domains. Reference patterns are built using membership functions, and vocal tract recognition of common characters is performed by fuzzy pattern matching in order to accommodate the width of time variation in non-linear utterance time.

Analysis of the Relationship Between Sasang Constitutional Groups and Speech Features Based on a Listening Evaluation of Voice Characteristics (목소리 특성의 청취 평가에 기초한 사상체질과 음성 특징의 상관관계 분석)

  • Kwon, Chulhong;Kim, Jongyeol;Kim, Keunho;Jang, Junsu
    • Phonetics and Speech Sciences / v.4 no.4 / pp.71-77 / 2012
  • Sasang constitution experts use voice characteristics as an auxiliary measure for deciding a person's constitutional group. This study aims to establish a relationship between speech features and the constitutional groups through subjective listening evaluation of voice characteristics. A speech database was constructed from 841 speakers whose constitutional groups had already been diagnosed by Sasang constitution experts. Speech features related to the speech source and the vocal tract filter were extracted from five vowels and one sentence. Statistically significant speech features for classifying the groups were analyzed using SPSS. The features that contributed to constitution classification were: speaking rate, Energy, A1, A2, A3, H1, H2, H4, and CPP for males in their 20s; F0_mean, CPP, SPI, HNR, Shimmer, Energy, A1, A2, A3, H1, H2, and H4 for females in their 20s; Energy, A1, A2, A3, H1, H2, H4, and CPP for males in their 60s; and Jitter, HNR, CPP, and SPI for females in their 60s. Experimental results show that speech technology is useful in classifying constitutional groups.

VOICE SOURCE ESTIMATION USING SEQUENTIAL SVD AND EXTRACTION OF COMPOSITE SOURCE PARAMETERS USING EM ALGORITHM

  • Hong, Sung-Hoon;Choi, Hong-Sub;Ann, Sou-Guil
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.893-898 / 1994
  • In this paper, the influence of voice source estimation and modeling on speech synthesis and coding is examined, and new estimation and modeling techniques are proposed and verified by computer simulation. Existing speech synthesizers are known to produce speech that sounds dull and lifeless. These problems arise because existing estimation and modeling techniques cannot provide sufficiently accurate voice parameters. Therefore, we propose a new voice source estimation algorithm and modeling techniques that can represent a variety of source characteristics. First, we divide the speech samples in one pitch region into four parts with different characteristics. Second, the vocal-tract parameters and voice source waveforms are estimated differently in each region using sequential SVD. Third, we propose a composite source model, a new voice source model represented by a weighted sum of pre-defined basis functions. Finally, the weights and time-shift parameters of the proposed composite source model are estimated using the EM (estimate-maximize) algorithm. Experimental results indicate that the proposed estimation and modeling methods can estimate voice source waveforms more accurately and represent various source characteristics.
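
The idea of a source waveform built as a weighted sum of time-shifted basis functions can be sketched as follows. This is only an illustration of the model form described in the abstract: the triangular basis pulse, the weights, and the shifts are hypothetical, and the EM estimation of those parameters is not shown.

```python
def triangular_pulse(n, half_width=4):
    """Hypothetical basis function: a unit triangle centred at n = 0."""
    return max(0.0, 1.0 - abs(n) / half_width)


def composite_source(weights, shifts, basis, length):
    """Evaluate sum_k weights[k] * basis[k](n - shifts[k]) for n = 0..length-1."""
    return [
        sum(w * b(n - s) for w, s, b in zip(weights, shifts, basis))
        for n in range(length)
    ]


# Two pulses: a positive one centred at sample 5, a negative one at sample 10.
waveform = composite_source(
    weights=[1.0, -0.5],
    shifts=[5, 10],
    basis=[triangular_pulse, triangular_pulse],
    length=16,
)
```

In a full system the weights and shifts would be fitted to an estimated source waveform rather than chosen by hand.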

Dysphagia Handicap Index and Swallowing Characteristics based on Laryngeal Functions in Korean Elderly (한국 정상 노인층의 삼킴장애지수와 후두 기능에 따른 삼킴 특성)

  • Kim, Geun-Hee;Choi, Seong Hee;Lee, Kyoung-Jae;Choi, Chul-Hee
    • Phonetics and Speech Sciences / v.6 no.3 / pp.3-12 / 2014
  • The larynx plays an important role in phonation and in protecting the respiratory tract during swallowing. Reduced anatomical and physiological function in laryngeal elevation and glottal closure can cause problems with voice and swallowing. The present study investigated the Korean version of the dysphagia handicap index in elderly Koreans. Sixty normal elderly Koreans aged 65 to 95 and 20 normal Korean young adults aged 20 to 25 participated in this study, which compared total (T), physical (P), functional (F), and emotional (E) index scores between the two groups as well as among subgroups (60s, 70s, 80s) of the elderly. For swallowing, total and sub dysphagia handicap index (DHI) scores, voice quality during /a/ phonation following swallowing (saliva and water), intensity of coughing, and L-DDK were measured. The results showed that functional (F), physical (P), and emotional (E) scores as well as the total (T) score differed significantly between young adults and older adults on the DHI (p<.05). Additionally, there was a negative correlation between total DHI score and intensity of coughing (r=-.51) as well as L-DDK (r=-.70). These findings suggest that a slow rate of vocal fold adduction and reduced intensity of coughing in the elderly affect swallowing function. Thus, the recently translated Korean version of the DHI may be useful as a supplement in evaluating swallowing problems in elderly people.

The Study on the Acoustical Characteristics and Speech Intelligibility of Vowels Produced by the Maxillectomized Patients before and after Obturator-Wearing (Palatal Cancer환자의 Obturator 장착전후 모음의 음향학적 특성과 말 명료도에 관한 연구)

  • 최성희;정문규;김호중;표화영;심현섭;최홍식
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.10 no.2 / pp.140-148 / 1999
  • The obturator is a prosthetic rehabilitation approach for restoring the shape and function of the defective maxilla in patients with a palatal defect. The obturator can change the shape of the vocal tract and nasality, but few reports on the effects of this change have been presented. The authors therefore performed an experimental study to compare the sizes of the vowel triangles produced by maxillectomized patients before and after obturator-wearing and to consider how much improvement in speech intelligibility can be expected from wearing an obturator. Eight patients who had undergone total maxillectomy due to palatal cancer participated as subjects. They produced 5 vowels (/a/, /i/, /u/, /e/, /o/) before and after obturator-wearing. The formants of the vowels were analyzed using the spectrogram of CSL, and speech intelligibility was judged by 8 normal listeners. The frequencies of the first and second formants showed no significant difference before and after wearing, but the sizes of the vowel triangles, which are related to speech intelligibility, differed significantly. The vowel triangle after wearing was larger than that before wearing. Before wearing, /i/ showed the lowest speech intelligibility score among the vowels. After wearing obturators, the scores increased on the whole, especially for /a/, but the intelligibility of /u/ decreased.
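
The size of a vowel triangle can be computed from formant values with the shoelace formula. The sketch below assumes the triangle is spanned by the corner vowels /a/, /i/, and /u/ in the F1-F2 plane; the abstract does not state which vowels were used, and the formant values here are hypothetical, not data from the study.

```python
def vowel_triangle_area(a, i, u):
    """Area (Hz^2) of the triangle whose corners are (F1, F2) pairs in Hz."""
    (x1, y1), (x2, y2), (x3, y3) = a, i, u
    # Shoelace formula for the area of a triangle from its vertices.
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0


# Hypothetical (F1, F2) values in Hz for one speaker.
area_before = vowel_triangle_area(a=(700, 1200), i=(350, 2000), u=(400, 1000))
area_after = vowel_triangle_area(a=(800, 1200), i=(300, 2300), u=(350, 900))
print(area_before, area_after)  # 155000.0 322500.0
```

A larger area means the corner vowels are spread further apart in formant space, which is the sense in which the triangle relates to intelligibility.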

Comparison of Adult and Child's Speech Recognition of Korean (한국어에서의 성인과 유아의 음성 인식 비교)

  • Yoo, Jae-Kwon;Lee, Kyoung-Mi
    • The Journal of the Korea Contents Association / v.11 no.5 / pp.138-147 / 2011
  • While most Korean speech databases have been developed for adults' speech rather than children's speech, various children's speech databases exist for other languages. Because children's and adults' speech differ widely in acoustic and linguistic characteristics, a children's speech database needs to be developed. In this paper, to find the differences between them in Korean, we built speech recognizers using HMMs and tested them according to gender, age, and the presence of VTLN (Vocal Tract Length Normalization). The results show that the speech recognizer built from children's speech has a much higher recognition rate than the one built from adults' speech, and that using VTLN helps to improve the recognition rate in Korean.
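
VTLN is commonly implemented as a warp of the frequency axis by a speaker-specific factor. The sketch below shows one widely used form, a piecewise-linear warp; the abstract does not describe which warping function or factor range was used, so the function name, the knee position, and the Nyquist frequency are assumptions. In this convention a factor below 1 lowers frequencies, as one might do when mapping the higher formants of a shorter vocal tract toward an adult model.

```python
def warp_frequency(f, alpha, f_nyquist=8000.0, knee=0.85):
    """Piecewise-linear VTLN warp of a frequency f in Hz.

    Below the knee the frequency is scaled by alpha; above it the warp is
    interpolated linearly so that f_nyquist still maps to f_nyquist.
    The mapping stays monotonic as long as alpha < 1 / knee.
    """
    f0 = knee * f_nyquist
    if f <= f0:
        return alpha * f
    return alpha * f0 + (f_nyquist - alpha * f0) * (f - f0) / (f_nyquist - f0)


# Warp a few filterbank centre frequencies with a hypothetical factor of 0.9.
centres = [500.0, 1000.0, 2000.0, 4000.0, 7000.0, 8000.0]
warped = [warp_frequency(f, alpha=0.9) for f in centres]
```

In practice the factor is usually chosen per speaker, for example by trying a small grid of values and keeping the one that gives the best recognizer likelihood.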

A Proposition of the Fuzzy Correlation Dimension for Speaker Recognition (화자인식을 위한 퍼지상관차원 제안)

  • Yoo, Byong-Wook;Kim, Chang-Seok;Park, Hyun-Sook
    • Journal of the Korean Institute of Telematics and Electronics S / v.36S no.1 / pp.115-122 / 1999
  • In this paper, we confirm that the speech signal is a chaotic signal and analyze its chaos dimension for use as a speaker recognition parameter. To improve speaker identification and pattern recognition, we propose the fuzzy correlation dimension, obtained by constructing a strange attractor that captures an individual's vocal tract characteristics well and applying a fuzzy membership function to the correlation dimension. By estimating the correlation among the points of the attractor, which are limited according to the space dimension value, the fuzzy correlation dimension absorbs the variation between the reference pattern attractor and the test pattern attractor. We investigated the validity of the fuzzy correlation dimension as a speaker recognition parameter by estimating the distance based on the average discrimination error for each speaker and reference pattern.
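
For orientation, the ordinary (non-fuzzy) correlation dimension can be estimated from a delay-embedded signal with the Grassberger-Procaccia correlation sum. The sketch below shows that standard estimate only. The optional `membership` argument is a hypothetical hook marking where a fuzzy membership function could replace the hard distance threshold; the abstract does not give the authors' actual membership function, and all names and parameter values here are assumptions.

```python
import math


def delay_embed(signal, dim, delay):
    """Reconstruct attractor points (x[n], x[n+delay], ...) from a 1-D signal."""
    n_points = len(signal) - (dim - 1) * delay
    return [
        tuple(signal[n + k * delay] for k in range(dim))
        for n in range(n_points)
    ]


def correlation_sum(points, r, membership=None):
    """Fraction of point pairs counted as closer than radius r."""
    if membership is None:
        # Hard threshold (Heaviside step) of the standard definition.
        membership = lambda d, radius: 1.0 if d < radius else 0.0
    n = len(points)
    total = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            total += membership(math.dist(points[i], points[j]), r)
    return 2.0 * total / (n * (n - 1))


def correlation_dimension(points, r1, r2, membership=None):
    """Slope of log C(r) against log r between two radii r1 < r2."""
    c1 = correlation_sum(points, r1, membership)
    c2 = correlation_sum(points, r2, membership)
    return (math.log(c2) - math.log(c1)) / (math.log(r2) - math.log(r1))


# Sanity check on a set that is known to be one-dimensional: a line segment.
line = [(k / 400.0, 0.0) for k in range(400)]
estimate = correlation_dimension(line, r1=0.021, r2=0.081)
```

For the line segment the estimate comes out close to 1, as expected for a one-dimensional set; for a speech attractor the radii would need to be chosen inside the scaling region of the log-log curve.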
