• Title/Summary/Keyword: Vocal Tract

A Rare Case of Acute Obstructive Laryngitis in a Cat with Severe Respiratory Distress

  • Hyeona Bae;Dongbin Lee;DoHyeon Yu
    • Journal of Veterinary Clinics / v.40 no.2 / pp.124-129 / 2023
  • A 5-year-old neutered male domestic short-haired cat presented with a 2-day history of acute dyspnea characterized by open-mouth breathing and stridor. Direct visualization via laryngoscopy revealed diffuse laryngeal swelling and severe bilateral thickening of the vocal folds; the upper respiratory tract was thus obstructed by severe edema. Cytology of a fine-needle aspirate of the larynx showed neutrophil infiltration, and diagnostic imaging identified no discrete mass such as a polyp or neoplasia. The cat was diagnosed with acute obstructive laryngitis, and a tracheostomy tube was placed immediately. After 17 days of treatment with steroids, doxycycline, and azithromycin, the swollen larynx gradually improved, and there was no recurrence of laryngitis or respiratory obstruction. A feline upper respiratory polymerase chain reaction panel revealed Mycoplasma felis infection; however, it could not be determined whether the organism was pathogenic or opportunistic. Herein, we report a case of obstructive laryngitis in a cat. When respiratory obstruction due to acute laryngitis is identified, a good prognosis can be expected with rapid and appropriate treatment.

Comparison of Korean Speech De-identification Performance of Speech De-identification Model and Broadcast Voice Modulation (음성 비식별화 모델과 방송 음성 변조의 한국어 음성 비식별화 성능 비교)

  • Seung Min Kim;Dae Eol Park;Dae Seon Choi
    • Smart Media Journal / v.12 no.2 / pp.56-65 / 2023
  • In broadcasts such as news and coverage programs, voices are modulated to protect the identity of informants. Adjusting the pitch is a commonly used voice modulation method, but the original voice can easily be restored by adjusting the pitch back. Because broadcast voice modulation therefore cannot adequately protect the speaker's identity and is weak in terms of security, a new voice modulation method is needed to replace it. In this paper, taking the Lightweight speech de-identification model as the evaluation target, we compare its speech de-identification performance with that of broadcast voice modulation based on pitch modulation. Of the six modulation methods in the Lightweight speech de-identification model, three (McAdams, Resampling, and Vocal Tract Length Normalization (VTLN)) were evaluated on Korean speech against broadcast voice modulation, using a human listening test and an Equal Error Rate (EER) test. The experimental results show that the VTLN modulation method achieved higher de-identification performance in both the human tests and the EER tests. The modulation methods of the Lightweight model therefore have sufficient de-identification performance for Korean speech and could replace the security-weak broadcast voice modulation.
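
The EER test mentioned in this abstract can be illustrated with a minimal sketch. It assumes a speaker-verification system that outputs one similarity score per trial and accepts a trial when the score reaches a threshold. The function name `equal_error_rate` and the example scores are hypothetical and are not taken from the paper. In a de-identification setting, a higher EER for the verifier generally indicates that the speaker is harder to re-identify.

```python
def equal_error_rate(genuine_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when score >= threshold (an assumption of this sketch).
    FAR = fraction of impostor trials accepted.
    FRR = fraction of genuine trials rejected.
    Both input lists must be non-empty.
    """
    thresholds = sorted(set(genuine_scores) | set(impostor_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Report the midpoint of FAR and FRR where the two are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical verifier scores: same-speaker trials vs. different-speaker trials.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(genuine, impostor))
```

With only a handful of scores the sweep is coarse; with realistic trial counts the point where FAR and FRR cross is resolved much more finely.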

Intonation Conversion using the Other Speaker's Excitation Signal (他話者의 勵起信號를 이용한 抑揚變換)

  • Lee, Ki-Young;Choi, Chang-Seok;Choi, Kap-Seok;Lee, Hyun-Soo
    • The Journal of the Acoustical Society of Korea / v.14 no.4 / pp.21-28 / 1995
  • In this paper, an intonation conversion method is presented as a basic study on converting original speech into artificially intoned speech. The method uses the other speaker's excitation signals as intonation information and the original vocal tract spectra, warped to the other speaker's by DTW, as vocal features; the intonation-converted speech signal is synthesized through the short-time inverse Fourier transform (STIFT) of their product. To evaluate the intonation-converted speech, we collect Korean single vowels and sentences spoken by 30 males and compare fundamental frequency contours, spectrograms, distortion measures, and MOS test results between the original and converted speech. The results show that this method can convert speech into speech carrying the other speaker's intonation.
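
The DTW alignment step named in this abstract can be sketched generically as follows. This is a textbook dynamic time warping routine, not the authors' implementation: the function name `dtw_path`, the scalar example sequences, and the absolute-difference distance are assumptions made for illustration. For spectral frames, `dist` would be replaced by a distance between feature vectors.

```python
def dtw_path(a, b, dist=lambda x, y: abs(x - y)):
    """Align two non-empty sequences with classic dynamic time warping.

    Returns (total_cost, path), where path is a list of (i, j) index pairs
    mapping frames of `a` to frames of `b`.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal accumulated distance aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])

    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [
            (cost[i - 1][j - 1], i - 1, j - 1),  # diagonal step
            (cost[i - 1][j], i - 1, j),          # advance in a only
            (cost[i][j - 1], i, j - 1),          # advance in b only
        ]
        _, i, j = min(candidates)
        path.append((i - 1, j - 1))
    path.reverse()
    return cost[n][m], path


# Hypothetical feature tracks of different lengths.
total, path = dtw_path([1, 2, 3], [1, 2, 2, 3])
print(total, path)
```

The returned path tells which frame of one utterance corresponds to which frame of the other, which is the information needed to time-warp one speaker's spectra onto the other speaker's timeline.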

The Acoustic Changes of Voice after Uvulopalatopharyngoplasty (구개인두성형술 후 음성의 음향학적 변화)

  • Hong, K.H.;Kim, S.W.;Yoon, H.W.;Cho, Y.S.;Moon, S.H.;Lee, S.H.
    • Speech Sciences / v.8 no.2 / pp.23-37 / 2001
  • The primary sound produced by vibration of the vocal folds reaches the velopharyngeal isthmus and is directed both nasally and orally. The proportion of each component is determined by the anatomical and functional status of the soft palate. Oral sounds comprise oral vowels and consonants, shaped by the status of the vocal tract, tongue, palate, and lips. Nasal sounds comprise nasal consonants and nasal vowels and are further modified by the status of the nasal airway, so anatomical abnormalities in the nasal cavity influence nasal sound. The measurement of nasal sounds in speech has relied on subjective scoring by listeners. Nasal sounds are described in terms of nasality and nasalization. Nasality has generally been assessed perceptually when evaluating the effect of maxillofacial procedures for cleft palate, sleep apnea, snoring, and nasal disorders, whereas nasalization is considered an acoustic phenomenon. Snoring and sleep apnea are typical disorders caused by excess velopharyngeal tissue. Sleep apnea is defined as a cessation of breathing for at least 10 seconds during sleep. Several medical and surgical methods for treating sleep apnea have been attempted. Uvulopalatopharyngoplasty (UPPP) involves removal of 1.0 to 3.0 cm of soft palate tissue together with redundant oropharyngeal mucosa and lateral tissue from the anterior and sometimes posterior faucial pillars. The procedure results in a shortened soft palate, and a possible risk following surgery is velopharyngeal malfunction due to the shortened palate. Few researchers have systematically studied the effects of this surgery on speech production. Some changes in voice quality, such as in resonance (nasality), articulation, and phonation, have been reported. In view of the conflicting reports, there remains some uncertainty about speech status in patients following surgery for snoring and sleep apnea.
The study was conducted in two phases: 1) acoustic analysis of oral and nasal sounds, and 2) evaluation of nasality.

Emotion Recognition Based on Frequency Analysis of Speech Signal

  • Sim, Kwee-Bo;Park, Chang-Hyun;Lee, Dong-Wook;Joo, Young-Hoon
    • International Journal of Fuzzy Logic and Intelligent Systems / v.2 no.2 / pp.122-126 / 2002
  • In this study, we identify features of three emotions (happiness, anger, surprise) as basic research for emotion recognition. An emotional speech signal has several elements, such as voice quality, pitch, formants, and speech rate. Until now, most researchers have used changes in pitch, the short-time average power envelope, or mel-based speech power coefficients. Pitch is a very efficient and informative feature, so we use it in this study. Because pitch is very sensitive to subtle emotion, it changes readily whenever a person's emotional state changes. We can therefore observe whether the pitch changes steeply, changes with a gentle slope, or does not change. This paper also extracts formant features from emotional speech signals. For each vowel, the formants occupy similar positions without large differences. Based on this fact, in the pleasure case, we extract features of laughter and use them to separate laughter, which simplifies the task. We also find the corresponding features for anger and surprise.

On a Pitch Alteration Technique in the V/UV Spectrum for High Quality Speech Synthesis Technique (고음질 합성방식용 V/UV 스펙트럼상의 피치변경법에 관한 연구)

  • Jo, Wang-Rae;Bae, Myung-Jin;Kim, Dong-Sung
    • The Journal of the Acoustical Society of Korea / v.15 no.6 / pp.99-103 / 1996
  • Most waveform coding techniques attempt to reduce the redundancy of the speech signal while preserving the shape of the waveform. In speech synthesis, waveform coding methods are used in synthesis by rule to obtain high-quality speech. However, it is difficult to apply waveform coding to synthesis by rule because the parameters of waveform coding cannot be classified as either excitation or vocal tract parameters. The proposed method shows little spectrum distortion, 2.7% or less for 50% pitch changes. It also achieves smooth connection of waveform magnitudes across frames by compensating the phase in the time domain.

Voice Personality Transformation Using a Multiple Response Classification and Regression Tree (다중 응답 분류회귀트리를 이용한 음성 개성 변환)

  • 이기승
    • The Journal of the Acoustical Society of Korea / v.23 no.3 / pp.253-261 / 2004
  • In this paper, a new voice personality transformation method is proposed, which modifies speaker-dependent feature variables in speech signals. The proposed method takes the cepstrum vectors and pitch as the transformation parameters, which represent the vocal tract transfer function and excitation signals, respectively. To transform these parameters, a multiple response classification and regression tree (MR-CART) is employed. MR-CART is a vector extension of the conventional CART, whose response is given in vector form. We evaluated the performance of the proposed method by comparing it with a previously proposed codebook mapping method. We also quantitatively analyzed the voice transformation performance and the complexities according to various observations. In experiments with 4 speakers, the proposed method objectively outperformed the conventional codebook mapping method, and we also observed that the transformed speech sounds closer to the target speech.

Efficient Tracking of Speech Formant Using Closed Phase WRLS-VFF-VT Algorithm

  • Lee, Kyo-Sik;Park, Kyu-Sik
    • The Journal of the Acoustical Society of Korea / v.19 no.2E / pp.8-13 / 2000
  • In this paper, we present an adaptive formant tracking algorithm for speech using the closed phase WRLS-VFF-VT method. The pitch-synchronous closed phase method is known to give more accurate estimates of the vocal tract parameters than the pitch-asynchronous method. However, the use of pitch-synchronous closed phase analysis has been limited by the difficulty of accurately isolating the closed phase region in successive periods of speech. We have therefore implemented the pitch-synchronous closed phase WRLS-VFF-VT algorithm for speech analysis, especially for formant tracking. The proposed algorithm with the variable threshold (VT) provides superior performance at the boundaries of phones and of voiced/unvoiced sounds. The proposed method is experimentally compared with other methods, such as the two-channel CPC method, using synthetic waveforms and real speech data. From the experimental results, we found that block data processing techniques, such as the two-channel CPC, gave reasonable estimates of the formants/antiformants. However, the data windows used by these methods included the effects of the periodic excitation pulses, which affected the accuracy of the estimated formants. In contrast, the proposed WRLS-VFF-VT method, which eliminates the influence of the pulse excitation by using input estimation as part of the algorithm, gave very accurate formant/bandwidth estimates and good spectral matching.

A Link between Perceived and Produced Vowel Spaces of Korean Learners of English (한국인 영어학습자의 지각 모음공간과 발화 모음공간의 연계)

  • Yang, Byunggon
    • Phonetics and Speech Sciences / v.6 no.3 / pp.81-89 / 2014
  • Korean learners of English tend to have difficulty perceiving and producing English vowels. The purpose of this study is to examine a link between the perceived and produced vowel spaces of Korean learners of English. Sixteen Korean male and female participants perceived two sets of synthetic English vowels on a computer monitor and rated their naturalness. The same participants produced English vowels in a carrier sentence with high and low pitch variation in a clear speaking mode. The author compared the perceived and produced vowel spaces in terms of the pitch and gender variables. Results showed that the perceived vowel spaces were not significantly different for either variable. Korean learners perceived the vowels similarly. They differentiated neither the tense-lax vowel pairs nor the low vowels. Secondly, the produced vowel spaces of the male and female groups showed a 25% difference, which may have come from physiological differences in vocal tract length. Thirdly, the comparison of the perceived and produced vowel spaces revealed that although the vowel space patterns of the Korean male and female learners appeared similar, which may suggest a relative link between perception and production, statistical differences existed in some vowels because of the acoustical properties of the synthetic vowels, which may suggest an independent link. The author concluded that any comparison between the perceived and produced vowel spaces of nonnative speakers should be made cautiously. Further studies would be desirable to examine how Koreans would perceive different sets of synthetic vowels.
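
Why vocal tract length might scale a produced vowel space can be illustrated with the standard uniform-tube approximation, in which a tube closed at the glottis and open at the lips resonates at odd multiples of c/(4L). The sketch below is not from the paper: the function name `tube_formants`, the speed of sound of 350 m/s, and the two tract lengths (17.5 cm and 14 cm) are round values assumed purely for illustration.

```python
def tube_formants(length_m, n_formants=3, speed_of_sound=350.0):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other.

    F_k = (2k - 1) * c / (4 * L), the quarter-wavelength resonator formula.
    """
    return [(2 * k - 1) * speed_of_sound / (4.0 * length_m)
            for k in range(1, n_formants + 1)]


# Assumed, illustrative tract lengths (not measurements from the study).
longer = tube_formants(0.175)
shorter = tube_formants(0.14)
for f_long, f_short in zip(longer, shorter):
    print(round(f_long), round(f_short), round(f_short / f_long, 2))
```

In this idealized model every resonance scales inversely with tube length, so a tract that is 20% shorter raises all formants by 25%. Real vowels depart from the uniform tube, so this only indicates the direction and rough size of the effect.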

Dysphagia Handicap Index and Swallowing Characteristics based on Laryngeal Functions in Korean Elderly (한국 정상 노인층의 삼킴장애지수와 후두 기능에 따른 삼킴 특성)

  • Kim, Geun-Hee;Choi, Seong Hee;Lee, Kyoung-Jae;Choi, Chul-Hee
    • Phonetics and Speech Sciences / v.6 no.3 / pp.3-12 / 2014
  • The larynx plays an important role in phonation and in protecting the respiratory tract during swallowing. Reduced anatomical and physiological function in laryngeal elevation and glottal closure can cause voice and swallowing problems. The present study investigated the Korean version of the Dysphagia Handicap Index (DHI) in elderly Koreans. Sixty normal elderly Koreans aged 65 to 95 and 20 normal young Korean adults aged 20 to 25 participated in this study, which compared total (T), physical (P), functional (F), and emotional (E) index scores between the two groups as well as among subgroups (60s, 70s, 80s) of the elderly. For swallowing, total and sub-scale DHI scores, voice quality during /a/ phonation following swallowing (saliva and water), intensity of coughing, and L-DDK were measured. The results showed that the functional (F), physical (P), and emotional (E) scores as well as the total (T) score differed significantly between young adults and older adults in the DHI (p<.05). Additionally, there was a negative correlation between total DHI score and intensity of coughing (r=-.51) as well as L-DDK (r=-.70). These findings suggest that a slow rate of vocal fold adduction and reduced coughing intensity in the elderly affect swallowing function. Thus, the recently translated Korean version of the DHI may be useful as a supplement in evaluating swallowing problems in elderly people.