• Title/Summary/Keyword: Speech intelligibility

A Study on the Speech Intelligibility of Voice Disordered Patients according to the Severity and Utterance Level (음성장애의 중증도와 발화 수준에 따른 말 명료도의 변화 연구)

  • Pyo, Hwa-Young
    • Speech Sciences / v.15 no.2 / pp.101-110 / 2008
  • The purpose of this study was to investigate the speech intelligibility of patients with voice disorders as a function of severity and utterance level. Twelve patients were divided by severity into three groups (G1, G2, and G3). Words, phrases, and sentences produced by the speakers were judged by four listeners with normal hearing, and the intelligibility scores of the three groups were compared. Speech intelligibility decreased as severity increased, and the difference was statistically significant. However, the mean difference among words, phrases, and sentences was not significant, and intelligibility did not vary with utterance level in any regular pattern.

Comparison of Speech Rate and Long-Term Average Speech Spectrum between Korean Clear Speech and Conversational Speech

  • Yoo, Jeeun;Oh, Hongyeop;Jeong, Seungyeop;Jin, In-Ki
    • Journal of Audiology & Otology / v.23 no.4 / pp.187-192 / 2019
  • Background and Objectives: Clear speech is an effective communication strategy for difficult listening situations that draws on techniques such as accurate articulation, a slow speech rate, and the inclusion of pauses. Although excessively slow speech and improperly amplified spectral information can degrade overall speech intelligibility, amplitude increments of 1 to 3 dB in the mid-frequency bands and speech rates around 50% slower than those of conversational speech have been reported as factors that improve speech intelligibility. The purpose of this study was to identify whether amplitude increments in the mid-frequency bands and slower speech rates are evident in Korean clear speech as they are in English clear speech. Subjects and Methods: To compare the acoustic characteristics of the two methods of speech production, the voices of 60 participants were recorded during conversational speech and then again during clear speech using standardized sentence material. Results: The speech rate and long-term average speech spectrum (LTASS) were analyzed and compared. Speech rates for clear speech were slower than those for conversational speech. Increased amplitudes in the mid-frequency bands were evident in the LTASS of clear speech. Conclusions: The observed differences in acoustic characteristics between the two types of speech production suggest that Korean clear speech can be an effective communication strategy to improve speech intelligibility.

The Study on the Acoustical Characteristics and Speech Intelligibility of Vowels Produced by the Maxillectomized Patients before and after Obturator-Wearing (Palatal Cancer환자의 Obturator 장착전후 모음의 음향학적 특성과 말 명료도에 관한 연구)

  • 최성희;정문규;김호중;표화영;심현섭;최홍식
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.10 no.2 / pp.140-148 / 1999
  • An obturator is a prosthetic rehabilitation approach for restoring the shape and function of the maxilla in patients with a palatal defect. The obturator can change the shape of the vocal tract and nasality, but few reports on the effects of this change have been presented. The authors therefore performed an experimental study to compare the sizes of the vowel triangles produced by maxillectomized patients before and after obturator wearing and to consider how much improvement in speech intelligibility can be expected from wearing an obturator. Eight patients who had undergone total maxillectomy due to palatal cancer participated as subjects. They produced five vowels (/a/, /i/, /u/, /e/, /o/) before and after obturator wearing. The formants of the vowels were analyzed using the spectrogram of CSL, and their speech intelligibility was judged by eight normal listeners. The frequencies of the first and second formants showed no significant difference before and after wearing, but the sizes of the vowel triangles, which are related to speech intelligibility, differed significantly: the vowel triangle after wearing was larger than that before wearing. Before wearing, /i/ showed the lowest speech intelligibility score among the vowels. After wearing obturators, scores increased on the whole, especially for /a/, but the intelligibility of /u/ decreased.
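
The vowel-triangle comparison in this abstract can be illustrated with a minimal sketch. It assumes the triangle is spanned by the corner vowels /a/, /i/, /u/ in the (F1, F2) plane and that its size is the plain polygon area from the shoelace formula; the abstract does not state how the authors computed it. The formant values are hypothetical placeholders, not data from the study.

```python
def vowel_triangle_area(a, i, u):
    """Area (Hz^2) of the triangle spanned by /a/, /i/, /u/.

    Each argument is an (F1, F2) pair in Hz. Uses the shoelace formula.
    """
    (f1a, f2a), (f1i, f2i), (f1u, f2u) = a, i, u
    return 0.5 * abs(
        f1i * (f2a - f2u) + f1a * (f2u - f2i) + f1u * (f2i - f2a)
    )


# Hypothetical formant values (Hz), for illustration only.
before = vowel_triangle_area(a=(700, 1300), i=(350, 2000), u=(400, 1000))
after = vowel_triangle_area(a=(800, 1300), i=(300, 2300), u=(350, 800))

print(before, after)
print("larger after wearing:", after > before)
```

A larger area means the corner vowels are further apart acoustically, which is the property the abstract relates to intelligibility.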

Non-Intrusive Speech Intelligibility Estimation Using Autoencoder Features with Background Noise Information

  • Jeong, Yue Ri;Choi, Seung Ho
    • International Journal of Internet, Broadcasting and Communication / v.12 no.3 / pp.220-225 / 2020
  • This paper investigates non-intrusive speech intelligibility estimation in noisy environments when the bottleneck feature of an autoencoder is used as the input to a neural network. The bottleneck feature-based method suffers from severe performance degradation when the noise environment changes. To overcome this problem, we propose a novel non-intrusive speech intelligibility estimation method that adds noise environment information, along with the bottleneck feature, to the input of a long short-term memory (LSTM) neural network. The network output is a short-time objective intelligibility (STOI) score, a standard intrusive measure of speech intelligibility that requires a reference speech signal. In experiments in various noise environments, the proposed method showed improved performance when the noise environment was the same. In particular, performance was significantly improved over that of the conventional methods in different environments. We therefore conclude that the proposed method can be successfully used for non-intrusive speech intelligibility estimation in various noise environments.

Voice Activity Detection Based on SNR and Non-Intrusive Speech Intelligibility Estimation

  • An, Soo Jeong;Choi, Seung Ho
    • International Journal of Internet, Broadcasting and Communication / v.11 no.4 / pp.26-30 / 2019
  • This paper proposes a new voice activity detection (VAD) method based on SNR and non-intrusive speech intelligibility estimation. In conventional SNR-based VAD methods, the voice activity probability is obtained by estimating the frame-wise SNR at each spectral component. However, these methods perform poorly in various noisy environments. We devise a hybrid VAD method that uses non-intrusive speech intelligibility estimation as well as SNR estimation, where the speech intelligibility score is estimated by a deep neural network. To train the model parameters of the deep neural network, we use the MFCC vector as input and the intrusive speech intelligibility score STOI (short-time objective intelligibility) as output. We developed a speech presence measure that classifies each noisy frame as voice or non-voice by calculating the weighted average of the estimated STOI value and the conventional SNR-based VAD value at each frame. Experimental results show that the proposed method performs better than the conventional VAD method in various noisy environments, especially when the SNR is very low.
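
The frame-level decision described above, a weighted average of the estimated STOI value and the SNR-based VAD value, might look roughly like the sketch below. The weight, the threshold, and the function names are assumptions made for illustration; the abstract does not give the values or the exact decision rule the authors used.

```python
def speech_presence(stoi_est, snr_vad, weight=0.5):
    """Weighted average of an estimated STOI score and an SNR-based VAD value.

    `weight` is the share given to the STOI estimate (assumed, not from the paper).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return weight * stoi_est + (1.0 - weight) * snr_vad


def classify_frames(stoi_scores, snr_vad_values, weight=0.5, threshold=0.5):
    """Label each frame True (voice) or False (non-voice).

    `threshold` is a hypothetical decision boundary.
    """
    return [
        speech_presence(s, v, weight) >= threshold
        for s, v in zip(stoi_scores, snr_vad_values)
    ]


# Hypothetical per-frame scores in [0, 1].
frames = classify_frames([0.9, 0.2, 0.6], [0.7, 0.1, 0.3])
print(frames)
```

In practice the weight and threshold would be tuned on held-out noisy data; the point here is only the shape of the combination.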

Speech Intelligibility of Alaryngeal Voices and Pre/Post Operative Evaluation of Voice Quality using the Speech Recognition Program(HUVOIS) (음성인식프로그램을 이용한 무후두 음성의 말 명료도와 병적 음성의 수술 전후 개선도 측정)

  • Kim, Han-Su;Choi, Seong-Hee;Kim, Jae-In;Lee, Jae-Yol;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.15 no.2 / pp.92-97 / 2004
  • Background and Objectives: The purpose of this study was to objectively examine pre- and postoperative voice quality and the intelligibility of alaryngeal voice using the speech recognition program HUVOIS. Materials and Methods: Two laryngologists and one speech pathologist evaluated 'G', 'R', and 'B' on the GRBAS scale and speech intelligibility on the NTID rating scale from a standard paragraph. Acoustic estimates such as jitter, shimmer, and HNR were also obtained with Lx Speech Studio. Results: The speech recognition rate did not differ significantly between pre- and postoperative pathological voice samples, although voice quality (G, B) and acoustic values (jitter, HNR) improved significantly after the operation. Among alaryngeal voices, the reed-type electrolarynx 'Moksori' showed the highest speech intelligibility and speech recognition rate, whereas esophageal speech showed the lowest. A correlation between speech intelligibility and speech recognition rate was found in alaryngeal voices but not in pathological voices. Conclusion: The current study did not show that the speech recognition program HUVOIS, used in a telephone program, is an objective and efficient method for assisting the subjective GRBAS scale.

Effects of oral-motor function on PCC and intelligibility in children with Down's syndrome and typically developing children (다운증후군아동과 일반아동의 구강운동기능이 자음정확도 및 말명료도에 미치는 영향)

  • Kang, Eunhye;Sim, Hyunsub
    • Phonetics and Speech Sciences / v.9 no.2 / pp.125-135 / 2017
  • The current study examines PCC (percentage of consonants correct), speech intelligibility, and oral motor function in a group of typically developing children and a group of children with Down's syndrome. Fifteen children with Down's syndrome (mean CA: 9;7) and 15 typically developing children matched on receptive language age were given the following tests: K-WPPSI (2001), Picture Vocabulary Test (Kim et al., 1995), Oral and Speech Motor Control Protocol for the total oral functional score (Robbins et al., 1987), DDK, and Assessment of Phonology and Articulation for Children (APAC, Kim et al., 2007) for PCC and speech intelligibility. Pearson correlation coefficients were computed for the total oral functional score, PCC, and DDK of each group. The statistical analysis showed no significant difference in total functional score and DDK when IQ was controlled. In the Down's syndrome group, the total oral functional score correlated significantly with PCC and with intelligibility whether or not IQ was controlled. The findings suggest that both cognitive ability and overall oral motor function need to be considered in interventions to enhance the PCC or speech intelligibility of children with Down's syndrome.
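
As a rough illustration of the two quantities this abstract relies on, the sketch below computes PCC as the share of target consonants produced correctly and a plain Pearson correlation coefficient. The function names and all numbers are hypothetical; the sketch does not reproduce the IQ-controlled analysis, which would need a partial correlation.

```python
from math import sqrt


def pcc(correct_consonants, total_consonants):
    """Percentage of consonants correct."""
    if total_consonants <= 0:
        raise ValueError("total_consonants must be positive")
    return 100.0 * correct_consonants / total_consonants


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical per-child scores, for illustration only.
oral_scores = [60, 72, 55, 80, 68]
pcc_scores = [pcc(c, 43) for c in (30, 36, 27, 40, 33)]
print(round(pearson_r(oral_scores, pcc_scores), 3))
```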

Syllable-Type-Based Phoneme Weighting Techniques for Listening Intelligibility in Noisy Environments (소음 환경에서의 명료한 청취를 위한 음절형태 기반 음소 가중 기술)

  • Lee, Young Ho;Joo, Jong Han;Choi, Seung Ho
    • Phonetics and Speech Sciences / v.6 no.3 / pp.165-169 / 2014
  • Intelligibility of speech transmitted to listeners can be significantly degraded by ambient noise in noisy environments such as auditoriums and train stations. Noise-masked speech is hard for listeners to recognize. Among the conventional methods for improving speech intelligibility, the consonant-vowel intensity ratio (CVR) approach reinforces the power of all consonants. However, excessively reinforced consonants do not help recognition, and only some consonants are improved by the CVR approach. In this paper, we propose the corrective weighting (CW) approach, which reinforces the power of consonants differently according to the Korean syllable type, such as consonant-vowel-consonant (CVC), consonant-vowel (CV), and vowel-consonant (VC), considering the level of listeners' recognition. Evaluated with the Comparison Category Rating (CCR) subjective test of ITU-T P.800, the proposed CW approach performed better, scoring 0.18 and 0.24 higher than the unprocessed speech and the CVR approach, respectively.

Intelligibility Analysis on the Eavesdropping Sound of Glass Windows Using MTF-STI (MTF-STI를 이용한 유리창 도청음의 명료도 분석)

  • Kim, Hee-Dong;Kim, Yoon-Ho;Kim, Seock-Hyun
    • The Journal of the Acoustical Society of Korea / v.26 no.1 / pp.8-15 / 2007
  • Speech intelligibility of eavesdropped sound is investigated for a coupled acoustic cavity and glass window system. Using an MLS (Maximum Length Sequence) signal as the sound source, the acceleration and velocity responses of the glass window are measured with an accelerometer and a laser Doppler vibrometer. The MTF (Modulation Transfer Function) is used to identify the speech transmission characteristics of the cavity and window system. The STI (Speech Transmission Index) based on the MTF is calculated, and the speech intelligibility of the vibration sound of the glass window is estimated. The speech intelligibility obtained from the acceleration signal is compared with that from the velocity signal. Finally, the intelligibility of the conversation sound is confirmed by a subjective test.
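
The step from MTF to STI can be sketched in simplified form. Each modulation index m is converted to an apparent signal-to-noise ratio, clipped to ±15 dB, and mapped to a transmission index between 0 and 1; the indices are then averaged. The equal band weighting used here is an assumption made for brevity: the standardized STI procedure uses specific octave-band weights and further corrections, and the abstract does not say which variant the authors applied. The MTF values are hypothetical.

```python
import math


def transmission_index(m):
    """Map one modulation index m (0..1) to a transmission index (0..1)."""
    m = min(max(m, 1e-6), 1.0 - 1e-6)          # avoid log of 0 or division by 0
    snr_app = 10.0 * math.log10(m / (1.0 - m))  # apparent SNR in dB
    snr_app = max(-15.0, min(15.0, snr_app))    # clip to the +/-15 dB range
    return (snr_app + 15.0) / 30.0


def simple_sti(mtf, band_weights=None):
    """Simplified STI from an MTF given as one list of m-values per octave band.

    With band_weights=None the bands are weighted equally (an assumption).
    """
    band_mti = [sum(transmission_index(m) for m in band) / len(band) for band in mtf]
    if band_weights is None:
        band_weights = [1.0 / len(band_mti)] * len(band_mti)
    return sum(w * mti for w, mti in zip(band_weights, band_mti))


# Hypothetical MTF: three octave bands, three modulation frequencies each.
mtf = [[0.9, 0.8, 0.7], [0.6, 0.5, 0.4], [0.3, 0.2, 0.1]]
print(round(simple_sti(mtf), 3))
```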

The effect of articulation therapy using visual phonics to improve the speech intelligibility and vowel space of children with impaired hearing (비주얼파닉스를 활용한 조음중재가 청각장애아동의 말 명료도와 모음공간에 미치는 영향)

  • Shim, Hee-Jeong;Lee, Hyo-Joo;Seo, Chang-Won
    • Phonetics and Speech Sciences / v.10 no.2 / pp.85-96 / 2018
  • The purpose of this study was to investigate the effect of articulatory intervention using visual phonics on the speech intelligibility of children with impaired hearing. The subjects were five children with impaired hearing. Based on the results of the UTAP articulation test, the five phonemes with the most frequent errors were selected for each child, and a total of 10 sessions were provided. Vowel space and related measures (vowel space area, vowel articulatory index, formant centralization ratio, and F2i/F2u ratio) were analyzed before and after the visual phonics intervention. After the articulation intervention, every child's speech intelligibility improved, their vowel space area widened, the FCR value decreased, and the F2i/F2u ratio increased. These results show that visual phonics, through symbolic images and hand cues, has a positive effect on the speech intelligibility of children with impaired hearing.
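
The formant-based measures named in this abstract can be sketched from the first two formants of the corner vowels /a/, /i/, /u/. The definitions below follow the commonly cited forms, FCR = (F2u + F2a + F1i + F1u) / (F2i + F1a) with VAI as its inverse; it is an assumption that the study used exactly these forms, and the formant values are hypothetical.

```python
def vowel_measures(a, i, u):
    """Return FCR, VAI, and the F2i/F2u ratio.

    Each argument is an (F1, F2) pair in Hz for the vowels /a/, /i/, /u/.
    """
    (f1a, f2a), (f1i, f2i), (f1u, f2u) = a, i, u
    fcr = (f2u + f2a + f1i + f1u) / (f2i + f1a)  # rises as vowels centralize
    vai = 1.0 / fcr                              # falls as vowels centralize
    f2_ratio = f2i / f2u
    return {"fcr": fcr, "vai": vai, "f2_ratio": f2_ratio}


# Hypothetical formant values (Hz), for illustration only.
m = vowel_measures(a=(800, 1300), i=(300, 2300), u=(350, 800))
print({k: round(v, 3) for k, v in m.items()})
```

A lower FCR and a higher F2i/F2u ratio both indicate more distinct corner vowels, which matches the direction of change reported after the intervention.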