• Title/Summary/Keyword: Speech intelligibility

Performance Evaluation of Novel AMDF-Based Pitch Detection Scheme

  • Kumar, Sandeep
    • ETRI Journal / v.38 no.3 / pp.425-434 / 2016
  • A novel average magnitude difference function (AMDF)-based pitch detection scheme (PDS) is proposed to achieve better speech quality. The proposed PDS is evaluated through both a simulation and a real-time implementation of a speech analysis-synthesis system. It is compared with PDSs based on the cepstrum, autocorrelation function (ACF), AMDF, and circular AMDF (CAMDF) methods using the following measures: percentage gross pitch error (%GPE); a subjective listening test; an objective speech quality assessment; a speech intelligibility test; the synthesized speech waveform; computation time; and memory consumption. The proposed PDS yields lower %GPE and better synthesized speech quality and intelligibility for different speech signals than the cepstrum-, ACF-, AMDF-, and CAMDF-based PDSs. The computation time of the proposed PDS is also less than that of the cepstrum-, ACF-, and CAMDF-based PDSs. Moreover, the total memory consumed by the proposed PDS is less than that of the ACF- and cepstrum-based PDSs.
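
For orientation, the baseline AMDF method that the abstract compares against can be sketched as follows. This is a minimal illustration of the textbook AMDF only, not the novel scheme proposed in the paper; the function names, the 150-400 Hz search range, and the synthetic test tone are all assumptions made for this example.

```python
import math

def amdf(frame, max_lag):
    """Average magnitude difference function for lags 0..max_lag."""
    n = len(frame)
    return [
        sum(abs(frame[i] - frame[i + lag]) for i in range(n - lag)) / (n - lag)
        for lag in range(max_lag + 1)
    ]

def estimate_pitch_amdf(frame, sample_rate, f_min, f_max):
    """Estimate pitch (Hz) as the lag of the deepest AMDF valley in the search range."""
    lag_min = int(sample_rate / f_max)  # shortest period considered
    lag_max = int(sample_rate / f_min)  # longest period considered
    d = amdf(frame, lag_max)
    best_lag = min(range(lag_min, lag_max + 1), key=lambda lag: d[lag])
    return sample_rate / best_lag

# Hypothetical input: a 200 Hz sine sampled at 8 kHz (period of 40 samples).
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(480)]

# The search range is kept narrow so that only one multiple of the period
# falls inside it; a wider range can pick a multiple (an octave error).
pitch = estimate_pitch_amdf(frame, sample_rate, f_min=150.0, f_max=400.0)
print(pitch)
```

The AMDF drops toward zero at lags equal to the pitch period and its multiples, which is why plain valley picking is prone to octave errors of the kind that a gross pitch error measure is meant to count.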

The Text-to-Speech System Assessment Based on Word Frequency and Word Regularity Effects (단어빈도와 단어규칙성 효과에 기초한 합성음 평가)

  • Nam, Ki-Chun;Choi, Won-Il;Kim, Choong-Myung;Choi, Yang-Gyu;Kim, Jong-Jin
    • MALSORI / no.53 / pp.61-74 / 2005
  • In the present study, the intelligibility of synthesized speech sounds was evaluated using psycholinguistic and fMRI techniques. To examine the difference in word recognition between natural and synthesized speech sounds, word regularity and word frequency were varied. The results of Experiments 1 and 2 showed that the intelligibility difference of the synthesized speech comes from word regularity. In the case of the synthesized speech, the regular words were recognized more slowly than the irregular words, and there was smaller activation of the auditory areas of the brain for the regular words than for the irregular words.

Comparison of Sound Pressure Level and Speech Intelligibility of Emergency Broadcasting System at Longitudinal Corridor (장방향 복도 공간의 비상방송설비에 대한 음압 레벨과 음성 명료도 비교)

  • Jeong, Jeong-Ho;Lee, Sung-Chan
    • Fire Science and Engineering / v.32 no.4 / pp.42-49 / 2018
  • This study used architectural sound simulation to investigate whether the sound from emergency broadcasting speakers is clearly transmitted to occupants when the loudspeakers are installed at intervals of 25 m in a rectangular hallway, in accordance with NFSC 202. The sound pressure level and speech intelligibility index were analyzed according to changes in building finishing materials. With a reflective finishing material, the sound pressure level satisfied the standard, while the speech intelligibility index was low. When a sound-absorbing finishing material was applied, clarity and the speech transmission index improved to a level at which the occupant could understand the broadcast, whereas the sound pressure level delivered to the occupant decreased in the same space.

The Interlanguage Speech Intelligibility Benefit for Korean Learners of English: Production of English Front Vowels

  • Han, Jeong-Im;Choi, Tae-Hwan;Lim, In-Jae;Lee, Joo-Kyeong
    • Phonetics and Speech Sciences / v.3 no.2 / pp.53-61 / 2011
  • The present work is a follow-up study to that of Han, Choi, Lim and Lee (2011), which found an asymmetry in the source segments eliciting the interlanguage speech intelligibility benefit (ISIB): vowels that did not match any vowel of the Korean language were likely to elicit more ISIB than matched vowels. To identify the source of the stronger ISIB in non-matched vowels, acoustic analyses of the stimuli were performed. Two pairs of English front vowels, [i] vs. [ɪ] and [ɛ] vs. [æ], were recorded by native English talkers and by two groups of Korean learners divided according to their English proficiency, and their vowel duration and the frequencies of the first two formants (F1, F2) were then measured. The results demonstrated that the non-matched vowels [ɪ] and [æ] produced by Korean talkers deviated more in their acoustic characteristics from those of the native talkers, with longer durations and with formant values closer to those of the matched vowels [i] and [ɛ]. Combining the acoustic measurements of the present study with the word identification results of Han et al. (2011), we suggest that the relatively better word identification performance of Korean talkers/listeners compared with native English talkers/listeners is associated with the interlanguage shared by Korean talkers and listeners.

Influence of SNR difference on the Korean speech intelligibility in classrooms (교실에서 신호대잡음비 변이가 한국어 음성명료도에 미치는 영향)

  • Park, Chan-Jae;Jo, Sung-Min;Haan, Chan-Hoon
    • The Journal of the Acoustical Society of Korea / v.38 no.6 / pp.651-660 / 2019
  • The present study aims to determine the speech sound level needed to achieve satisfactory speech intelligibility in noisy classroom environments. For this purpose, auralized materials were prepared and listening tests were conducted with 27 people. Speech intelligibility tests were carried out using both the Consonant-Vowel-Consonant (CVC) and Phonetically Balanced Words (PBW) methods. The Signal to Noise Ratio (SNR) was changed in steps of 5 dB between tests. As a result, it was found that speech intelligibility increases with larger SNR. It was also found that speech intelligibility for syllables (CVC) varied greatly with SNR at a Reverberation Time (RT) of 1.5 s. However, no significant difference was found for words (PBW) at RTs below 0.8 s. A 2-way analysis of variance (ANOVA) also revealed that SNR is the only significant factor affecting Korean speech intelligibility for both the PBW and CVC methods. Therefore, an RT below 0.8 s could serve as an acoustic criterion for classrooms that minimizes the effects of noise. With RTs larger than 0.8 s, a much larger SNR is needed to give sufficient speech intelligibility.
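
As an illustration of how test conditions separated by 5 dB of SNR can be prepared, the sketch below scales a noise signal so that the speech-to-noise level difference hits a target value. It is a generic sketch under assumed inputs, not the auralization procedure of the study; the helper names, the synthetic signals, and the specific SNR values (0 to 15 dB) are hypothetical.

```python
import math

def rms(x):
    """Root-mean-square level of a signal."""
    return math.sqrt(sum(v * v for v in x) / len(x))

def snr_db(speech, noise):
    """SNR in dB from the RMS levels of speech and noise."""
    return 20.0 * math.log10(rms(speech) / rms(noise))

def scale_noise_to_snr(speech, noise, target_snr_db):
    """Return the noise rescaled so that snr_db(speech, result) equals the target."""
    gain = rms(speech) / (rms(noise) * 10.0 ** (target_snr_db / 20.0))
    return [gain * v for v in noise]

# Hypothetical stand-ins for a speech recording and a classroom noise recording.
speech = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
noise = [0.3 * math.sin(2 * math.pi * 17 * t / 100 + 1.0) for t in range(100)]

# Assumed test conditions in 5 dB steps.
targets = [0, 5, 10, 15]
achieved = [snr_db(speech, scale_noise_to_snr(speech, noise, s)) for s in targets]
print(achieved)
```

Scaling the noise rather than the speech keeps the speech level constant across conditions; the opposite choice is equally valid when the noise level is the fixed quantity.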

Investigation of the Speech Intelligibility of Classrooms Depending on the Sound Source Location

  • Kim Jeong Tai;Haan Chan-Hoon
    • The Journal of the Acoustical Society of Korea / v.24 no.4E / pp.139-143 / 2005
  • The present study aims to investigate the effects of speaker location on speech intelligibility in a classroom. To this end, acoustic measurements were undertaken in a classroom with three different sound source locations: the center of the front wall (FC), both sides of the front wall (FS), and the center of the ceiling (CC). SPL, RT, $D_{50}$, and RASTI were measured at nine measurement points with the same sound power level of the sound source, and MLS was used as the sound source signal. In addition, subjective listening tests were carried out using Korean-language listening materials recorded in an anechoic chamber. The recorded syllables were replayed and re-recorded in the classroom with the same sound source at the three locations, and listening tests were administered to 20 respondents, who were asked to write down the syllables they heard in the classroom recordings. The results show that higher speech intelligibility ($D_{50}$ of $47\%$, RASTI of 0.56) was obtained when the sound source was located at the FS. The results also show that high speech intelligibility was obtained in the areas near the walls.
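
The $D_{50}$ parameter reported above is conventionally defined as the fraction of impulse-response energy that arrives within the first 50 ms. A minimal sketch of that ratio follows; the function name, the synthetic exponentially decaying impulse response, and the 0.8 s reverberation time are assumptions for illustration, not data from the study.

```python
import math

def d50(impulse_response, sample_rate):
    """Early-to-total energy ratio: energy in the first 50 ms over total energy."""
    split = int(round(0.050 * sample_rate))
    early = sum(h * h for h in impulse_response[:split])
    total = sum(h * h for h in impulse_response)
    return early / total

# Hypothetical impulse response: an ideal exponential decay whose level
# falls by 60 dB (amplitude factor 1/1000) over the reverberation time.
sample_rate = 8000
rt60 = 0.8  # seconds, assumed
decay = math.log(1000.0) / rt60
ir = [math.exp(-decay * n / sample_rate) for n in range(2 * sample_rate)]

value = d50(ir, sample_rate)
print(f"D50 = {100 * value:.1f} %")
```

For a measured room the impulse response would start at the arrival of the direct sound, so any leading silence has to be trimmed before the 50 ms window is applied.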

The Speech Characteristics of Korean Dysarthria: An Experimental Study with the Use of a Phonetic Contrast Intelligibility Test (음소대조 검사방법을 이용한 마비말장애인의 말소리 명료도 특성)

  • Kim Soo Jin;Kim Young Tae;Kim Gi Na
    • The Journal of the Acoustical Society of Korea / v.24 no.1E / pp.28-33 / 2005
  • This study was designed to suggest an assessment tool for analyzing the characteristics of Korean phonetic contrast intelligibility among dysarthric individuals. The intelligibility deficit factors of phonetic contrast in Korean dysarthric patients were analyzed through stepwise regression analysis. The 19 acoustic-phonetic contrasts proposed by Kent et al. (1999) have been claimed to be useful for clinical assessment and research on dysarthria. However, the test cannot be directly applied to Korean patients due to linguistic differences between English and Korean. Thus, it is necessary to devise a Korean word intelligibility test that reflects the distinct characteristics of the Korean language. To identify the speech error characteristics of a Korean dysarthric group, a Korean word list was audio-recorded by 3 spastic, 4 flaccid, and 5 mixed-type dysarthric patients. The word list consisted of monosyllabic consonant-vowel-consonant (CVC) real word pairs. Stimulus words included 41 phonemic contrast pairs and six triplets. The results showed that the percentage of errors in final-position contrasts was higher than in any other position. Unlike the results of previous studies, the initial-position contrasts were crucial in predicting overall intelligibility among Korean patients.

The Effect of the Speech Enhancement Algorithm for Sensorineural Hearing Impaired Listeners

  • Kim, Dong-Wook;Lee, Young-Woo;Lee, Jong-Shill;Chee, Young-Joon;Lee, Sang-Min;Kim, In-Young;Kim, Sun-I.
    • Journal of Biomedical Engineering Research / v.28 no.6 / pp.732-743 / 2007
  • Background noise is one of the major complaints not only of hearing-impaired persons but also of normal-hearing listeners. This paper describes the results of two experiments in which speech recognition performance in noisy environments was determined for listeners with normal hearing and with sensorineural hearing loss. First, we compared speech enhancement algorithms by evaluating speech recognition ability at various speech-to-noise ratios and with various types of noise. Next, speech enhancement algorithms that reduce background noise were presented and evaluated for improving speech intelligibility for listeners with sensorineural hearing impairment. We tested three single-microphone noise reduction methods: spectral subtraction and companding, the Wiener filter method, and maximum likelihood envelope estimation. Listeners' responses in background noise were investigated and compared with those obtained with the speech enhancement algorithm presented in this paper. The methods improved speech recognition test scores for the listeners with sensorineural hearing impairment, but not for the normal-hearing listeners. The results suggest that the speech enhancement algorithm with loudness compression can improve speech intelligibility for listeners with sensorineural hearing loss.

The Interlanguage Speech Intelligibility Benefit (ISIB) of English Prosody: The Case of Focal Prominence for Korean Learners of English and Natives

  • Lee, Joo-Kyeong;Han, Jeong-Im;Choi, Tae-Hwan;Lim, Injae
    • Phonetics and Speech Sciences / v.4 no.4 / pp.53-68 / 2012
  • This study investigated the speech intelligibility of Korean-accented and native English focus speech for Korean and native English listeners. Three different types of focus in English (broad, narrow, and contrastive) were naturally induced in semantically optimal dialogues. Seven high-proficiency and seven low-proficiency Korean speakers and seven native speakers participated in recording the stimuli with another native speaker. Fifteen listeners from each of the Korean high-proficiency, Korean low-proficiency, and native groups judged audio signals of the focus sentences. Results showed that Korean listeners were more accurate at identifying the focal prominence in Korean speakers' narrow focus speech than in that of native speakers, which suggests that the interlanguage speech intelligibility benefit for talkers (ISIB-T) held true for narrow focus regardless of the Korean speakers' and listeners' proficiency. However, Korean listeners did not outperform native listeners for Korean speakers' production of narrow focus, which did not support the ISIB for listeners (ISIB-L). Broad and contrastive focus speech did not provide evidence for either the ISIB-T or the ISIB-L. These findings are explained by the interlanguage shared by Korean speakers and listeners, in which they have established more L1-like common phonetic features and phonological representations. Once narrow focus was semantically and syntactically interpreted in higher-level processing, it was phonetically realized in Korean speakers' speech in a way that was more intelligible to Korean listeners due to the interlanguage. This may elicit the ISIB. However, Korean speakers did not appear to achieve complete semantic/syntactic access to either broad or contrastive focus, which might have had detrimental effects on lower-level phonetic outputs in top-down processing. This may explain why Korean listeners had no advantage over native listeners for Korean talkers, and vice versa.

The text-to-speech system assessment based on word frequency and word regularity effects (단어빈도와 단어규칙성 효과에 기초한 합성음 평가)

  • Nam Kichun;Choi Wonil;Lee Donghoon;Koo Minmo;Kim Jongjin
    • Proceedings of the KSPS conference / 2002.11a / pp.105-108 / 2002
  • In the present study, the intelligibility of synthesized speech sounds was evaluated using psycholinguistic and fMRI techniques. To examine the difference in word recognition between natural and synthesized speech sounds, word regularity and word frequency were varied. The results of Experiments 1 and 2 showed that the intelligibility difference of the synthesized speech comes from word regularity. For the regular words, there was smaller activation of the auditory areas of the brain and slower recognition time.
