• Title/Abstract/Keywords: speech

Search results: 7,763 items (processing time: 0.027 seconds)

Comparison of Speech Rate and Long-Term Average Speech Spectrum between Korean Clear Speech and Conversational Speech

  • Yoo, Jeeun;Oh, Hongyeop;Jeong, Seungyeop;Jin, In-Ki
    • Journal of Audiology & Otology
    • /
    • Vol. 23, No. 4
    • /
    • pp.187-192
    • /
    • 2019
  • Background and Objectives: Clear speech is an effective communication strategy used in difficult listening situations that draws on techniques such as accurate articulation, a slow speech rate, and the inclusion of pauses. Although excessively slow speech and improperly amplified spectral information can degrade overall speech intelligibility, amplitude increments of 1 to 3 dB in the mid-frequency bands and speech rates around 50% slower than those of conversational speech have been reported to improve speech intelligibility. The purpose of this study was to identify whether amplitude increments in the mid-frequency bands and slower speech rates were evident in Korean clear speech as they are in English clear speech. Subjects and Methods: To compare the acoustic characteristics of the two methods of speech production, the voices of 60 participants were recorded during conversational speech and then again during clear speech using standardized sentence material. Results: The speech rate and long-term average speech spectrum (LTASS) were analyzed and compared. Speech rates for clear speech were slower than those for conversational speech. Increased amplitudes in the mid-frequency bands were evident in the LTASS of clear speech. Conclusions: The observed differences in acoustic characteristics between the two types of speech production suggest that Korean clear speech can be an effective communication strategy for improving speech intelligibility.
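
The LTASS named in the abstract above can be illustrated with a minimal sketch. It is not the authors' analysis procedure: it assumes a simplified definition (Hann-windowed frames, power spectra averaged over all frames, reported in dB per DFT bin) instead of the band-based analysis usually used in LTASS studies. The function name `ltass_db` and its parameters are hypothetical, and the test signal is a synthetic sine, not speech.

```python
import cmath
import math


def ltass_db(signal, frame_len, hop):
    """Average the power spectra of Hann-windowed frames; return dB per DFT bin.

    Simplified illustration: bins are plain DFT bins, not 1/3-octave bands.
    """
    # Periodic Hann window to reduce spectral leakage at frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1          # non-negative frequencies only
    power_sum = [0.0] * n_bins
    n_frames = 0
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        for k in range(n_bins):
            # Naive DFT: fine for a short demo, too slow for real recordings.
            x_k = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            power_sum[k] += abs(x_k) ** 2
        n_frames += 1
    if n_frames == 0:
        raise ValueError("signal is shorter than one frame")
    # Small floor avoids log10(0) for empty bins.
    return [10 * math.log10(p / n_frames + 1e-12) for p in power_sum]


# Synthetic check: a sine with exactly 8 cycles per 64-sample frame.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spectrum = ltass_db(tone, frame_len=64, hop=32)
peak_bin = spectrum.index(max(spectrum))  # strongest bin of the average spectrum
```

Comparing two speaking styles would then amount to subtracting one averaged spectrum from the other, bin by bin, to obtain a level difference in dB.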

구강 개방 상태에 따른 말 명료도 및 말 용인도 특성 (Characteristics of speech intelligibility and speech acceptability connected with mouth opening condition)

  • 송윤경
    • Phonetics and Speech Sciences
    • /
    • Vol. 3, No. 3
    • /
    • pp.141-148
    • /
    • 2011
  • Many factors affect speech intelligibility and speech acceptability. Structural anomalies and neuromotor pathologies are known causes of abnormal speech sounds, and there are also minor variations related to the oral mechanism. Speaking with a restricted mouth opening, whether because of a therapeutic procedure or a habitual speech pattern, might affect the quality of speech sounds. This study therefore compared the speech intelligibility and speech acceptability of 24 recorded words in two conditions (restricted mouth opening and normal mouth opening), as judged by 30 normal-hearing adults. The results showed that speech intelligibility and speech acceptability were significantly lower in the restricted mouth opening condition. In that condition, speech acceptability was significantly lower than speech intelligibility, and it was especially low for open vowels. These findings indicate that the mouth opening condition can affect vowel shape and can have an adverse effect on speech intelligibility and speech acceptability.

잡음음성인식을 위한 음성개선 방식들의 성능 비교 (Performance Comparison of the Speech Enhancement Methods for Noisy Speech Recognition)

  • 정용주
    • Phonetics and Speech Sciences
    • /
    • Vol. 1, No. 2
    • /
    • pp.9-14
    • /
    • 2009
  • Speech enhancement methods can generally be classified into a few categories, and they have usually been compared with each other in terms of speech quality. For speech enhancement methods to be used successfully in speech recognition systems, performance comparisons in terms of speech recognition accuracy are necessary. In this paper, we compared the speech recognition performance of several representative speech enhancement algorithms that are frequently cited in the literature and widely used. We also compared the performance of speech enhancement methods with that of other noise-robust speech recognition methods, such as PMC, to verify the usefulness of speech enhancement approaches in noise-robust speech recognition systems.

방향성 마이크로폰과 음성 필터링을 이용한 통신 시스템의 음성 인지도 향상 (Performance Enhancement of Speech Intelligibility in Communication System Using Combined Beamforming (directional microphone) and Speech Filtering Method)

  • 신민철;왕세명
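    • Korean Society for Noise and Vibration Engineering: Conference Proceedings
    • /
    • Proceedings of the Korean Society for Noise and Vibration Engineering 2005 Spring Conference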
    • /
    • pp.334-337
    • /
    • 2005
  • Speech intelligibility is one of the most important factors in a communication system, and it is related to the speech-to-noise ratio. To enhance the speech-to-noise ratio, background noise reduction techniques are being developed. As one approach to noise reduction, this paper introduces a directional microphone based on beamforming, combined with a speech filtering method. The directional microphone narrows the spatial range of the processed signal to the direction of the target speech signal. Noise located in the same direction as the speech, however, still remains in the processed signal. To separate this mixed signal into speech and noise, a speech filtering method is applied as a following step to pick up only the speech signal from the processed signal. The speech filtering method is based on the characteristics of the speech signal itself. The combined directional microphone and speech filtering method improves speech intelligibility in the communication system.

네트워크 환경에서 서버용 음성 인식을 위한 MFCC 기반 음성 부호화기 설계 (A MFCC-based CELP Speech Coder for Server-based Speech Recognition in Network Environments)

  • 이길호;윤재삼;오유리;김홍국
    • Journal of the Korean Society of Phonetic Sciences: Malsori
    • /
    • No. 54
    • /
    • pp.27-43
    • /
    • 2005
  • Existing standard speech coders provide high-quality speech communication, but they degrade the performance of speech recognition systems that use the speech reconstructed by the coders. The main cause of the degradation is that the spectral envelope parameters in speech coding are optimized for speech quality rather than for speech recognition performance. For example, mel-frequency cepstral coefficients (MFCCs) are generally known to provide better speech recognition performance than linear prediction coefficients (LPCs), the typical parameter set in speech coding. In this paper, we propose a speech coder that uses MFCCs instead of LPCs to improve the performance of a server-based speech recognition system in network environments. The main difficulty in using MFCCs, however, is developing efficient MFCC quantization at a low bit rate. First, we explore the interframe correlation of MFCCs, which leads to predictive quantization of the MFCCs. Second, a safety-net scheme is proposed to make the MFCC-based speech coder robust to channel errors. As a result, we propose an 8.7 kbps MFCC-based CELP coder. A PESQ test shows that the proposed speech coder has speech quality comparable to that of the 8 kbps G.729 coder, while speech recognition performance using the proposed coder is better than that using G.729.

동시발화에 나타나는 발화 속도 변이 분석 (Speech Rate Variation in Synchronous Speech)

  • 김미란;남호성
    • Phonetics and Speech Sciences
    • /
    • Vol. 4, No. 4
    • /
    • pp.19-27
    • /
    • 2012
  • When two speakers read a text together, the speech they produce has been shown to be much less variable (e.g., in pause duration and placement, and in speech rate). This paper provides a quantitative analysis of the speech rate variation exhibited in synchronous speech by examining global and local patterns in two dialects of Mandarin Chinese (Taiwan and Shanghai). We analyzed the speech data in terms of mean speech rate, with the just noticeable difference (JND) as a reference, within a subject and across subjects. Our findings show that speakers exhibit lower and less variable speech rates when they read a text synchronously than when they read alone. This global pattern is observed consistently across speakers and dialects, while each dialect maintains its own local pattern of speech rate variation. We conclude that paired speakers lower their speech rates and decrease their variability in order to keep their speech synchronized.

식도발성 남성 발화의 말 속도 (Speech Rates of Male Esophageal Speech)

  • 박원경;심희정;고도흥
    • Phonetics and Speech Sciences
    • /
    • Vol. 4, No. 3
    • /
    • pp.143-149
    • /
    • 2012
  • The purpose of this study was to investigate the speech rate of esophageal speakers who are capable of vocalization after surgery. The subjects were 10 male esophageal speakers and 10 male laryngeal speakers. Each group read a passage that was recorded with a DAT recorder (Roland, EDIROL R-09). The recordings were analyzed using CSL (Computerized Speech Lab, model 4150). The results were as follows: (1) the overall speech rate of esophageal speech was 2.50 SPS (syllables per second), whereas that of laryngeal speech was 4.23 SPS; (2) the articulatory rate of esophageal speech was 3.14 SPS, whereas that of laryngeal speech was 4.75 SPS. Both the speech rates and the articulatory rates of esophageal speech were significantly lower than those of laryngeal speech. These differences between the two groups may be due to the reduced efficiency of airflow across the pharyngeal-esophageal segment in esophageal speakers compared with airflow through the glottis in laryngeal speakers. These results may provide a guideline for the speech rates of esophageal speakers in clinical settings.
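
The two rate measures reported above can be illustrated with a short sketch. The abstract does not define them, so this assumes the common convention that the overall speech rate divides the syllable count by the total duration including pauses, while the articulatory rate divides it by the duration with pauses removed. The function names and the example numbers are hypothetical and are not data from the study.

```python
def speech_rate_sps(n_syllables, total_duration_s):
    """Overall speech rate in syllables per second (pauses included)."""
    if total_duration_s <= 0:
        raise ValueError("total duration must be positive")
    return n_syllables / total_duration_s


def articulation_rate_sps(n_syllables, total_duration_s, pause_durations_s):
    """Articulatory rate in syllables per second (pauses excluded)."""
    speaking_time = total_duration_s - sum(pause_durations_s)
    if speaking_time <= 0:
        raise ValueError("pauses must be shorter than the total duration")
    return n_syllables / speaking_time


# Hypothetical passage: 100 syllables read in 40 s, with two pauses (3 s and 5 s).
overall = speech_rate_sps(100, 40.0)                          # 2.5 SPS
articulatory = articulation_rate_sps(100, 40.0, [3.0, 5.0])   # 3.125 SPS
```

Because pause time is removed from the denominator, the articulatory rate is never lower than the overall speech rate for the same recording, which matches the ordering of the values reported for both groups.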

파킨슨 환자의 클리어 스피치 전후 음향학적 공기역학적 특성 (An aerodynamic and acoustic characteristics of Clear Speech in patients with Parkinson's disease)

  • 신희백;고도홍
    • Phonetics and Speech Sciences
    • /
    • Vol. 9, No. 3
    • /
    • pp.67-74
    • /
    • 2017
  • Clear Speech has been found to increase speech intelligibility compared with conversational speech. Clear Speech is characterized by decreased articulation rates and by more frequent and longer pauses. The objective of the present study was to investigate the immediate improvement in speech intelligibility in 10 patients with Parkinson's disease (age range: 46 to 75 years) using Clear Speech. The experiment was performed using the Phonatory Aerodynamic System 6600 while the participants read the first sentence of the Sanchaek passage and the "List for Adults 1" of the Sentence Recognition Test (SRT) using casual speech and Clear Speech. Acoustic and aerodynamic parameters that affect speech intelligibility were measured, including mean F0, F0 range, intensity, speaking rate, mean airflow rate, and respiratory rate. In the Sanchaek passage, the use of Clear Speech resulted in significant differences in mean F0, F0 range, speaking rate, and respiratory rate compared with the use of casual speech. In the SRT list, significant differences were seen in mean F0, F0 range, and speaking rate. Based on these findings, it is suggested that speech intelligibility can be affected by adjusting breathing and tone in Clear Speech. Future studies should identify the benefits of Clear Speech through auditory-perceptual studies and evaluate programs that use Clear Speech to increase intelligibility.

말소리장애 아동의 말명료도와 음향학적 측정치 간 상관관계 (The Correlation between Speech Intelligibility and Acoustic Measurements in Children with Speech Sound Disorders)

  • 강은영
    • Journal of the Korean Society of Integrative Medicine
    • /
    • Vol. 6, No. 4
    • /
    • pp.191-206
    • /
    • 2018
  • Purpose: This study investigated the correlation between speech intelligibility and acoustic measurements of speech sounds produced by children with speech sound disorders and children without any diagnosed speech sound disorder. Methods: A total of 60 children with and without speech sound disorders were the subjects of this study. Speech samples were obtained by having the subjects speak meaningful words. Acoustic measurements were analyzed on a spectrogram using the Multi-speech 3700 program. Speech intelligibility was determined according to a listener's perceptual judgment. Results: Children with speech sound disorders had significantly lower speech intelligibility than those without speech sound disorders. The intensity of the vowel /u/, the duration of the vowel /ω/, and the second formant of the vowel /ω/ differed significantly between the two groups. There was no difference in voice onset time between the groups. There was a correlation between acoustic measurements and speech intelligibility. Conclusion: The results of this study showed that the speech intelligibility of children with speech sound disorders was affected by intensity, word duration, and formant frequency. Clinical results should be complemented with acoustic measurements in addition to the evaluation of speech intelligibility.
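
The abstract above reports a correlation between acoustic measurements and speech intelligibility without naming the statistic. As a minimal sketch, assuming a Pearson product-moment correlation (an assumption, not a detail from the study), the computation could look like the following. The function name `pearson_r` and the sample values are hypothetical and are not data from the paper.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Made-up illustrative values: per-child intelligibility (%) and vowel intensity (dB).
intelligibility = [62.0, 70.5, 81.0, 88.5, 93.0]
vowel_intensity = [58.1, 60.4, 63.0, 64.2, 66.5]
r = pearson_r(intelligibility, vowel_intensity)
```

A value of `r` near +1 or -1 indicates a strong linear relationship, while a value near 0 indicates little linear relationship; a significance test would still be needed before drawing conclusions from real data.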