• Title/Abstract/Keywords: clear speech

115 search results (processing time: 0.028 s)

Overlapping of /o/ and /u/ in modern Seoul Korean: focusing on speech rate in read speech

  • Igeta, Takako;Hiroya, Sadao;Arai, Takayuki
    • 말소리와 음성과학 / Vol. 9, No. 1 / pp.1-7 / 2017
  • Previous studies have reported overlapping $F_1$ and $F_2$ distributions for the vowels /o/ and /u/ produced by young Korean speakers of the Seoul dialect. It has been suggested that this overlap results from sound change. However, few studies have examined whether speech rate influences the overlapping of /o/ and /u/. Previous studies have also reported that the overlap in syllables produced by male speakers is smaller than in those produced by female speakers, but few reports have investigated the overlapping of the two vowels in read speech produced by male speakers. In the current study, we examined whether speech rate affects the overlapping of /o/ and /u/ in read speech by male and female speakers. Read speech produced by twelve young adult native speakers of the Seoul dialect was recorded at three speech rates. For female speakers, discriminant analysis showed that the discriminant rate decreased as the speech rate increased from slow to fast, indicating that speech rate is one of the factors affecting the overlapping of /o/ and /u/. For male speakers, on the other hand, the discriminant rate was not correlated with speech rate, but the overlapping was larger than that of female speakers in read speech. Moreover, read speech by male speakers was less clear than that by female speakers. This indicates that, for male speakers, the overlapping may be related to unclear speech arising from sociolinguistic factors.

Relationship between Speech Perception in Noise and Phonemic Restoration of Speech in Noise in Individuals with Normal Hearing

  • Vijayasarathy, Srikar;Barman, Animesh
    • Journal of Audiology & Otology / Vol. 24, No. 4 / pp.167-173 / 2020
  • Background and Objectives: Top-down restoration of distorted speech, tapped as phonemic restoration of speech in noise, may be a useful tool for understanding the robustness of perception in adverse listening situations. However, the relationship between phonemic restoration and speech perception in noise is not empirically clear. Subjects and Methods: Twenty adults (40-55 years) with normal audiometric findings took part in the study. Sentence perception in noise was studied at various signal-to-noise ratios (SNRs) to estimate the SNR yielding a 50% score. Performance was also measured for sentences interrupted with silence and for those interrupted by speech noise at -10, -5, 0, and 5 dB SNRs. The performance score in the silence interruption condition was subtracted from the score in the noise interruption condition to determine the phonemic restoration magnitude. Results: Fairly robust improvements in speech intelligibility were found when the sentences were interrupted with speech noise instead of silence. Improvement with increasing noise levels was non-monotonic and reached a maximum at -10 dB SNR. A significant correlation was found between speech perception in noise performance and phonemic restoration of sentences interrupted with -10 dB SNR speech noise. Conclusions: It is possible that perception of speech in noise is associated with top-down processing of speech, tapped as phonemic restoration of interrupted speech. More research with a larger sample size is indicated, since the restoration is affected by the type of speech material and noise used, age, working memory, and linguistic proficiency, and shows large individual variability.
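
The restoration magnitude described in this abstract is a simple difference of scores, which could be sketched as follows. This is only an illustration: the function name and all score values are hypothetical and are not data from the study.

```python
def restoration_magnitude(noise_filled_score, silence_filled_score):
    """Phonemic restoration magnitude: score with noise-filled gaps
    minus score with silent gaps (both in percent correct)."""
    return noise_filled_score - silence_filled_score


# Hypothetical percent-correct scores, for illustration only.
silence_score = 40.0
noise_scores = {-10: 55.0, -5: 50.0, 0: 48.0, 5: 45.0}  # keyed by dB SNR

# Restoration magnitude at each interrupting-noise SNR.
magnitudes = {
    snr: restoration_magnitude(score, silence_score)
    for snr, score in noise_scores.items()
}

# SNR at which the (hypothetical) restoration benefit is largest.
best_snr = max(magnitudes, key=magnitudes.get)
```

A positive magnitude means that filling the gaps with noise helped intelligibility relative to leaving them silent.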

Implementation of a Robust Speech Recognizer in Noisy Car Environment Using a DSP

  • 정익주
    • 음성과학 / Vol. 15, No. 2 / pp.67-77 / 2008
  • In this paper, we implemented a robust speech recognizer using the TMS320VC33 DSP. For this implementation, we built speech and noise databases suited to a recognizer that uses the spectral subtraction method for noise removal. The recognizer has an explicit structure in that the speech signal is enhanced through spectral subtraction before endpoint detection and feature extraction. This makes the operation of the recognizer clear and helps build HMM models with minimal model mismatch. Since the recognizer was developed for controlling car facilities and for voice dialing, it has two recognition engines: a speaker-independent one for controlling car facilities and a speaker-dependent one for voice dialing. We adopted a conventional DTW algorithm for the latter and a continuous HMM for the former. Through various off-line recognition tests, we selected optimal values of several recognition parameters for a resource-limited embedded recognizer, which led to HMM models with three mixtures per state. The speech database with added car noise is enhanced using spectral subtraction before HMM parameter estimation in order to reduce the model mismatch caused by nonlinear distortion from spectral subtraction. The hardware module developed includes a microcontroller for the host interface, which processes the protocol between the DSP and a host.
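
The spectral subtraction step mentioned in this abstract can be illustrated with a minimal sketch. It assumes magnitude-domain subtraction on a single frame, with an over-subtraction factor `alpha` and a spectral floor `floor`; these names and values are assumptions for illustration and are not taken from the paper.

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Subtract an estimated noise magnitude spectrum from one frame.

    noisy_mag: magnitude spectrum of the noisy speech frame
    noise_mag: estimated noise magnitude spectrum (same length)
    alpha:     over-subtraction factor (assumed parameter)
    floor:     fraction of the noisy magnitude kept as a lower bound,
               so that no bin becomes negative (assumed parameter)
    """
    cleaned = []
    for x, n in zip(noisy_mag, noise_mag):
        diff = x - alpha * n
        cleaned.append(max(diff, floor * x))
    return cleaned


# Toy magnitude values, for illustration only.
frame = [1.0, 0.5, 0.2, 0.8]
noise = [0.2, 0.2, 0.3, 0.1]
enhanced = spectral_subtract(frame, noise)
```

In the third bin the noise estimate exceeds the noisy magnitude, so the floor applies instead of a negative value. Clipping of this kind is a nonlinear operation.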

Comparison of Sound Pressure Level and Speech Intelligibility of Emergency Broadcasting System at T-junction Corridor Space

  • 정정호;이성찬
    • 한국화재소방학회논문지 / Vol. 33, No. 1 / pp.105-112 / 2019
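  • In this study, we used architectural acoustic simulation to examine whether emergency broadcast sound is delivered clearly and evenly in a T-junction corridor space. We varied the sound absorption performance of the corridor and the installation position and spacing of the emergency broadcast loudspeakers, and compared the resulting distributions of sound pressure level and speech transmission index (STI, RASTI). The simulation results showed that, for clear speech transmission, it is preferable to install the emergency broadcast loudspeakers about 10 m away from the center of the T-junction. Narrowing the 25 m installation spacing specified by the NFSC was found to allow clearer emergency broadcast sound with sufficient loudness to be delivered evenly.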

Voice Recognition Performance Improvement using the Convergence of Voice signal Feature and Silence Feature Normalization in Cepstrum Feature Distribution

  • 황재천
    • 한국융합학회논문지 / Vol. 8, No. 5 / pp.13-17 / 2017
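  • In speech recognition, conventional speech feature extraction methods yield inaccurate recognition rates because of unclear threshold values. This study models a method for improving speech recognition performance that combines feature extraction for speech and non-speech with silence feature normalization. In the proposed method, the model was constructed to minimize the influence of noise, and speech signal features were extracted for each speech frame to build the speech recognition model. This was combined with silence feature normalization, representing the energy spectrum in a form similar to entropy to generate the original speech signal so that the speech features are less affected by noise. A reference value for classifying speech and non-speech was set in the cepstrum, and silence feature normalization improved performance for signals with a low signal-to-noise ratio. The performance of the proposed method was analyzed by comparing HMM and CHMM; the comparison showed a recognition rate improvement of 2.1 percentage points in the voice-dependent stage and 0.7 percentage points in the voice-independent stage.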

Implementation of Chip and Algorithm of a Speech Enhancement for an Automatic Speech Recognition Applied to Telematics Device

  • 김형국
    • 한국ITS학회 논문지 / Vol. 7, No. 5 / pp.90-96 / 2008
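  • This paper presents a single-chip speech enhancement algorithm for speech recognition in telematics devices. The proposed method consists of two stages, echo cancellation and noise reduction. In the first stage, echo is removed by an adaptive filter based on cross-spectral estimation; in the second stage, external background noise is removed by LSA speech estimation based on the generalized gamma distribution, improving speech quality. The single chip implemented on the basis of the proposed algorithm, which requires little computation, showed improved results over existing methods in signal-to-noise ratio and speech recognition evaluations in various noise environments.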

Acoustic Evidence for the Development of Aspiration Feature in Putonghua Stops

  • Han, Ji-Yeon
    • 음성과학 / Vol. 12, No. 3 / pp.201-209 / 2005
  • This study investigated developmental temporal features in Putonghua-speaking children. A total of 212 children between the ages of 2;6 and 6;5 in Shanghai participated. Speech materials were constructed according to the aspiration feature of Putonghua stops, and six words were selected. Voice onset time (VOT) was measured. Non-parametric procedures were employed for all analyses. VOT values across bilabial, alveolar, and velar stops differed significantly between aspirated and unaspirated stops in each age group. The effect of age was significant for unaspirated stops. Each of the Putonghua stops clearly showed a decreasing mean and standard deviation. The overshoot phenomenon of VOT was apparent from the age of 2;6-2;11 to 4;6-4;11. There was high variability in the production of lag time for aspirated stops.

Background Noise Classification in Noisy Speech of Short Time Duration Using Improved Speech Parameter

  • 최재승
    • 한국정보통신학회논문지 / Vol. 20, No. 9 / pp.1673-1678 / 2016
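  • In speech recognition processing, background noise causes speech input to be misjudged as background noise, which lowers the speech recognition rate. Because countermeasures for this kind of noise are not simple, more advanced noise processing techniques are needed. This paper therefore describes an algorithm that distinguishes stationary or non-stationary background noise from speech of short duration in noisy environments. The algorithm uses improved speech feature parameters as an important means of distinguishing different kinds of noise from speech. An algorithm that estimates the type of noise using a multilayer perceptron network is then described. The experiments confirmed that noise and speech can be distinguished.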

Acoustic Characteristics of Stop Consonants in Normal Elderly

  • 유현지;김향희
    • 말소리와 음성과학 / Vol. 7, No. 1 / pp.39-45 / 2015
  • Changes in speech production in the normal elderly might be subtle and gradual. Therefore, an acoustic analysis is appropriate for identifying the effect of aging on speech. For this purpose, this study examined four speech parameters in two age groups, old (mean age 74.57 yrs.) and young (mean age 27.43 yrs.): voice onset time (VOT), VOT range, $f_0$ of the following vowel ($f_0FV$), and $f_0FV$ difference. The results show that, compared to the older group, the younger group demonstrated significantly shorter VOTs for lenis stops and longer VOTs for aspirated stops. In the older group, VOT ranges were relatively broad and consequently overlapped between the phonation types (lenis, fortis, aspirated). The $f_0FV$ values, which form an integral parameter together with VOT, were lower in the older group than in the young group. The $f_0FV$ differences in the old female group were significantly narrower than in the young female group; therefore, a clear distinction became difficult. In conclusion, the contrast in temporal information was obscured, and the domain of glottal information was diminished, for stop consonants in the Korean elderly. The findings suggest that central/peripheral changes due to aging could lead to a deficit in coordination between phonation and articulation.