• Title/Abstract/Keywords: Connected speech

146 search results; processing time 0.021 seconds

연속구어 내 발성 종결-개시의 음향학적 특징 - 말더듬 화자와 비말더듬 화자 비교 - (Acoustic Features of Phonatory Offset-Onset in the Connected Speech between a Female Stutterer and Non-Stutterers)

  • 한지연;이옥분
    • 음성과학 / Vol. 13, No. 2 / pp. 19-33 / 2006
  • The purpose of this paper was to examine the acoustic characteristics of the phonatory offset-onset mechanism in the connected speech of female adults with stuttering and with normal nonfluency. The phonatory offset-onset mechanism refers to the laryngeal articulatory gestures required to mark word boundaries in the phonetic contexts of connected speech. This mechanism comprised 7 patterns identified from the speech spectrogram. The study examined acoustic features in connected speech produced by female adults with stuttering (n=1) and normal nonfluency (n=3). Speech tokens in V_V, V_H, and V_S contexts were selected for analysis. Speech samples were recorded with Sound Forge, and the spectrographic analysis was conducted using Praat. Results revealed that the female speaker who stutters (block type) exhibited more laryngealization gestures in the V_V context. In both the V_H and V_S contexts, laryngealization was characterized more by a complete glottal stop or glottal fry. The results are discussed from theoretical and clinical perspectives.


한국어 연속 숫자음 전화 음성 인식에서의 오인식 유형 분석 (Analysis of Error Patterns in Korean Connected Digit Telephone Speech Recognition)

  • 김민성;정성윤;손종목;배건성;김상훈
    • 대한음성학회지:말소리 / No. 46 / pp. 77-86 / 2003
  • Channel distortion and coarticulation effects in Korean connected digit telephone speech make it difficult to achieve high connected digit recognition performance in the telephone environment. In this paper, as basic research toward improving the recognition performance of Korean connected digit telephone speech, recognition error patterns are investigated and analyzed. The Korean connected digit telephone speech database released by SITEC and the HTK system are used for the recognition experiments. DWFBA and MRTCN are used for feature extraction and channel compensation, respectively. Experimental results are discussed along with our findings.


Visual Presentation of Connected Speech Test (CST)

  • Jeong, Ok-Ran;Lee, Sang-Heun;Cho, Tae-Hwan
    • 음성과학 / Vol. 3 / pp. 26-37 / 1998
  • The Connected Speech Test (CST) was developed to test hearing aid performance using realistic stimuli (connected speech) presented in a background of noise with a visible speaker. The CST has not been investigated as a measure of speechreading ability using only its visual portion. Thirty subjects were administered the 48 test lists of the CST using the visual presentation mode only. Statistically significant differences were found between the 48 test lists and between the 12 passages of the CST (48 passages divided into 12 groups of 4 lists, which were averaged). No significant differences were found between male and female subjects; however, in all but one case, females scored better than males. No significant differences were found between students in communication disorders and students in other departments. Intra- and inter-subject variability across test lists and passages was high. Suggestions for further research include changing the scoring of the CST to be more contextually based and changing the speaker for the CST.


연결 숫자음 인식기 학습용 음성DB 녹음을 위한 최적의 대본 작성 (The Optimal and Complete Prompts Lists for Connected Spoken Digit Speech Corpus)

  • 유하진
    • 대한음성학회:학술대회논문집 / 대한음성학회 May 2003 Conference Proceedings / pp. 131-134 / 2003
  • This paper describes an efficient algorithm for generating compact and complete prompt lists for a connected spoken digit database. To build a connected spoken digit recognizer, speech data must be acquired in various contexts. In many speech databases, however, the lists are produced with random generators. We provide an efficient algorithm that generates compact and complete lists of digits in various contexts. The paper includes a proof of the optimality and completeness of the algorithm.


연결 단어 음성 인식기 학습용 음성DB 녹음을 위한 최적의 대본 작성 알고리즘 (The Optimal and Complete Prompts Lists Generation Algorithm for Connected Spoken Word Speech Corpus)

  • 유하진
    • 한국음향학회지 / Vol. 23, No. 2 / pp. 187-191 / 2004
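  • We propose an algorithm for producing complete and efficient prompt lists when building speech databases for connected word recognizers, particularly connected digit recognizers. The lists used in existing speech databases have mostly been produced with random number generators or from users' telephone numbers, postal codes, and the like, so they do not uniformly cover phonemes or words in diverse contexts. This paper therefore proposes an efficient algorithm that builds lists containing, exactly once each, every combination in which a given word is preceded and followed by every word. Generating 7-digit lists with this algorithm covers all combinations in 200 sentences. The paper presents an example of the algorithm and discusses its completeness and efficiency.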
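
The abstract does not spell out the algorithm, so the following is only a sketch of one way such lists could be built, not the paper's method. It assumes that a "combination" means a (preceding word, word, following word) triple over 10 digit words, and it swaps in a standard de Bruijn sequence construction. The names `de_bruijn` and `prompt_lists` are hypothetical. Under these assumptions, 1000 triples at 5 triples per 7-digit list gives 200 lists, which happens to equal the count reported above.

```python
from collections import Counter


def de_bruijn(k, n):
    """Cyclic de Bruijn sequence B(k, n) via the standard Lyndon-word recursion."""
    a = [0] * (k * n)
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence


def prompt_lists(num_words=10, list_len=7, context=3):
    """Cut a de Bruijn sequence into overlapping prompts (assumed triple contexts)."""
    cyclic = de_bruijn(num_words, context)
    # Unroll the cycle so the last windows can wrap around to the start.
    linear = cyclic + cyclic[:context - 1]
    # Consecutive prompts overlap by context - 1 digits so no triple is lost.
    step = list_len - (context - 1)
    return [linear[i:i + list_len] for i in range(0, len(cyclic), step)]


lists = prompt_lists()
triples = Counter(
    tuple(p[i:i + 3]) for p in lists for i in range(len(p) - 2)
)
print(len(lists), len(triples), set(triples.values()))
```

This sketch ignores the silence contexts at the start and end of each prompt, which a real recording script would probably also need to balance.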

음성장애 연속구어의 음향학적 분석 (A Study of Acoustic Measurement in Connected Speech with Dysphonia)

  • 이명순
    • 말소리와 음성과학 / Vol. 3, No. 4 / pp. 109-115 / 2011
  • The purposes of this study were to identify acoustic parameters of connected speech and to extend the acoustic analysis of dysphonic voice to patients' natural speech as well as sustained vowel phonation. Acoustic parameters of the sentences included the LTAS (long-term average spectrum) mean and spectral slope over frequency ranges such as 0-4 kHz, 0-6 kHz, 0-8 kHz, and 0-12.5 kHz, as well as HNR. Acoustic parameters of the vowel 'a' included jitter, RAP, shimmer, NHR, and HNR. Based on the 'G' of GRBAS for the severity of dysphonia, two experienced raters classified the speakers into four groups: control, mild, moderate, and severe dysphonia. The connected speech consisted of two sentences extracted from the 'stroll' passage. Parameters of the vowel and the LTAS mean of the sentences were measured with CSL. The spectral slope of the sentences and the HNR of the vowel and the sentences were measured with Praat. Data were statistically analyzed with Spearman correlation and the Kruskal-Wallis test using SPSS 12.0. The results were as follows. First, jitter, RAP, shimmer, and NHR differed significantly between the groups. Second, for several frequency ranges, the LTAS mean and spectral slope of the sentences differed significantly between the groups. Third, the HNR of the sentences differed significantly between the groups. Fourth, the HNR and NHR of the vowel were correlated with the HNR of the sentences. Accordingly, this study concluded that LTAS, spectral slope, and HNR are predictive parameters of dysphonic voice in connected speech.


연결발화에서 마비말화자의 음질 특성 (Voice Quality of Dysarthric Speakers in Connected Speech)

  • 서인효;성철재
    • 말소리와 음성과학 / Vol. 5, No. 4 / pp. 33-41 / 2013
  • This study investigated the perceptual and cepstral/spectral characteristics of phonation, and the relationships between them, in the connected speech of speakers with dysarthria. Twenty-two participants were divided into two groups: eleven dysarthric speakers were paired with age- and gender-matched healthy control participants. Three speech pathologists performed a perceptual evaluation using the GRBAS scale, and the cepstral/spectral characteristics of phonation were measured in the two groups' connected speech. Dysarthric speakers were rated significantly worse (with higher ratings) in G (overall dysphonia grade), B (breathiness), and S (strain), while their smoothed cepstral peak prominence (CPPs) was significantly lower. The CPPs was significantly correlated with the perceptual ratings, including G, B, and S. The utility of the CPPs is supported by its strong relationship with perceptually rated dysphonia severity in dysarthric speakers. Receiver operating characteristic (ROC) analysis showed that a CPPs threshold of 5.08 dB achieved good classification of dysarthria, with 63.6% sensitivity and perfect specificity (100%). These results indicate that the CPPs reliably distinguished between healthy controls and dysarthric speakers. However, the CPP frequency (CPP F0) and low-high spectral ratio (L/H ratio) were not significantly different between the two groups.

채널보상기법을 사용한 전화 음성 연속숫자음의 인식 성능향상 (Performance Improvement of Connected Digit Recognition with Channel Compensation Method for Telephone Speech)

  • 김민성;정성윤;손종목;배건성
    • 대한음성학회지:말소리 / No. 44 / pp. 73-82 / 2002
  • Channel distortion degrades the performance of speech recognizers in the telephone environment. It mainly results from the bandwidth limitation and variation of the transmission channel. Variation in channel characteristics usually appears as a baseline shift in the cepstrum domain, so the undesirable effect of channel variation can be removed by subtracting the mean from the cepstrum. In this paper, to improve the recognition performance of Korean connected digit telephone speech, channel compensation methods such as CMN (Cepstral Mean Normalization), RTCN (Real Time Cepstral Normalization), MCMN (Modified CMN), and MRTCN (Modified RTCN) are applied to the static MFCC. MCMN and MRTCN are obtained from CMN and RTCN, respectively, using variance normalization in the cepstrum domain. Using the HTK v3.1 system, recognition experiments are performed on the Korean connected digit telephone speech database released by SITEC (Speech Information Technology & Industry Promotion Center). Experiments show that MRTCN gives the best result, with a recognition rate of 90.11% for connected digits. This corresponds to a performance improvement of 1.72% over MFCC alone, i.e., an error reduction rate of 14.82%.
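
The mean-subtraction idea in this abstract can be illustrated with a minimal sketch. It assumes one utterance is given as a list of per-frame cepstral vectors and that the channel adds a constant offset to every frame. The function names are hypothetical, and the second function shows only generic mean-and-variance normalization, since the exact MCMN and MRTCN formulas are not given here.

```python
from statistics import fmean, pstdev


def cepstral_mean_normalize(frames):
    """Subtract each coefficient's mean over all frames of one utterance."""
    n_coeffs = len(frames[0])
    means = [fmean([f[k] for f in frames]) for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]


def cepstral_mean_variance_normalize(frames, eps=1e-12):
    """Mean subtraction followed by division by each coefficient's std. deviation."""
    n_coeffs = len(frames[0])
    centered = cepstral_mean_normalize(frames)
    stds = [pstdev([f[k] for f in frames]) for k in range(n_coeffs)]
    # eps guards against division by zero for a constant coefficient.
    return [[c[k] / (stds[k] + eps) for k in range(n_coeffs)] for c in centered]


# Toy cepstra (made-up numbers): three frames, two coefficients.
clean = [[1.0, 2.0], [3.0, 0.0], [5.0, 4.0]]
# Assumed channel effect: a constant shift per coefficient in the cepstrum domain.
channel_shift = [0.7, -1.2]
distorted = [[c + s for c, s in zip(f, channel_shift)] for f in clean]

normalized_clean = cepstral_mean_normalize(clean)
normalized_distorted = cepstral_mean_normalize(distorted)
max_diff = max(
    abs(a - b)
    for fa, fb in zip(normalized_clean, normalized_distorted)
    for a, b in zip(fa, fb)
)
print(max_diff)
```

Because the shift is the same in every frame, it moves the mean by the same amount and cancels out, so the clean and distorted utterances normalize to the same values up to floating-point error.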


훈련음성 데이터에 적응시킨 필터뱅크 기반의 MFCC 특징파라미터를 이용한 전화음성 연속숫자음의 인식성능 향상에 관한 연구 (A study on the recognition performance of connected digit telephone speech for MFCC feature parameters obtained from the filter bank adapted to training speech database)

  • 정성윤;김민성;손종목;배건성;강점자
    • 대한음성학회:학술대회논문집 / 대한음성학회 May 2003 Conference Proceedings / pp. 119-122 / 2003
  • In general, triangular filters are used in the filter bank when MFCCs are computed from the spectrum of a speech signal. In [1], a new feature extraction approach is proposed that uses specific filter shapes in the filter bank, obtained from the spectrum of the training speech data. In this approach, principal component analysis is applied to the spectrum of the training data to obtain the filter coefficients. In this paper, we carry out speech recognition experiments using the approach given in [1] on a large amount of telephone speech data, namely the Korean connected digit telephone speech database released by SITEC. Experimental results are discussed along with our findings.


대화체 억양구말 형태소의 경계성조 연구 (Boundary Tones of Intonational Phrase-Final Morphemes in Dialogues)

  • 한선희
    • 음성과학 / Vol. 7, No. 4 / pp. 219-234 / 2000
  • The study of boundary tones in connected speech or dialogues is one of the most underdeveloped areas of Korean prosody. This paper concerns the boundary tones of intonational phrase-final morphemes found in a speech corpus of dialogues. Results of phonetic analysis show that different kinds of boundary tones are realized, depending on the positions of the intonational phrase-final morphemes in the sentences. The study also shows that boundary tone patterning is somewhat related to sentence structure and, for better speech recognition and speech synthesis, presents a simple model of boundary tones based on the fundamental frequency contour. The results of this study will contribute to our understanding of the prosodic patterns of Korean connected speech or dialogues.
