• Title/Summary/Keyword: Speech Analysis

Search Result 1,585, Processing Time 0.029 seconds

Acoustic Characteristics of Patients with Maxillary Complete Dentures (상악 총의치 장착 환자 언어의 음향학적 특성 연구)

  • Ko, Sok-Min;Hwang, Byung-Nam
    • Speech Sciences
    • /
    • v.8 no.4
    • /
    • pp.139-156
    • /
    • 2001
  • Speech intelligibility in patients with complete dentures is an important clinical problem depending on the material used. The objective of this study was to investigate the speech of two edentulous subjects fitted with a complete maxillary prosthesis made of two different palatal materials: chrome-cobalt alloy and acrylic resin. Three patients with complete dentures in the experiment group and ten people in the controls groups participated in the experiment. CSL, Visi-Pitch were used to measure speech characteristics. The test words consisted of a simple vowel /e/, meaningless three syllabic words containing fricative, affricated and stops sounds, and sustained fricative sounds /s/ and /$\int$/. The analysis speech parameters were vowel and lateral formants, VOT, sound durations, sound pressure level and fricative frequency. Data analysis was conducted by a series of paired T-test. The findings like the following: (1) Vowel formant one of patients with complete denture is higher than that of the control group (p<0.05), while lateral formant three of patients with complete denture is lower than that of the control group (p<0.0l). (2) Patients with complete denture produced lower speech intelligibility with low fricative frequency (/$\int$/) than control group (p<0.0). The speech intelligibility of patients with metal prosthesis was higher than that of those with resin prosthesis (p<0.05). (3) Fricative, lateral and stop sound durations of patients with complete denture were longer than those of the control group (p<0.01 and p<0.05), respectively. Total sound durations of patients with metal prosthesis were similar to that of the control group (p<0.05), while those with resin prosthesis had a shorter duration (p<0.01). This implied that those with metal prosthesis had higher speech intelligibility than those with resin prosthesis. (4) Patients with complete denture had higher sound pressure levels /t/ and /c/ than the control group (p<0.01). However, sound pressure levels for /c/ of patients with metal prosthesis or resin prosthesis was similar to the control group (p<0.05). (5) Patients with complete denture had higher fundamental frequency than the control group (p<0.01).

  • PDF

Research on Construction of the Korean Speech Corpus in Patient with Velopharyngeal Insufficiency (구개인두부전증 환자의 한국어 음성 코퍼스 구축 방안 연구)

  • Lee, Ji-Eun;Kim, Wook-Eun;Kim, Kwang Hyun;Sung, Myung-Whun;Kwon, Tack-Kyun
    • Korean Journal of Otorhinolaryngology-Head and Neck Surgery
    • /
    • v.55 no.8
    • /
    • pp.498-507
    • /
    • 2012
  • Background and Objectives We aimed to develop a Korean version of the velopharyngeal insufficiency (VPI) speech corpus system. Subjects and Method After developing a 3-channel simultaneous speech recording device capable of recording nasal/oral and normal compound speech separately, voice data were collected from VPI patients aged more than 10 years with/without the history of operation or prior speech therapy. This was compared to a control group for which VPI was simulated by using a french-3 nelaton tube inserted via both nostril through nasopharynx and pulling the soft palate anteriorly in varying degrees. The study consisted of three transcriptors: a speech therapist transcribed the voice file into text, a second transcriptor graded speech intelligibility and severity and the third tagged the types and onset times of misarticulation. The database were composed of three main tables regarding (1) speaker's demographics, (2) condition of the recording system and (3) transcripts. All of these were interfaced with the Praat voice analysis program, which enables the user to extract exact transcribed phrases for analysis. Results In the simulated VPI group, the higher the severity of VPI, the higher the nasalance score was obtained. In addition, we could verify the vocal energy that characterizes hypernasality and compensation in nasal/oral and compound sounds spoken by VPI patients as opposed to that characgerizes the normal control group. Conclusion With the Korean version of VPI speech corpus system, patients' common difficulties and speech tendencies in articulation can be objectively evaluated. Comparing these data with those of the normal voice, mispronunciation and dysarticulation of patients with VPI can be corrected.

Characteristics of speech rate and pause in children with spastic cerebral palsy and their relationships with speech intelligibility (경직형 뇌성마비 아동의 하위그룹별 말속도와 쉼의 특성 및 말명료도와의 관계)

  • Jeong, Pil Yeon;Sim, Hyun Sub
    • Phonetics and Speech Sciences
    • /
    • v.12 no.3
    • /
    • pp.95-103
    • /
    • 2020
  • The current study aimed to identify the characteristics of speech rate and pause in children with spastic cerebral palsy (CP) and their relationships with speech intelligibility. In all, 26 children with CP, 4 with no speech motor involvement and age-appropriate language ability (NSMI-LCT), 6 with no speech motor involvement and impaired language ability (NSMI-LCI), 6 with speech motor involvement and age-appropriate language ability (SMI- LCT), and 10 with speech motor involvement and impaired language ability (SMI-LCI) participated in the study. Speech samples for the speech rate and pause analysis were extracted using a sentence repetition task. Acoustic analysis were made in Praat. First, it was found that regardless of the presence of language impairment, significant group differences between the NSMI and SMI groups were found in speech rate and articulation rate. Second, the SMI groups showed a higher ratio of pause time to sentence production time, more frequent pauses, and longer durations of pauses than the NSMI groups. Lastly, there were significant correlations among speech rate, articulation rate, and intelligibility. These findings suggest that slow speech rate is the main feature in SMI groups, and that both speech rate and articulation rate play important roles in the intelligibility of children with spastic CP.

A Study on Image Recommendation System based on Speech Emotion Information

  • Kim, Tae Yeun;Bae, Sang Hyun
    • Journal of Integrative Natural Science
    • /
    • v.11 no.3
    • /
    • pp.131-138
    • /
    • 2018
  • In this paper, we have implemented speeches that utilized the emotion information of the user's speech and image matching and recommendation system. To classify the user's emotional information of speech, the emotional information of speech about the user's speech is extracted and classified using the PLP algorithm. After classification, an emotional DB of speech is constructed. Moreover, emotional color and emotional vocabulary through factor analysis are matched to one space in order to classify emotional information of image. And a standardized image recommendation system based on the matching of each keyword with the BM-GA algorithm for the data of the emotional information of speech and emotional information of image according to the more appropriate emotional information of speech of the user. As a result of the performance evaluation, recognition rate of standardized vocabulary in four stages according to speech was 80.48% on average and system user satisfaction was 82.4%. Therefore, it is expected that the classification of images according to the user's speech information will be helpful for the study of emotional exchange between the user and the computer.

A Study on Combining Bimodal Sensors for Robust Speech Recognition (강인한 음성인식을 위한 이중모드 센서의 결합방식에 관한 연구)

  • 이철우;계영철;고인선
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.6
    • /
    • pp.51-56
    • /
    • 2001
  • Recent researches have been focusing on jointly using lip motions and speech for reliable speech recognitions in noisy environments. To this end, this paper proposes the method of combining the visual speech recognizer and the conventional speech recognizer with each output properly weighted. In particular, we propose the method of autonomously determining the weights, depending on the amounts of noise in the speech. The correlations between adjacent speech samples and the residual errors of the LPC analysis are used for this determination. Simulation results show that the speech recognizer combined in this way provides the recognition performance of 83 % even in severely noisy environments.

  • PDF

Analysis of the Relationship Between Sasang Constitutional Groups and Speech Features Based on a Listening Evaluation of Voice Characteristics (목소리 특성의 청취 평가에 기초한 사상체질과 음성 특징의 상관관계 분석)

  • Kwon, Chulhong;Kim, Jongyeol;Kim, Keunho;Jang, Junsu
    • Phonetics and Speech Sciences
    • /
    • v.4 no.4
    • /
    • pp.71-77
    • /
    • 2012
  • Sasang constitution experts utilize voice characteristics as an auxiliary measure for deciding a person's constitutional group. This study aims at establishing a relationship between speech features and the constitutional groups by subjective listening evaluation of voice characteristics. A speech database of 841 speakers whose constitutional groups have been already diagnosed by Sasang constitution experts was constructed. Speech features related to speech source and vocal tract filter were extracted from five vowels and one sentence. Statistically significant speech features for classifying the groups were analyzed using SPSS. The features contributed to constitution classification were speaking rate, Energy, A1, A2, A3, H1, H2, H4, CPP for males in their 20s, F0_mean, CPP, SPI, HNR, Shimmer, Energy, A1, A2, A3, H1, H2, H4 for females in their 20s, Energy, A1, A2, A3, H1, H2, H4, CPP for male in the 60s, and Jitter, HNR, CPP, SPI for females in their 60s. Experimental results show that speech technology is useful in classifying constitutional groups.

Implementation of Variable Threshold Dual Rate ADPCM Speech CODEC Considering the Background Noise (배경잡음을 고려한 가변임계값 Dual Rate ADPCM 음성 CODEC 구현)

  • Yang, Jae-Seok;Han, Kyong-Ho
    • Proceedings of the KIEE Conference
    • /
    • 2000.07d
    • /
    • pp.3166-3168
    • /
    • 2000
  • This paper proposed variable threshold dual rate ADPCM coding method which is modified from the standard ADPCM of ITU G.726 for speech quality improvement. The speech quality of variable threshold dual rate ADPCM is better than single rate ADPCM at noisy environment without increasing the complexity by using ZCR(Zero Crossing Rate). In this case, ZCR is used to divide input signal samples into two categories(noisy & speech). The samples with higher ZCR is categorized as the noisy region and the samples with lower ZCR is categorized as the speech region. Noisy region uses higher threshold value to be compressed by 16Kbps for reduced bit rates and the speech region uses lower threshold value to be compressed by 40Kbps for improved speech quality. Comparing with the conventional ADPCM, which adapts the fixed coding rate. the proposed variable threshold dual rate ADPCM coding method improves noise character without increasing the bit rate. For real time applications, ZCR calculation was considered as a simple method to obtain the background noise information for preprocess of speech analysis such as FFT and the experiment showed that the simple calculation of ZCR can be used without complexity increase. Dual rate ADPCM can decrease the amount of transferred data efficiently without increasing complexity nor reducing speech quality. Therefore result of this paper can be applied for real-time speech application such as the internet phone or VoIP.

  • PDF

A Correlation Study among Pitch, Nasalance, and Voice Quality (정상 성인의 음도, 비성도, 음질 간의 상관 연구)

  • Park, Sung-Jong;Yoo, Jae-Yeon
    • Phonetics and Speech Sciences
    • /
    • v.1 no.4
    • /
    • pp.159-163
    • /
    • 2009
  • The purpose of this study is to conduct a correlational analysis among pitch, nasalance, and acoustic quality parameters estimated by two speech analysis softwares NasalView(version 1.31), Dr. Speech 4.5(Tiger Electronics). Thirty females and 25 males with normal voice participated in the study. The Pearson correlation coefficient was determined through a statistical analysis. The results came out as follows; Firstly, there was a correlation between $F_0$ and voice quality parameters, however there was no correlation between $F_0$ and nasalance. Secondly, nasalance showed a correlation with voice quality parameters.

  • PDF

The Voice Dialing System Using Dynamic Hidden Markov Models and Lexical Analysis (DHMM과 어휘해석을 이용한 Voice dialing 시스템)

  • 최성호;이강성;김순협
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.28B no.7
    • /
    • pp.548-556
    • /
    • 1991
  • In this paper, Korean spoken continuous digits are ercognized using DHMM(Dynamic Hidden Markov Model) and lexical analysis to provide the base of developing voice dialing system. After segmentation by phoneme unit, it is recognized. This system can be divided into the segmentation section, the design of standard speech section, the recognition section, and the lexical analysis section. In the segmentation section, it is segmented using the ZCR, O order LPC cepstrum, and Ai, parameter of voice speech dectaction, which is changed according to time. In the standard speech design section, 19 phonemes or syllables are trained by DHMM and designed as a standard speech. In the recognition section, phomeme stream are recognized by the Viterbi algorithm.In the lexical decoder section, finally recognized continuous digits are outputed. This experiment shiwed the recognition rate of 85.1% using data spoken 7 times of 21 classes of 7 continuous digits which are combinated all of the occurence, spoken by 10 man.

  • PDF

A Study on the Development of Inflected Words of Korean based on the analysis of 3 to 8 year-old Children's speech (3세에서 8세 아동의 용언 발달 연구)

  • Choi Eunah;Shin Jiyoung;Kim Soojin
    • Proceedings of the KSPS conference
    • /
    • 2003.10a
    • /
    • pp.89-93
    • /
    • 2003
  • The aim of this paper is to investigate the development of inflected words of Korean based on the analysis of 3 to 8 year-old children's spontaneous speech. For this purpose, the authors transcribe the spontaneous speech of 10 Korean children for each age and classified inflected word. The result of the analysis is as follows : $\circled1$ In the verbs simple words are occupied 62%, derivative words 18% and complex words 20%. In the adjectives simple words are 82%, derivative words 7% and complex words 11%. $\circled2$ The children's getting older, derivative and complex words are increased, in spite of simple words are reduced. $\circled3$ 4 year-old children get to start the ability of word formation and then since the children become 8 year-old, the children complete that ability almost all we think.

  • PDF