• Title/Summary/Keyword: Normal speech

Search Results: 626

Acoustic Masking Effect That Can Occur with Speech Contrast Enhancement in Hearing Aids (보청기에서 음성 대비 강조에 의해 발생할 수 있는 마스킹 현상)

  • Jeon, Y.Y.;Yang, D.G.;Bang, D.H.;Kil, S.K.;Lee, S.M.
    • Journal of rehabilitation welfare engineering & assistive technology / v.1 no.1 / pp.21-28 / 2007
  • In most hearing aids, amplification algorithms compensate for hearing loss, noise and feedback reduction algorithms suppress interference, and contrast enhancement algorithms improve speech perception. However, if contrast is enhanced excessively, acoustic masking can occur between formants. To confirm this masking effect in speech, six tests were conducted: a pure tone test, a speech reception test, a word recognition test, a pure tone masking test, a formant pure tone masking test, and a speech masking test; the LLR was introduced for objective evaluation. Comparing normal-hearing and hearing-impaired subjects, more masking occurred in the hearing-impaired subjects with pure tones, and in the speech masking test speech reception was also lower for the hearing-impaired subjects. This means that the acoustic masking effect, rather than distortion, influences speech perception. It is therefore necessary to measure the characteristics of the masking effect before fitting a hearing aid and to apply these characteristics to the fitting curve.


New Parameter on Speech and EGG; Glottal Closure Delay Ratio (음성신호와 전기성문파를 이용하는 새로운 매개변수 ; 성대 폐쇄 지연비율(Glottal Closure Delay Ratio))

  • Choi, Jong-Min;Kwon, Tack-Kyun;Jung, Eun-Jung;Lee, Myung-Chul;Kim, Kwang-Hyun;Sung, Myung-Whun;Park, Kwang-Suk
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.18 no.1 / pp.22-25 / 2007
  • Background and Objectives: Biomedical signals such as speech, the electroglottograph (EGG), and airflow are commonly used to diagnose laryngeal function, but in most cases these signals are analyzed separately. Here we propose a new interchannel parameter, the Glottal Closure Delay Ratio (GCDR), estimated from speech and EGG measured simultaneously. Materials and Method: Speech and EGG signals were recorded simultaneously from 13 normal subjects and 39 patients; the patient data comprised 16 vocal fold polyps and 23 vocal fold palsies. The time difference between the glottal closing instant on the EGG and the first maximum peak of the speech signal within a pitch period was calculated, with the glottal closing instant defined as the maximum peak of the first derivative of the EGG signal (dEGG). Results: The standard deviation and jitter were calculated from 20-30 GCDR values extracted from each recording, and both differed significantly between the normal and vocal fold paralysis groups. Conclusion: The GCDR may be the first index reflecting both speech and EGG characteristics, and its perturbation differs significantly between normal subjects and patients with vocal fold paralysis.

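The delay measurement described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the glottal closing instant is the argmax of dEGG within one pitch period, takes the global maximum of the speech slice as the "first maximum peak", and normalises the delay by the period length (the abstract calls the parameter a ratio but does not spell out the normalisation).

```python
import numpy as np

def gcdr(speech, egg, period_bounds):
    """Sketch of the Glottal Closure Delay Ratio (GCDR) described above.

    speech, egg   : simultaneously recorded signals (same length/sample rate)
    period_bounds : (start, end) sample indices of one pitch period

    Assumption: the delay is expressed as a fraction of the pitch period.
    """
    s0, s1 = period_bounds
    degg = np.diff(egg[s0:s1])          # first derivative of the EGG signal
    t_close = np.argmax(degg)           # glottal closing instant (max of dEGG)
    t_peak = np.argmax(speech[s0:s1])   # main speech peak within the period
    return (t_peak - t_close) / (s1 - s0)
```

In the study, 20-30 such values per recording were reduced to a standard deviation and a jitter-like perturbation measure, which is what separated the normal and paralysis groups.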

Study of Emotion in Speech (감정변화에 따른 음성정보 분석에 관한 연구)

  • 장인창;박미경;김태수;박면웅
    • Proceedings of the Korean Society of Precision Engineering Conference / 2004.10a / pp.1123-1126 / 2004
  • Recognizing emotion in speech requires a large spoken-language corpus covering not only different emotional states but also individual languages. In this paper we focus on how speech signals change with emotion. We compared speech features such as formants and pitch across four emotions (normal, happiness, sadness, anger). In Korean, pitch data on monophthongs changed with each emotion. We therefore suggest analysis techniques that use these features to recognize emotions in Korean.


On a Reduction of Computation Time of FFT Cepstrum (FFT 켑스트럼의 처리시간 단축에 관한 연구)

  • Jo, Wang-Rae;Kim, Jong-Kuk;Bae, Myung-Jin
    • Speech Sciences / v.10 no.2 / pp.57-64 / 2003
  • Cepstrum coefficients are among the most popular features for speech recognition and speaker recognition. They are also used for speech synthesis and speech coding, but have the major drawback of long processing time. In this paper we propose a new method that reduces the processing time of FFT cepstrum analysis: normal-ordered inputs are used for the FFT and bit-reversed inputs for the IFFT, so the bit-reversing step can be omitted and the processing time of FFT cepstrum analysis is reduced.

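The FFT cepstrum the paper speeds up is the inverse FFT of the log magnitude spectrum. A straightforward sketch is below; note that NumPy's FFT handles bit-reversal internally, so this does not reproduce the paper's optimisation, which pairs a natural-order-input (decimation-in-frequency) FFT with a bit-reversed-input (decimation-in-time) IFFT so that the two reordering passes cancel and can be skipped.

```python
import numpy as np

def real_cepstrum(x, eps=1e-12):
    """FFT cepstrum of a frame x: IFFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(x)
    log_mag = np.log(np.abs(spectrum) + eps)   # eps guards against log(0)
    return np.fft.ifft(log_mag).real
```

The zeroth cepstral coefficient is the mean of the log magnitude spectrum; low-order coefficients capture the spectral envelope used as recognition features.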

A study on the Visible Speech Processing System for the Hearing Impaired (청각 장애자를 위한 시각 음성 처리 시스템에 관한 연구)

  • 김원기;김남현
    • Journal of Biomedical Engineering Research / v.11 no.1 / pp.75-82 / 1990
  • The purpose of this study is to support speech training for the hearing impaired with a visible speech processing system. In brief, the system converts features of the speech signal into graphics on a monitor so that the hearing-impaired speaker can adjust those features toward normal ones. The features used are formants and pitch, extracted with digital signal processing techniques such as linear prediction and the AMDF (Average Magnitude Difference Function). Features that are easy to read visually are being studied so that the abnormal speech of the hearing impaired can be trained effectively.

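The AMDF mentioned above estimates pitch from the dip that appears when a frame is compared with a lagged copy of itself near the pitch period. A minimal sketch, assuming a single voiced frame; real systems add windowing, voicing decisions, and peak interpolation:

```python
import numpy as np

def amdf_pitch(frame, fs, fmin=60, fmax=400):
    """Estimate F0 (Hz) of a voiced frame via the AMDF.

    The average magnitude difference |x[n] - x[n - lag]| is smallest when
    the lag matches the pitch period; the search is limited to [fmin, fmax].
    """
    lags = np.arange(int(fs / fmax), int(fs / fmin) + 1)
    amdf = np.array([np.mean(np.abs(frame[lag:] - frame[:-lag]))
                     for lag in lags])
    best_lag = lags[np.argmin(amdf)]   # deepest valley = pitch period
    return fs / best_lag
```

For the visible-speech display, such an estimate per frame yields the pitch contour drawn on the monitor, alongside formants from linear prediction.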

Acoustic Characteristics of Some Vowels Produced by the CI Children of Various Age Groups (인공와우 이식 시기에 따른 모음의 음향음성학적 특성)

  • Kim, Go-Eun;Ko, Do-Heung
    • Speech Sciences / v.14 no.4 / pp.203-212 / 2007
  • This study compared acoustic characteristics of vowels produced by children with cochlear implants (CI) and children with normal hearing. Twenty CI subjects under ten years of age were classified into two groups: children implanted under four years of age and children implanted over four years of age. Twenty subjects with normal hearing also participated. Acoustic parameters including the fundamental frequency (F0) and formant frequencies (F1, F2) were measured in the two groups according to the age of cochlear implantation. For the CI group, the three corner vowels (/a/, /i/, /u/) were each recorded five times in isolation and analyzed with Multi-Speech (Kay Elemetrics, model 3700), and two independent t-tests on the formant data were conducted with SPSS 11.5. The group implanted over four years of age showed significant differences in F0 and F1 compared with both the group implanted under four and the normal hearing group; the values for children implanted under four were closer to those of children with normal hearing. For F2 there was no significant difference among the implanted groups. However, the vowel space of the implanted groups, regardless of the age of operation, was much smaller than that of the normal hearing children. These acoustic results suggest that CI surgery is much more effective when performed before the age of four.


Analysis of Feature Extraction Methods for Distinguishing the Speech of Cleft Palate Patients (구개열 환자 발음 판별을 위한 특징 추출 방법 분석)

  • Kim, Sung Min;Kim, Wooil;Kwon, Tack-Kyun;Sung, Myung-Whun;Sung, Mee Young
    • Journal of KIISE / v.42 no.11 / pp.1372-1379 / 2015
  • This paper presents an analysis of feature extraction methods for distinguishing the speech of patients with cleft palates from that of people with normal palates. It is a basic study toward a software system for the automatic recognition and restoration of disordered speech, in pursuit of improving the welfare of speech-disabled persons. Monosyllabic voice data were collected for three groups: normal speech, cleft palate speech, and simulated cleft palate speech. The data consist of 14 basic Korean consonants, 5 complex consonants, and 7 vowels. Features were extracted with three well-known methods (LPC, MFCC, and PLP), and pattern recognition was performed with a GMM acoustic model. From our experiments we conclude that MFCC is generally the most effective method for identifying speech distortions. These results may contribute to the automatic detection and correction of the distorted speech of cleft palate patients, along with the development of a tool for rating levels of speech distortion.
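The feature-plus-GMM pipeline in this abstract (and the pathological-voice study below) follows a common pattern: fit one Gaussian mixture per class over feature frames and pick the class with the highest log likelihood. A minimal stand-in using a single diagonal-covariance Gaussian per class, i.e. a one-mixture GMM; the actual systems fit several mixtures with EM over MFCC frames:

```python
import numpy as np

class DiagGaussianModel:
    """One diagonal-covariance Gaussian per class (a one-mixture GMM)."""

    def fit(self, X):
        # X: (n_frames, n_features) feature matrix for one class
        self.mu = X.mean(axis=0)
        self.var = X.var(axis=0) + 1e-6    # variance floor avoids log(0)
        return self

    def log_likelihood(self, X):
        # Per-frame sum of per-dimension Gaussian log densities
        ll = -0.5 * (np.log(2 * np.pi * self.var)
                     + (X - self.mu) ** 2 / self.var)
        return ll.sum(axis=1)

def classify(frames, models):
    """Return the index of the class model with the highest total log likelihood."""
    scores = [m.log_likelihood(frames).sum() for m in models]
    return int(np.argmax(scores))
```

In the papers, `frames` would be MFCC (or LPC/PLP, or jitter/shimmer-style) vectors, and one model is trained per category (normal vs. cleft palate, normal vs. pathological).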

Temporal Variation Due to Tense vs. Lax Consonants in Korean

  • Yun, Il-Sung
    • Speech Sciences / v.11 no.3 / pp.23-36 / 2004
  • Many languages show an inverse durational relation between a vowel and the following voiced/voiceless (lax/tense) consonant. This study investigated the effects of phoneme type (tense vs. lax) on the timing structure (duration of syllable, word, phrase and sentence) of Korean. Three rates of speech (fast, normal, slow) were applied to stimuli containing the target word /a-Ca/, where /C/ is one of /p, p', pʰ/. The type (tense/lax) of /C/ caused marked inverse durational variations in the two syllables /a/ and /Ca/ and highly different durational ratios between them. Words with /p', pʰ/ were significantly longer than those with /p/, which contrasts with many other languages where such word pairs have similar durations. The differences between words persisted up to the phrase and sentence level, but in general the higher linguistic units did not differ statistically within each level; the phrase is therefore suggested as the compensatory unit for phoneme-type effects in Korean. Different speech rates did not affect this general tendency. The distribution of timing variation (from normal to fast and to slow) across the syllables /a/ and /Ca/ was also observed.


Performance of GMM and ANN as a Classifier for Pathological Voice

  • Wang, Jianglin;Jo, Cheol-Woo
    • Speech Sciences / v.14 no.1 / pp.151-162 / 2007
  • This study classifies pathological voice using a GMM (Gaussian Mixture Model) and compares the results with previous work based on an ANN (Artificial Neural Network). Speech data from normal subjects and patients were collected, diagnosed, and divided into two categories. Six characteristic parameters were chosen: jitter, shimmer, NHR, SPI, APQ and RAP. Classifiers based on the artificial neural network and the Gaussian mixture model were then used to discriminate normal from pathological speech. The GMM attained a 98.4% average correct classification rate on training data and 95.2% on test data. Mixture numbers from 3 to 15 were tried in order to find an optimal configuration, and the average classification rates of GMM, ANN and HMM were compared. The proper number of Gaussian mixtures will be investigated in future work.


The Comparison of Pitch Production Between Children with Cochlear Implants and Normal Hearing Children

  • Yoo, Hyun-Soo;Ko, Do-Heung
    • Speech Sciences / v.15 no.1 / pp.87-98 / 2008
  • This study compares the pitch production of children using cochlear implants (CI) with that of children with normal hearing. Twenty subjects from six to eight years of age participated. Three kinds of sentences were read and analyzed with Visi-Pitch (Kay Elemetrics, Model 3300). There were no considerable differences between the two groups in pitch, mean fundamental frequency (F0), or pitch range; for the slope of F0 and for duration, however, the differences were significant. Thus, duration and pitch control can be crucial factors in planning intonation treatment for children with cochlear implants.
