• Title/Summary/Keyword: plosive

Search Result 34

A Phonetic Study of Russian Soft Plosives (러시아어 파열음에 나타나는 연자음의 음향음성학적 연구)

  • Byun, Koon-Hyuk
    • MALSORI / no.61 / pp.15-29 / 2007
  • The present study investigates acoustic cues of Russian soft plosive consonants. In previous studies, Russian soft consonants were distinguished from hard consonants by F1 and F2 of the following vowels. The results showed (1) that F0 of the vowels following soft plosive consonants was lower than that of the vowels following hard plosive consonants, and (2) that the VOT of soft plosive consonants was longer than that of hard plosive consonants. Hence, the present study shows that, in addition to F1 and F2, VOT and F0 serve as acoustic cues that differentiate soft plosive consonants from hard plosive consonants in Russian.

A STUDY ON THE IMPLEMENTATION OF ARTIFICIAL NEURAL NET MODELS WITH FEATURE SET INPUT FOR RECOGNITION OF KOREAN PLOSIVE CONSONANTS (한국어 파열음 인식을 위한 피쳐 셉 입력 인공 신경망 모델에 관한 연구)

  • Kim, Ki-Seok;Kim, In-Bum;Hwang, Hee-Yeung
    • Proceedings of the KIEE Conference / 1990.07a / pp.535-538 / 1990
  • The main problem in speech recognition is the enormous variability in acoustic signals due to complex but predictable contextual effects. For plosive consonants in particular, it is very difficult to find invariant cues because of these contextual effects, yet humans use them as helpful information in plosive consonant recognition. In this paper we experimented with three artificial neural net models for the recognition of plosive consonants. Model I used a "Multi-layer Perceptron". Model II used a variation of the "Self-organizing Feature Map Model". Model III used an "Interactive and Competitive Model" to examine contextual effects. The recognition experiment was performed on 9 Korean plosive consonants. We used VCV speech chains for the experiment on contextual effects. The speech chains consist of the Korean plosive consonants /g, d, b, K, T, P, k, t, p/ (/ㄱ, ㄷ, ㅂ, ㄲ, ㄸ, ㅃ, ㅋ, ㅌ, ㅍ/) and eight Korean monophthongs. The inputs to the neural net models were several temporal cues (duration of the silence, transition, and VOT), the extent of the VC formant transitions, the presence of voicing energy during closure, burst intensity, presence of aspiration, amount of low-frequency energy present at voicing onset, and the extent of the CV formant transitions, all taken from the acoustic signals. Model I showed a recognition rate of about 55-67%, Model II about 60%, and Model III about 67%.

A Study On The Automatic Discrimination Of The Korean Alveolar Stops (한국어 파열음의 자동 인식에 대한 연구 : 한국어 치경 파열음의 자동 분류에 관한 연구)

  • Choi, Yun-Seok;Kim, Ki-Seok;Hwang, Hee-Yeung
    • Proceedings of the KIEE Conference / 1987.11a / pp.330-333 / 1987
  • This paper is a study on the automatic discrimination of the Korean alveolar stops. In Korean, an automatic speech recognition system must discriminate aspirated/tense plosives, because Korean speakers distinguish aspirated/tense plosive allophones from tense and lax plosives. In order to detect acoustic cues for automatic recognition of [ㄲ, ㄸ, ㅃ], we experimented with the discrimination of [ㄷ, ㄸ, ㅌ]. As acoustic parameters we used temporal cues such as VOT and silence duration, and energy cues such as the ratio of high-frequency energy to low-frequency energy. VCV speech data, where V is one of the 8 simple vowels and C is one of the 3 alveolar stops, were used for the experiments. Experiments were run on 192 speech samples, and the recognition rate was about 82%-95%.

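The count of 192 speech samples in the abstract above is consistent with every vowel-stop-vowel combination of 8 vowels and 3 stops (8 × 3 × 8 = 192). The following minimal sketch checks that arithmetic. It assumes both vowel positions vary independently, which the abstract does not state explicitly; the vowel labels `V1`..`V8` and the function name `vcv_chains` are hypothetical placeholders, since the abstract lists only the three stops.

```python
from itertools import product

# Hypothetical vowel labels: the abstract gives only the count (8 simple vowels).
VOWELS = [f"V{i}" for i in range(1, 9)]
# The three alveolar stops named in the abstract.
STOPS = ["ㄷ", "ㄸ", "ㅌ"]


def vcv_chains(vowels, consonants):
    """Return every (V1, C, V2) triple, assuming both vowels vary independently."""
    return list(product(vowels, consonants, vowels))


chains = vcv_chains(VOWELS, STOPS)
print(len(chains))  # 8 * 3 * 8 combinations
```

The same counting applied to 4 vowels and 9 plosives gives 4 × 9 × 4 = 144, which matches the number of VCV chains reported in the recurrent neural network abstract in this listing.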

Acoustic Properties Associated with the Plosive Production of Adults with Cerebral Palsy (경직형과 불수의운동형 뇌성마비 성인의 파열음 산출의 음향음성학적 특성)

  • Kim, Jung-Yeon;Hwang, Min-A;Park, Chang-Il;Zhi, Min-Je
    • Speech Sciences / v.8 no.3 / pp.209-224 / 2001
  • The purpose of this study was to identify the acoustic properties of nine word-initial Korean plosives in the speech of adults with cerebral palsy. Normal adults and two groups of adults with cerebral palsy (an athetoid group and a spastic group) participated in this study. The speech material included monosyllabic CVC real-word pairs. Among the various acoustic properties of plosives, the aspiration duration was measured. Adults with cerebral palsy exhibited different patterns of aspiration duration for triplets of Korean plosives compared to normal adults. In addition, the plosive production of the spastic group was distinguished from that of the athetoid group. Such acoustic characteristics of the plosives of adults with cerebral palsy may negatively affect the intelligibility of their speech.

Acoustic Comparisons of Vowel and Plosive Productions between the Normal and the Hearing-Impaired Children (청각장애아동과 건청아동의 모음 및 파열음 산출의 음향음성학적 특성 비교)

  • Oh, Y.J.;Zhi, M.Z.;Kim, Y.T.
    • Speech Sciences / v.7 no.2 / pp.51-70 / 2000
  • Twenty normal and 20 severe-to-profound hearing-impaired subjects participated in the present study. The two groups were matched by chronological age. Each subject recorded the three vowels /i/, /a/, and /u/, and nine $VC_{plosive}V$ (hereafter, VCV) disyllables, /epe/, /ep'e/, /epʰe/, /ete/, /et'e/, /etʰe/, /eke/, /ek'e/, and /ekʰe/, five times each. The formant frequencies $F_1$, $F_2$, and $F_3$ were measured for the three vowels, and six measures were made for the nine disyllables. The six measures were (1) the total duration of the disyllable, (2) the duration of the first vowel, (3) the duration of the closed period, (4) the ratio of the first vowel to the first vowel plus the closure period of the consonant, (5) the duration of the aspiration, and (6) the duration of the second vowel. The results show that the three formants and each of the measures were significantly different between the two groups of subjects.

A Study on a Searching, Extraction and Approximation-Synthesis of Transition Segment in Continuous Speech (연속음성에서 천이구간의 탐색, 추출, 근사합성에 관한 연구)

  • Lee, Si-U
    • The Transactions of the Korea Information Processing Society / v.7 no.4 / pp.1299-1304 / 2000
  • In a speech coding system that uses voiced and unvoiced excitation sources, speech quality is distorted when voiced and unvoiced consonants coexist in a frame. I therefore propose a TSIUVC (Transition Segment Including UnVoiced Consonant) searching, extraction, and approximation-synthesis method so that voiced and unvoiced consonants do not coexist in a frame. This method is based on the zero-crossing rate and a pitch detector using an FIR-STREAK digital filter. As a result, the extraction rates of TSIUVC are 84.8% (plosive), 94.9% (fricative), and 92.3% (affricate) in female voices, and 88% (plosive), 94.9% (fricative), and 92.3% (affricate) in male voices. Also, I obtain high-quality approximation-synthesis waveforms within TSIUVC by using frequency information below 0.547 kHz and above 2.813 kHz. This method can be applied to low-bit-rate speech coding, speech analysis, and speech synthesis.

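The abstract above names the zero-crossing rate as one of the two signals used to locate segments containing unvoiced consonants. The following is a minimal sketch of a per-frame zero-crossing rate only, not the author's method: the FIR-STREAK pitch detector is not reproduced, and the sampling rate, frame length, and synthetic test signals are all assumptions chosen for illustration.

```python
import math


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


# Assumed values: 8 kHz sampling and a 20 ms frame (160 samples).
SAMPLE_RATE = 8000
FRAME_LEN = 160

# A 100 Hz sine stands in for a voiced frame; a sign-alternating
# sequence stands in for high-frequency, noise-like frication.
voiced_like = [
    math.sin(2 * math.pi * 100 * n / SAMPLE_RATE) for n in range(FRAME_LEN)
]
noise_like = [(-1) ** n * 0.1 for n in range(FRAME_LEN)]

zcr_voiced = zero_crossing_rate(voiced_like)
zcr_noise = zero_crossing_rate(noise_like)
print(zcr_voiced < zcr_noise)
```

Unvoiced, noise-like frames tend to cross zero far more often than voiced frames, which is why a threshold on this rate is a common first cue for voiced/unvoiced decisions. Any actual threshold would have to be tuned on real speech.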

Korean plain plosive produced by Chinese female speakers: Sentence vs. Paragraph (중국인 여성 화자의 한국어 평음 파열음 발음: 독립 문장과 문단의 비교)

  • Jiang, Pan;Kim, Ji-Eun;Lee, Choong-Woo
    • Phonetics and Speech Sciences / v.7 no.2 / pp.111-117 / 2015
  • The purpose of this study is to investigate how Chinese learners of Korean produce Korean plain plosives differently in a reading passage and in isolated sentences. There are several studies on Korean plosives produced by Chinese speakers, but studies comparing production in a reading passage with production in isolated sentences are rare. For this purpose, ten Chinese speakers' VOT values for Korean plain plosives were measured using Speech Analyzer. The results show that there is no significant difference between plain plosive production in a reading passage and in isolated sentences. In further studies, pitch should be measured along with VOT.

A Study on the Speech Recognition of Korean Phonemes Using Recurrent Neural Network Models (순환 신경망 모델을 이용한 한국어 음소의 음성인식에 대한 연구)

  • 김기석;황희영
    • The Transactions of the Korean Institute of Electrical Engineers / v.40 no.8 / pp.782-791 / 1991
  • In pattern recognition fields such as speech recognition, several new techniques using Artificial Neural Network Models have been proposed and implemented. In particular, the Multilayer Perceptron Model has been shown to be effective in static speech pattern recognition. However, speech has dynamic or temporal characteristics, and the most important point in implementing speech recognition systems for continuous speech with Artificial Neural Network Models is learning these dynamic characteristics, along with the distributed cues and contextual effects that result from temporal characteristics. The Recurrent Multilayer Perceptron Model is known to be able to learn sequences of patterns. This paper presents the results of applying the Recurrent Model, which can potentially learn the temporal characteristics of speech, to phoneme recognition. The test data consist of 144 Vowel + Consonant + Vowel speech chains made up of 4 Korean monophthongs and 9 Korean plosive consonants. The input parameters of the Artificial Neural Network model are the FFT coefficients, residual error, and zero-crossing rates. The baseline model showed a recognition rate of 91% for vowels and 71% for plosive consonants for one male speaker. Various other experiments yielded better recognition rates than the existing multilayer perceptron model, showing the recurrent model to be better suited to speech recognition. The possibility of using Recurrent Models for speech recognition was also explored by changing the configuration of this baseline model.

A Study on Multi-Pulse Speech Coding Method by using Selected Information in a Frequency Domain (주파수 영역의 선택정보를 이용한 멀티펄스 음성부호화 방식에 관한 연구)

  • Lee See-Woo
    • Journal of Internet Computing and Services / v.7 no.4 / pp.57-66 / 2006
  • In this paper, I propose a new method of Multi-Pulse Speech Coding (FBD-MPC: Frequency Band Division MPC) that uses a TSIUVC (Transition Segment Including UnVoiced Consonant) searching, extraction, and approximation-synthesis method in the frequency domain. As a result, the extraction rates of TSIUVC are 84.8% (plosive), 94.9% (fricative), and 92.3% (affricate) in female voices, and 88% (plosive), 94.9% (fricative), and 92.3% (affricate) in male voices. Also, I obtain high-quality approximation-synthesis waveforms within TSIUVC by using frequency information below 0.547 kHz and above 2.813 kHz. I evaluate MPC using voiced/unvoiced switching information and FBD-MPC using voiced/silence/TSIUVC switching information. As a result, I found that the synthesized speech of FBD-MPC was better in quality than that of MPC.

A Study for Acoustic Cues of Pyoung-An Do Dialect Using LPC (LPC를 이용한 평안방언의 음향지표에 관한 연구)

  • Song, Chul-Gyu;Lee, Myoung-Ho;Kim, Young-Bae
    • Journal of Biomedical Engineering Research / v.13 no.3 / pp.195-200 / 1992
  • This paper deals with the acoustic cues of the Pyoung-An Do dialect using linear prediction. It also describes a statistical comparison between standard-tone speech data and the Pyoung-An Do dialect. The analysis focused mainly on the distribution of formants and pitch periods according to accent variation. For objective comparison, the experiments extract formants from the LPC spectrum and pitch periods from average magnitude difference function waveforms. In summary, in disyllabic words (VCV pattern), prepositioned vowels have a longer phonation time than postpositioned vowels, and the intrinsic phonation time is longer in low vowels than in high ones. The affricate consonants show mixed characteristics of the plosive and fricative consonants. The remarkable acoustic cues are the low-frequency noise-like waves just before the first formants in the plosive consonants and the high-frequency noise-like waves in the fricative consonants; phonation time is not affected by the kind of prepositioned or postpositioned vowel.

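The abstract above extracts pitch periods from average magnitude difference function (AMDF) waveforms. The following is a minimal sketch of the general AMDF idea, not the authors' procedure: the sampling rate, the lag search range, and the synthetic 125 Hz sine are all assumptions, and real speech would need framing and more careful selection of the minimum.

```python
import math


def amdf(signal, lag):
    """Average magnitude difference between the signal and a copy delayed by `lag`."""
    n = len(signal) - lag
    return sum(abs(signal[i] - signal[i + lag]) for i in range(n)) / n


def estimate_pitch_period(signal, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the smallest AMDF value."""
    return min(range(min_lag, max_lag + 1), key=lambda lag: amdf(signal, lag))


# Assumed values for illustration: 8 kHz sampling and a 125 Hz tone,
# so one pitch period spans 8000 / 125 = 64 samples.
SAMPLE_RATE = 8000
F0 = 125
signal = [math.sin(2 * math.pi * F0 * n / SAMPLE_RATE) for n in range(400)]

# The search range excludes multiples of the period (e.g. 128 samples).
period = estimate_pitch_period(signal, min_lag=20, max_lag=100)
print(period, SAMPLE_RATE / period)
```

The AMDF dips toward zero when the lag equals the pitch period, because a periodic signal then lines up with its delayed copy. It needs only subtractions and absolute values, which made it attractive on early hardware.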