• Title/Summary/Keyword: acoustic silence

A STUDY ON THE IMPLEMENTATION OF ARTIFICIAL NEURAL NET MODELS WITH FEATURE SET INPUT FOR RECOGNITION OF KOREAN PLOSIVE CONSONANTS (한국어 파열음 인식을 위한 피쳐 셉 입력 인공 신경망 모델에 관한 연구)

  • Kim, Ki-Seok;Kim, In-Bum;Hwang, Hee-Yeung
    • Proceedings of the KIEE Conference / 1990.07a / pp.535-538 / 1990
  • The main problem in speech recognition is the enormous variability in acoustic signals due to complex but predictable contextual effects. In plosive consonants especially, it is very difficult to find invariant cues because of these contextual effects, yet humans use them as helpful information in plosive consonant recognition. In this paper we experimented with three artificial neural net models for the recognition of plosive consonants. Neural Net Model I used a "Multi-layer Perceptron". Model II used a variation of the "Self-organizing Feature Map Model". Model III used the "Interactive and Competitive Model" to examine contextual effects. The recognition experiment was performed on 9 Korean plosive consonants. We used VCV speech chains for the experiment on contextual effects. The speech chains consist of the Korean plosive consonants /g, d, b, K, T, P, k, t, p/ (/ㄱ, ㄷ, ㅂ, ㄲ, ㄸ, ㅃ, ㅋ, ㅌ, ㅍ/) and eight Korean monophthongs. The inputs to the neural net models were several temporal cues (the durations of the silence, the transition, and the VOT) together with the extent of the VC formant transitions, the presence of voicing energy during closure, burst intensity, presence of aspiration, amount of low-frequency energy present at voicing onset, and CV formant transition extent, all taken from the acoustic signals. Model I showed a recognition rate of about 55-67%, Model II about 60%, and Model III about 67%.

The Comparison of the Acoustic and Aerodynamic Characteristics of PROVOX® Voice and Esophageal Voice Produced by the Same Laryngectomee (동일 후적자가 산출하는 기관식도 발성(PROVOX® 발성)과 식도 발성에 대한 음향학적 및 공기역학적 특성 비교)

  • Pyo, H.Y.;Choi, H.S.;Lim, S.E.;Choi, S.H.
    • Speech Sciences / v.5 no.1 / pp.121-139 / 1999
  • Our experimental subject was a laryngectomee who had undergone total laryngectomy with PROVOX® insertion and learned esophageal speech after the surgery, so he could produce both PROVOX® voice and esophageal voice. Using this subject's production of PROVOX® and esophageal voice, we compared the acoustic and aerodynamic characteristics of the two voices under the same physical conditions of the same person. The fundamental frequency of esophageal voice was 137.2 Hz, and that of PROVOX® voice was 97.5 Hz. PROVOX® voice showed lower jitter, shimmer and NHR than esophageal voice, which means that PROVOX® voice showed better voice quality than esophageal voice. In spectrographic analysis, the formation of formants and pseudoformants was more distinct in esophageal voice, and several temporal aspects of acoustic features, such as VOT and closure duration, were more similar to normal voice in PROVOX® voice. During sentence utterance, esophageal voice showed longer pause or silence durations than PROVOX® voice. Maximum phonation time and mean flow rate of PROVOX® voice were much longer and larger, respectively, than those of esophageal voice, but the mean and range of sound pressure level, subglottic pressure and voice efficiency were similar in the two voices. Glottal resistance of esophageal voice was much larger than that of PROVOX® voice, which in turn showed larger glottal resistance than normal voice.

Speech synthesis using acoustic Doppler signal (초음파 도플러 신호를 이용한 음성 합성)

  • Lee, Ki-Seung
    • The Journal of the Acoustical Society of Korea / v.35 no.2 / pp.134-142 / 2016
  • In this paper, a method for synthesizing speech signals using 40 kHz ultrasonic signals reflected from the articulatory muscles is introduced and its performance is evaluated. When ultrasound signals are radiated toward the articulating face, Doppler effects caused by movements of the lips, jaw, and chin are observed: the received signals contain components whose frequencies differ from that of the transmitted signal. These ADS (Acoustic-Doppler Signals) were used to estimate the speech parameters in this study. Prior to synthesizing speech signals, a quantitative correlation analysis between the ADS and speech signals was carried out for each frequency bin. The results validated the feasibility of ADS-based speech synthesis. ADS-to-speech transformation was achieved by joint Gaussian mixture model-based conversion rules. The experimental results from 5 subjects showed that filter bank energies and LPC (Linear Predictive Coefficient) cepstrum coefficients are the optimal features for the ADS and for speech, respectively. In the subjective evaluation, where synthesized speech signals were obtained using excitation sources extracted from the original speech signals, the ADS-to-speech conversion method yielded an average recognition rate of 72.2%.
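
As a rough illustration of the Doppler effect described in the abstract above, the sketch below estimates the frequency shift of a 40 kHz carrier reflected from a slowly moving surface. Only the 40 kHz carrier comes from the abstract. The two-way approximation Δf ≈ 2·v·f0/c, the sound speed of 343 m/s, the co-located emitter and receiver, and the example velocity are assumptions made for illustration, not details of the paper's setup.

```python
def doppler_shift_hz(velocity_m_s, carrier_hz=40_000.0, sound_speed_m_s=343.0):
    """Approximate two-way Doppler shift of a reflected ultrasonic carrier.

    Assumes the emitter and receiver sit at the same place and that the
    reflector speed is far below the speed of sound. A positive velocity
    means the reflector moves toward the sensor.
    """
    return 2.0 * velocity_m_s * carrier_hz / sound_speed_m_s


# Hypothetical example: a surface moving toward the sensor at 0.1 m/s.
shift = doppler_shift_hz(0.1)
print(f"Doppler shift: {shift:.2f} Hz")
```

Under these assumptions, slow articulator movements would shift the carrier by only tens of hertz, which is why a per-frequency-bin analysis around the carrier is a natural way to look at such signals.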

Automatic pronunciation assessment of English produced by Korean learners using articulatory features (조음자질을 이용한 한국인 학습자의 영어 발화 자동 발음 평가)

  • Ryu, Hyuksu;Chung, Minhwa
    • Phonetics and Speech Sciences / v.8 no.4 / pp.103-113 / 2016
  • This paper proposes articulatory features as novel predictors for automatic pronunciation assessment of English produced by Korean learners. Based on distinctive feature theory, in which phonemes are represented as sets of articulatory/phonetic properties, we propose articulatory Goodness-Of-Pronunciation (aGOP) features in terms of the corresponding articulatory attributes, such as nasal, sonorant, and anterior. An English speech corpus spoken by Korean learners is used in the assessment modeling. In our system, learners' speech is force-aligned and recognized using the acoustic and pronunciation models derived from the WSJ corpus (native North American speech) and the CMU pronouncing dictionary, respectively. To compute aGOP features, articulatory models are trained for the corresponding articulatory attributes. In addition to the proposed features, various features in four categories (RATE, SEGMENT, SILENCE, and GOP) are applied as a baseline. To enhance the assessment modeling performance and investigate the weights of the salient features, relevant features are extracted using Best Subset Selection (BSS). The results show that the proposed model using aGOP features outperforms the baseline. In addition, analysis of the relevant features extracted by BSS reveals that the selected aGOP features represent the salient variations of Korean learners of English. The results are also expected to be useful for automatic pronunciation error detection.
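
To make the idea of a Goodness-Of-Pronunciation score more concrete, here is a minimal sketch of one commonly used posterior-based simplification: the mean, over the aligned frames of a segment, of the log ratio between the posterior of the expected class and the best competing posterior. This is not the paper's aGOP definition. The function name `gop_score`, the frame posteriors, and the `nasal`/`non_nasal` labels are hypothetical and chosen only for illustration.

```python
import math


def gop_score(frame_posteriors, target, floor=1e-10):
    """Mean log ratio of the target posterior to the best posterior.

    frame_posteriors: list of dicts mapping class label -> posterior,
                      one dict per frame of the aligned segment.
    target:           the class expected from the canonical pronunciation.
    The score is 0 when the target is the top class in every frame and
    becomes more negative as other classes dominate.
    """
    total = 0.0
    for posteriors in frame_posteriors:
        target_p = max(posteriors[target], floor)  # avoid log(0)
        best_p = max(max(posteriors.values()), floor)
        total += math.log(target_p / best_p)
    return total / len(frame_posteriors)


# Hypothetical posteriors from a binary "nasal" attribute model.
frames = [
    {"nasal": 0.8, "non_nasal": 0.2},
    {"nasal": 0.5, "non_nasal": 0.5},
]
print(gop_score(frames, "nasal"))
print(gop_score(frames, "non_nasal"))
```

A score per articulatory attribute, computed in this spirit, could then serve as one predictor among the RATE, SEGMENT, SILENCE, and GOP features in a regression model.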

Analytic Verification of Optimal Degaussing Technique using a Scaled Model Ship (축소 모델 함정을 이용한 소자 최적화 기법의 해석적 검증)

  • Cho, Dong-Jin
    • Journal of the Korean Magnetics Society / v.27 no.2 / pp.63-69 / 2017
  • Because of their operational characteristics, naval ships are required to maintain both acoustic and magnetic silence. In particular, the underwater magnetic field signals generated by ships are likely to be detected at close range by threats such as surveillance systems and mine systems. To increase the survivability of vessels, various techniques for reducing the magnetic field signal are being studied, and it is necessary to consider not only the magnitude of the magnetic field signal but also its gradient. In this paper, we use a commercial electromagnetic finite element analysis tool to predict the induced magnetic field signal of a scaled ship model and to arrange the degaussing coils. The optimum degaussing current of the coils was then derived by applying the particle swarm optimization algorithm with a gradient constraint. The validity of the optimal degaussing technique is verified analytically by comparing the magnetic field signals after degaussing with and without the gradient constraint.
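
The optimization step described in the abstract above can be sketched roughly as follows: a basic particle swarm searches for coil currents that minimize the residual field, with the gradient limit handled as a penalty term. Everything concrete here is an assumption for illustration: the toy field samples in `SHIP_FIELD`, the per-unit-current coil patterns in `COIL_FIELDS`, the gradient limit, the penalty weight, and the swarm parameters. The paper's finite element model and its actual constraint handling are not reproduced.

```python
import random

# Hypothetical field samples along a sensor line (arbitrary units).
SHIP_FIELD = [1.0, 2.0, 3.0, 2.0, 1.0]
# Hypothetical field produced by each coil per unit current.
COIL_FIELDS = [
    [-1.0, -1.0, -1.0, -1.0, -1.0],
    [0.0, -1.0, -2.0, -1.0, 0.0],
]
GRADIENT_LIMIT = 0.5  # assumed limit on the change between neighbouring samples
PENALTY = 100.0       # assumed weight for violating the gradient limit


def residual_field(currents):
    """Ship field plus the contribution of every coil at each sample."""
    return [ship + sum(i * coil[k] for i, coil in zip(currents, COIL_FIELDS))
            for k, ship in enumerate(SHIP_FIELD)]


def degaussing_cost(currents):
    """Squared residual magnitude plus a penalty on excessive gradient."""
    field = residual_field(currents)
    magnitude = sum(b * b for b in field)
    steepest = max(abs(b - a) for a, b in zip(field, field[1:]))
    return magnitude + PENALTY * max(0.0, steepest - GRADIENT_LIMIT)


def pso_minimize(cost, dim, bounds, n_particles=30, n_iters=200, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Basic global-best particle swarm optimization over a box."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]
    best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Keep each current inside the allowed range.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < best_cost[i]:
                best_cost[i], best_pos[i] = c, pos[i][:]
                if c < g_cost:
                    g_cost, g_pos = c, pos[i][:]
    return g_pos, g_cost


currents, cost = pso_minimize(degaussing_cost, dim=2, bounds=(-5.0, 5.0))
print("currents:", currents, "cost:", cost)
```

In this toy setup the coil patterns can cancel the ship field exactly, so the gradient penalty is inactive at the optimum. With a realistic field model the penalty would trade some residual magnitude for a smoother signature, which is the comparison the abstract describes.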

A New Relationship between Poetry and Music - music as Creative Principle of Poetry in Mallarmé's World (시와 음악 간의 새로운 관계 - 말라르메에게 있어 시 창작원리로서의 음악)

  • Do, Yoon-Jung
    • Cross-Cultural Studies / v.44 / pp.211-237 / 2016
  • This paper explores the new relationship between music and poetry established at the beginning of the Modern Era. This was a period when silent reading, rather than reading aloud, was the dominant culture, and orality was limited by the emergence of literacy and print culture. A poet sensitive to the characteristics of the period, Mallarmé created his own concept of music and derived new creative principles of poetry from it. We analyze his "Divagations" and letters, in particular "Crisis of Verse", "Music and Literature", "Mystery in the Letters", and "About the Book." Firstly, Mallarmé connects music with mystery and the sacred: mystery surrounds music, and music is oriented toward the sacred. This sanctity is that of the human race and has existed within humans since the beginning. Transposing these characteristics of music to poetry is his first creative principle of poetry. However, Mallarmé defined music as the totality of relationships that exist between objects, without reducing it to instruments or sound alone. His definition is abstract, regarding music as a complete rhythm, an atmosphere, an air. Secondly, there is the question of how to realize music in a poem. Because music is surrounded by mystery, Mallarmé can transpose the sacred to a poem in mysterious ways. This leads to his second principle of poetry: to make a poem as a structure. In other words, working 'musically', on the basis of the disappearance of real objects and the initiative of the poet, he created a structure out of words alone. One can create an acoustic structure, but Mallarmé created a visible structure to overcome the incompleteness of the sound of a word amid the diffusion of print culture. In this manner, the use of silence as much as sound, and of visual as much as aural components, was introduced into poetry as an important motif and an essential of creation. This new relationship between poetry and music, and the creative principles drawn from it, appear to be areas on which research on poetry should focus.