• Title/Summary/Keyword: Acoustic Characteristics

Compromised feature normalization method for deep neural network based speech recognition (심층신경망 기반의 음성인식을 위한 절충된 특징 정규화 방식)

  • Kim, Min Sik;Kim, Hyung Soon
    • Phonetics and Speech Sciences / v.12 no.3 / pp.65-71 / 2020
  • Feature normalization reduces the effect of environmental mismatch between training and test conditions by normalizing the statistical characteristics of acoustic feature parameters. It yields excellent performance improvement in traditional Gaussian mixture model-hidden Markov model (GMM-HMM)-based speech recognition systems. However, in a deep neural network (DNN)-based speech recognition system, minimizing the effects of environmental mismatch does not necessarily lead to the best performance improvement. In this paper, we attribute this phenomenon to information loss caused by excessive feature normalization. We investigate whether there is a feature normalization method that maximizes speech recognition performance by properly reducing the impact of environmental mismatch while preserving information useful for training acoustic models. To this end, we introduce the mean and exponentiated variance normalization (MEVN), which is a compromise between the mean normalization (MN) and the mean and variance normalization (MVN), and compare the performance of a DNN-based speech recognition system in noisy and reverberant environments according to the degree of variance normalization. Experimental results reveal that the MEVN yields a slight performance improvement over the MN and the MVN, depending on the degree of variance normalization.
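
The abstract does not give the MEVN formula. The sketch below assumes one plausible reading: each feature dimension is mean-subtracted and then divided by its standard deviation raised to an exponent `alpha`, so that `alpha = 0` reduces to MN and `alpha = 1` to MVN. The function name `mevn`, the parameter `alpha`, and the toy data are illustrative assumptions, not taken from the paper.

```python
from statistics import fmean, pstdev

def mevn(frames, alpha):
    """Normalize each feature dimension as (x - mean) / std**alpha.

    Assumed form of MEVN: alpha = 0 gives mean normalization (MN),
    alpha = 1 gives mean and variance normalization (MVN), and values
    in between normalize the variance only partially.
    """
    columns = list(zip(*frames))                    # one tuple per feature dimension
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) or 1.0 for col in columns]  # avoid dividing by zero
    return [
        [(x - m) / (s ** alpha) for x, m, s in zip(frame, means, stds)]
        for frame in frames
    ]

# Toy utterance: three frames, two feature dimensions (made-up numbers).
utterance = [[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]]
mn = mevn(utterance, 0.0)    # mean normalization only
mvn = mevn(utterance, 1.0)   # full mean and variance normalization
half = mevn(utterance, 0.5)  # partial variance normalization
```

Under this assumption, sweeping `alpha` between 0 and 1 corresponds to varying the degree of variance normalization that the abstract compares.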

A study of the prosodic patterns of autism and normal children in the imitating declarative and interrogative sentences (따라말하기 과제를 통한 자폐범주성 장애 아동과 일반 아동의 평서문과 의문문의 음향학적 특성 비교)

  • Lee, Jinhyung;Seong, Cheoljae
    • Phonetics and Speech Sciences / v.12 no.2 / pp.39-49 / 2020
  • The prosody of children with autism spectrum disorders (ASD) has several abnormal features, including monotonous speech. The purpose of this study was to compare acoustic features between an ASD group and a typically developing (TD) group and within the ASD group. The study also examined audience perceptions of the lengthening effect of increasing the number of syllables. Fifty participants were divided into two groups (20 with ASD and 30 TD) and asked to imitate a total of 28 sentences. In the auditory-perceptual evaluation, seven participants judged the sentence type of 115 sentences. Pitch, intensity, speech rate, and pitch slope were analyzed for significant differences. The ASD group showed higher pitch and intensity and a lower overall speaking rate than the TD group. Moreover, there were significant differences in the s2 slope of interrogative sentences. Finally, based on the auditory-perceptual evaluation, only 4.3% of interrogative sentences produced by participants with ASD were perceived as declarative sentences. The cause of this abnormal prosody has not been clearly identified; however, pragmatic ability and other characteristics of autism are related to ASD prosody. This study identified prosodic patterns in ASD and suggested the need to develop treatments to improve prosody.

Efficient Prediction of Broadband Noise of a Centrifugal Fan Using U-FRPM Technique (U-FRPM 기법을 이용한 원심팬 광대역소음의 효율적 예측)

  • Heo, Seung;Cheong, Chulung
    • The Journal of the Acoustical Society of Korea / v.34 no.1 / pp.36-45 / 2015
  • Recently, many studies have examined methods that generate turbulent velocity fields stochastically in order to predict broadband flow noise efficiently. Among them, the FRPM (Fast Random Particle Mesh) method, which generates turbulence with specific statistical properties from the turbulence kinetic energy and dissipation obtained from the steady solution of the RANS (Reynolds Averaged Navier-Stokes) equations, has been applied successfully. However, the FRPM method cannot be applied to flow noise problems with intrinsically unsteady characteristics, such as those of a centrifugal fan. In this paper, to predict the broadband noise generated by a centrifugal fan efficiently, the U-FRPM (unsteady FRPM) method is developed by extending the FRPM method so that it can be combined with numerical solutions of the unsteady RANS equations to generate the turbulence that acts as the broadband noise source. First, an unsteady flow field is obtained from the unsteady RANS equations through CFD (Computational Fluid Dynamics). Then, noise sources are generated using the U-FRPM method combined with an acoustic analogy. Finally, a linear propagation model realized through the BEM (Boundary Element Method) is combined with the generated sources to predict broadband noise at the listeners' position. The proposed technique is validated by comparing its predictions with measured data.
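
The stochastic idea behind such methods can be illustrated with a heavily simplified 1-D sketch. This is not the authors' U-FRPM implementation. It only filters white noise with a Gaussian kernel so that the result has a prescribed correlation length and a variance of 2k/3, the per-component velocity variance implied by a turbulence kinetic energy k under isotropy. The kernel width convention, the function names, and all numerical values are assumptions chosen for illustration.

```python
import math
import random
from statistics import pvariance

def gaussian_kernel(length_scale, dx):
    """Gaussian filter taps, scaled so that their squares sum to 1."""
    half = math.ceil(3 * length_scale / dx)
    taps = [math.exp(-math.pi * (i * dx) ** 2 / (2 * length_scale ** 2))
            for i in range(-half, half + 1)]
    norm = math.sqrt(sum(t * t for t in taps))
    return [t / norm for t in taps]

def synthetic_fluctuation(n, tke, length_scale, dx=1.0, seed=0):
    """Return n samples of a 1-D velocity fluctuation with variance near 2k/3."""
    rng = random.Random(seed)
    kernel = gaussian_kernel(length_scale, dx)
    # Unit-variance white noise, padded so every output sample sees a full kernel.
    noise = [rng.gauss(0.0, 1.0) for _ in range(n + len(kernel) - 1)]
    amplitude = math.sqrt(2.0 * tke / 3.0)
    return [amplitude * sum(k * noise[i + j] for j, k in enumerate(kernel))
            for i in range(n)]

kernel = gaussian_kernel(length_scale=5.0, dx=1.0)
field = synthetic_fluctuation(n=4000, tke=1.5, length_scale=5.0)
target_variance = 2.0 * 1.5 / 3.0  # per-component variance for k = 1.5
```

Because the kernel taps are normalized to unit energy, the filtered noise keeps unit variance before scaling, so the sample variance of `field` should sit close to `target_variance`. In the actual FRPM and U-FRPM methods, the local kinetic energy and length scale come from the RANS solution rather than from constants.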

The Characteristics of the Wafer Bonding between InP Wafers and $\textrm{Si}_3\textrm{N}_4$/InP (Direct Wafer Bonding법에 의한 InP 기판과 $\textrm{Si}_3\textrm{N}_4$/InP의 접합특성)

  • Kim, Seon-Un;Sin, Dong-Seok;Lee, Jeong-Yong;Choe, In-Hun
    • Korean Journal of Materials Research / v.8 no.10 / pp.890-897 / 1998
  • The direct wafer bonding between an n-InP(001) wafer and a $\textrm{Si}_3\textrm{N}_4$ (200 nm) film grown on an InP wafer by PECVD was investigated. The surface states of the InP wafer and the $\textrm{Si}_3\textrm{N}_4$/InP, on which the bonding strength strongly depends when the two are brought into contact, were characterized by contact angle measurement and atomic force microscopy. When the InP wafer was etched with 50% HF, the contact angle was $5^{\circ}$ and the RMS roughness was 1.54 Å. When the $\textrm{Si}_3\textrm{N}_4$ was etched with ammonia solution, the RMS roughness was 3.11 Å. A considerable initial bonding strength between the InP wafer and the $\textrm{Si}_3\textrm{N}_4$/InP was observed when the two wafers were brought into contact after etching with 50% HF and ammonia solution, respectively. The bonded specimen was heat treated in an $\textrm{H}_2$ or $\textrm{N}_2$ ambient at $580^{\circ}\textrm{C}$ to $680^{\circ}\textrm{C}$ for 1 hr. The bonding state was confirmed by SAT (Scanning Acoustic Tomography). The bonding strength between the $\textrm{Si}_3\textrm{N}_4$/InP and the InP wafer, measured by shear force measurement, increased to the same level as that of the PECVD interface. The direct wafer bonding interface and the $\textrm{Si}_3\textrm{N}_4$/InP PECVD interface were characterized by TEM and AES.

Fabrication of FBAR (SMR) using Reflector (반사층을 이용한 FBAR(SMR)의 제조)

  • Lee, Jae-Bin;Kwak, Sang-Hyon;Kim, Hyeong-Joon;Park, Hee-Dae;Kim, Young-Sik
    • Korean Journal of Materials Research / v.9 no.12 / pp.1263-1269 / 1999
  • An FBAR of the solidly mounted resonator (SMR) type was fabricated using reflector layers that prevent the bulk acoustic wave from penetrating into the substrate. The SMR consisted of top and bottom electrodes (Al films), a piezoelectric layer (ZnO film), reflector layers (W/$\textrm{SiO}_2$ films), and a Si substrate. The electrodes were deposited by dc sputtering. The piezoelectric layer and the reflector layers were deposited by rf magnetron sputtering. Control of the crystallinity, microstructure, and electrical properties of each layer was essential for attaining the optimum FBAR characteristics. Under the best deposition conditions for FBAR devices, the ZnO films had a strong c-axis preferred orientation ($\sigma=2.17^{\circ}$), a resistivity of $10^4\;\Omega\,\textrm{cm}$, and a surface roughness of 10.6 Å. The surface roughness of the W and $\textrm{SiO}_2$ films was 16 Å and 33 Å, respectively, and the resistivity of the Al film was $5.1\times10^{-6}\;\Omega\,\textrm{cm}$. The SMR devices were fabricated by conventional semiconductor processes. At resonance, the series resonance frequency ($f_s$) and the parallel resonance frequency ($f_p$) of the SMR were 1.244 GHz and 1.251 GHz, respectively, and the quality factor (Q) was 1200.

Study on Characteristics of SCC and AE Signals for Weld HAZ of HT-60 Steel (HT-60강 용접부의 SCC및 AE신호특성에 관한 연구)

  • Na, Eui-Gyun;Yu, Hyo-Sun;Kim, Hoon
    • Journal of the Korean Society for Nondestructive Testing / v.21 no.1 / pp.62-68 / 2001
  • To characterize the microscopic fracture behaviour of the weldment in stress corrosion cracking (SCC), SCC and acoustic emission (AE) tests were carried out simultaneously, and the correlation between the mechanical parameters obtained from the SCC and AE tests was investigated. In the case of the base metal, many more AE events were produced at -0.5 V than at -0.8 V because of the dissolution mechanism before the maximum load. Regardless of the voltage applied to the specimens, however, AE events decreased after the maximum load. In the case of the weldment, many AE events with a larger amplitude range (40-100 dB) were produced because of the singularities of the weld HAZ, in comparison to the base metal and post-weld heat-treated (PWHT) specimens. Numerous and larger cracks were observed on the fractured surfaces of the weldment by SEM examination. From these results, it was concluded that SCC for the weldment appeared most severely in synthetic seawater. The weld HAZ was softened by PWHT, which also contributed to a reduced susceptibility to the corrosive environment in comparison to the weldment.

Characterization of Stress Corrosion Cracking at the Welded Region of High Strength Steel using Acoustic Emission Method (음향방출법에 의한 고 장력강 용접부의 부식손상 특성 평가)

  • Na, Eui-Gyun;Kim, Hoon
    • Journal of the Korean Society for Nondestructive Testing / v.23 no.3 / pp.212-219 / 2003
  • This study evaluates the characteristics of SCC at the welded region of high strength steel using the acoustic emission (AE) method. Specimens were loaded by a slow strain rate method in synthetic seawater, and the damage process was monitored simultaneously by the AE method. The corrosive environment was controlled using a potentiostat, with -0.8 V and -1.1 V applied to the specimens. In the case of the one-pass weldment subjected to -0.8 V, many more AE counts were detected than for the PWHT specimen. The cumulative counts confirmed that the coalescence of microcracks and cracks was detected mostly in the one-pass weldment at -0.8 V. In the case of the one-pass weldment subjected to -1.1 V, the time to failure became shorter and considerably more AE counts were produced than for the two-pass weldment. It was shown that the AE counts and the range of AE amplitude are closely related to the number, size, and width of the cracks formed during SCC.

Application of a Fiber Fabry-Pérot Interferometer Sensor for Receiving SH-EMAT Signals (SH-EMAT의 신호 수신을 위한 광섬유 패브리-페롯 간섭계 센서의 적용)

  • Lee, Jin-Hyuk;Kim, Dae-Hyun;Park, Ik-Keun
    • Journal of the Korean Society for Nondestructive Testing / v.34 no.2 / pp.165-170 / 2014
  • Shear horizontal (SH) waves propagate as a type of plate wave in a thin sheet. The dispersion characteristics of SH waves can be used for signal analysis. Therefore, SH-waves are useful for monitoring the structural health of a thin-sheet-structure. An electromagnetic acoustic transducer (EMAT), which is a non-contact ultrasonic transducer, can generate SH-waves easily by varying the shape and array of magnets and coils. Therefore, an EMAT can be applied to an automated ultrasonic testing system for structural health monitoring. When used as a sensor, however, the EMAT has a weakness in that electromagnetic interference (EMI) noise can occur easily in the automated system because of motors and electric devices. Alternatively, a fiber optic sensor works well in the same environment with EMI noise because it uses a light signal instead of an electric signal. In this paper, a fiber Fabry-Pérot interferometer (FFPI) was proposed as a sensor to receive the SH-waves generated by an EMAT. A simple test was performed to verify the performance of the FFPI sensor. It is thus shown that the FFPI can receive SH-wave signals clearly.

Speech Visualization of Korean Vowels Based on the Distances Among Acoustic Features (음성특징의 거리 개념에 기반한 한국어 모음 음성의 시각화)

  • Pok, Gouchol
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology / v.12 no.5 / pp.512-520 / 2019
  • Representing speech visually is quite useful for learners of foreign languages as well as for the hearing impaired, who cannot hear speech directly, and a number of studies have been presented in the literature. They remain, however, at the level of representing the characteristics of speech using colors or showing the changing shape of the lips and mouth through animation. As a result, those methods cannot tell users how far their pronunciation is from the standard one, and they make it technically difficult to develop a system in which users can correct their pronunciation interactively. To address these drawbacks, this paper proposes a speech visualization model based on the relative distance between the user's speech and the standard one, and suggests implementation directions by applying the proposed model to the visualization of Korean vowels. The method extracts three formants, F1, F2, and F3, from speech signals and feeds them into Kohonen's SOM to map the results onto a 2-D screen and represent each speech sample as a point on the screen. We present a real system, implemented using open source formant analysis software, that was applied to the speech of a Korean instructor and several foreign students studying the Korean language, and in which the user interface for the screen display was built using JavaScript.
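
The mapping from formants to screen positions described above can be sketched with a small self-organizing map. This is a minimal illustration, not the system from the paper. The grid size, learning schedule, 4000 Hz scaling constant, function names, and formant values are all assumptions. The formant numbers are rough illustrative values, not measurements.

```python
import math
import random

def best_unit(weights, x):
    """Grid coordinates of the node whose weight vector is closest to x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))

def train_som(samples, rows=6, cols=6, epochs=200, seed=0):
    """Train a small Kohonen map on the given feature vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        decay = 1.0 - epoch / epochs                 # shrinks from 1 toward 0
        rate = 0.5 * decay                           # learning rate
        radius = 0.5 + 0.5 * max(rows, cols) * decay # neighborhood radius
        for x in rng.sample(samples, len(samples)):  # shuffled pass over the data
            bmu = best_unit(weights, x)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                pull = rate * math.exp(-grid_d2 / (2.0 * radius ** 2))
                for i in range(dim):
                    w[i] += pull * (x[i] - w[i])
    return weights

def scale(formants_hz, max_hz=4000.0):
    """Bring F1, F2, F3 into a comparable 0..1 range (assumed scaling)."""
    return [f / max_hz for f in formants_hz]

# Illustrative (F1, F2, F3) values in Hz for three vowels; not from the paper.
standard = {"a": [800.0, 1300.0, 2600.0],
            "i": [300.0, 2300.0, 3000.0],
            "u": [350.0, 800.0, 2400.0]}
som = train_som([scale(f) for f in standard.values()])

def screen_distance(som, user_hz, vowel):
    """Distance on the 2-D grid between a user's vowel and the standard one."""
    ux, uy = best_unit(som, scale(user_hz))
    sx, sy = best_unit(som, scale(standard[vowel]))
    return math.hypot(ux - sx, uy - sy)
```

For example, `screen_distance(som, [820.0, 1280.0, 2650.0], "a")` returns how far a hypothetical learner's /a/ lands from the standard /a/ on the grid. A value of zero means both fall on the same node.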

Investigation of the listening environment for lower grade students in elementary school using subjective tests (주관적 평가법을 이용한 초등학교 저학년 교실의 청취환경 조사)

  • Park, Chan-Jae;Haan, Chan-Hoon
    • The Journal of the Acoustical Society of Korea / v.40 no.3 / pp.201-212 / 2021
  • The present study was conducted as a pilot investigation to suggest standards of acoustic performance for classrooms suited to listeners whose hearing is not yet fully developed, such as children under 9 years of age. Subjective evaluations, consisting of a questionnaire and a speech intelligibility test, were conducted with 264 students at two elementary schools in Cheong-ju in order to analyze the characteristics of the listening environment in lower-grade elementary school classrooms. The questionnaire investigated the students' satisfaction with the classroom listening environment. Students responded that the most helpful type of information for understanding class content is the teacher's voice. In addition, they rated the volume of their current teacher's voice as normal and its clarity as highly satisfactory. As for the acoustic performance of the classroom, the dominant opinion in overall satisfaction with the listening environment was that the noise was normal and the reverberation was very short. Meanwhile, the speech intelligibility test, which used a word list selected for lower-grade elementary school students, suggested that for 8-year-olds the longitudinal distance from the sound source is a factor that affects speech recognition.