• Title/Summary/Keyword: acoustic parameters

α-feature map scaling for raw waveform speaker verification (α-특징 지도 스케일링을 이용한 원시파형 화자 인증)

  • Jung, Jee-weon;Shim, Hye-jin;Kim, Ju-ho;Yu, Ha-Jin
    • The Journal of the Acoustical Society of Korea, v.39 no.5, pp.441-446, 2020
  • In this paper, we propose the α-Feature Map Scaling (α-FMS) method, which extends the FMS method designed to enhance the discriminative power of the feature maps of deep neural networks in Speaker Verification (SV) systems. FMS derives a scale vector from a feature map and then adds it to the features, multiplies the features by it, or applies both operations sequentially. However, FMS uses an identical scale vector for both addition and multiplication, and in the case of addition it can only add a value between zero and one. To overcome these limitations, we propose α-FMS, which adds a trainable parameter α to the feature map element-wise and then multiplies the result by a scale vector. We compare two variants: one where α is a scalar and one where it is a vector. Both α-FMS methods are applied after each residual block of the deep neural network. The proposed systems using the α-FMS methods are trained using RawNet2 and tested on the VoxCeleb1 evaluation set. The results show equal error rates of 2.47 % and 2.31 % for the two α-FMS methods, respectively.
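The add-then-multiply order described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The abstract only says that a scale vector is derived from the feature map, so the derivation used here (time-average pooling, one fully connected layer, and a sigmoid) is an assumption, and the names `alpha_fms`, `weight`, and `bias` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def alpha_fms(feature_map, weight, bias, alpha):
    """Add alpha to the feature map, then multiply by a scale vector.

    feature_map: array of shape (channels, time)
    weight, bias: parameters of an assumed fully connected layer,
        shapes (channels, channels) and (channels,)
    alpha: trainable offset, either a scalar or a vector of shape (channels,)
    """
    # Assumption: the scale vector comes from time-average pooling,
    # a fully connected layer, and a sigmoid.
    pooled = feature_map.mean(axis=1)
    scale = sigmoid(weight @ pooled + bias)

    alpha = np.asarray(alpha, dtype=feature_map.dtype)
    if alpha.ndim == 1:
        # Vector variant: one offset per channel, broadcast over time.
        alpha = alpha[:, None]

    return (feature_map + alpha) * scale[:, None]
```

Passing a scalar `alpha` corresponds to the first variant compared in the abstract, and passing a per-channel vector corresponds to the second.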

Effects of Ultrasonic Scanner Setting Parameters on the Quality of Ultrasonic Images (초음파 진단기의 설정 파라미터가 영상의 질에 미치는 효과)

  • Yang, Jeong-Hwa;Lee, Kyung-Sung;Kang, Gwan-Suk;Paeng, Dong-Guk;Choi, Min-Joo
    • The Journal of the Acoustical Society of Korea, v.27 no.2, pp.57-65, 2008
  • The setting parameters of ultrasonic scanners influence the quality of ultrasonic images. To obtain optimized images, sonographers need to understand the effects of the setting parameters on ultrasonic images. The present study considered four typical parameters: TGC (Time Gain Control), Gain, Frequency, and DR (Dynamic Range). LCS (low contrast sensitivity) was chosen to quantitatively compare the quality of the images. In the experiment, the LCS targets of a standard ultrasonic test phantom (539, ATS, USA) were imaged using a clinical ultrasonic scanner (SA-9000 PRIME, Medison, Korea). While the parameter settings of the scanner were varied, images of six LCS targets (+15 dB, +6 dB, +3 dB, -3 dB, -6 dB, -15 dB) were obtained at each setting, and their LCS values were calculated. The results show that the mean pixel value (LCS) is highest at the max setting in TGC, mid to max in gain, pen mode in frequency, and 40-66 dB in DR. Among all images, the one with the highest LCS was obtained at a DR setting of 40 dB. The results are expected to be of use in setting the parameters when ultrasonically examining masses often found clinically, either solid lesions (similar to the +15, +6, +3 dB targets) or cystic lesions (similar to the -15, -6, -3 dB targets).

Numerical Analysis of Nonlinear Longitudinal Combustion Instability in LRE Using Pressure-Sensitive Time-Lag Hypothesis (시간지연 모델을 이용한 액체로켓엔진의 축방향 비선형 연소불안정 해석)

  • Kim Seong-Ku;Choi Hwan Seok;Park Tae Seon;Kim Yong-Mo
    • Proceedings of the Korean Society of Propulsion Engineers Conference, v.y2005m4, pp.281-287, 2005
  • Nonlinear behaviors that often accompany combustion instabilities, such as steep-fronted wave motions and a finite-amplitude limit cycle, have been numerically investigated using a characteristic-based approximate Riemann solver and the well-known ${\eta}-{\tau}$ model. A resonant pipe initially subjected to a harmonic pressure disturbance exhibits the natural steepening process that leads to a shocked N-wave. In a linearly unstable regime, pressure oscillations reach a limit cycle that is independent of the characteristics of the initial disturbance and depends only on combustion parameters and operating conditions. For the 1.5 MW gas generator under development at KARI, the numerical results show good agreement with experimental data from hot-firing tests.


A Study on Performance Evaluation of Hidden Markov Network Speech Recognition System (Hidden Markov Network 음성인식 시스템의 성능평가에 관한 연구)

  • 오세진;김광동;노덕규;위석오;송민규;정현열
    • Journal of the Institute of Convergence Signal Processing, v.4 no.4, pp.30-39, 2003
  • In this paper, we evaluate the performance of an HM-Net (Hidden Markov Network) speech recognition system on Korean speech databases. We constructed the acoustic models using HM-Nets, a modification of the HMMs (Hidden Markov Models) that are widely used for statistical modeling. HM-Nets perform state splitting in the contextual and temporal domains using the PDT-SSS (Phonetic Decision Tree-based Successive State Splitting) algorithm, a modification of the original SSS algorithm. In particular, contextual-domain state splitting uses a phonetic decision tree to effectively represent context information that does not appear in the training speech data. In temporal-domain state splitting, states are split to effectively represent the duration information of each phoneme, and an optimal triphone-type model network is then constructed. Speech recognition was performed using the one-pass Viterbi beam search algorithm with a phone-pair grammar for phoneme recognition and a word-pair grammar for word recognition, and using a multi-pass search algorithm with n-gram language models for sentence recognition. A tree-structured lexicon was used to decrease the number of nodes by sharing the same prefixes among words. The performance of the HM-Net speech recognition system was evaluated under various recognition conditions. The experiments confirm that the system achieves considerably better recognition performance than the previously introduced recognition system.

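The node saving that the abstract above attributes to a tree-structured lexicon can be sketched as follows. This is only an illustrative prefix tree over phoneme sequences, not the system described in the paper; the function names, the `#word` marker key, and the toy pronunciations are all hypothetical.

```python
def build_prefix_tree(lexicon):
    """Build a nested-dict prefix tree from {word: [phoneme, ...]}.

    Words that start with the same phonemes share the same nodes.
    """
    root = {}
    for word, phonemes in lexicon.items():
        node = root
        for phoneme in phonemes:
            # Reuse the child if this prefix already exists.
            node = node.setdefault(phoneme, {})
        # Hypothetical marker key that stores the word ending at this node.
        node["#word"] = word
    return root


def count_nodes(node):
    """Count phoneme nodes in the tree, ignoring word markers."""
    return sum(
        1 + count_nodes(child)
        for key, child in node.items()
        if key != "#word"
    )


def count_flat_nodes(lexicon):
    """Count nodes when every word keeps its own phoneme chain."""
    return sum(len(phonemes) for phonemes in lexicon.values())
```

For three toy words whose pronunciations share the first two phonemes, the flat lexicon needs one node per phoneme per word, while the tree stores the shared prefix only once.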

Compromised feature normalization method for deep neural network based speech recognition (심층신경망 기반의 음성인식을 위한 절충된 특징 정규화 방식)

  • Kim, Min Sik;Kim, Hyung Soon
    • Phonetics and Speech Sciences, v.12 no.3, pp.65-71, 2020
  • Feature normalization is a method to reduce the effect of environmental mismatch between the training and test conditions by normalizing the statistical characteristics of acoustic feature parameters. It yields excellent performance improvement in the traditional Gaussian mixture model-hidden Markov model (GMM-HMM)-based speech recognition system. However, in a deep neural network (DNN)-based speech recognition system, minimizing the effects of environmental mismatch does not necessarily lead to the best performance improvement. In this paper, we attribute this phenomenon to information loss due to excessive feature normalization. We investigate whether there is a feature normalization method that maximizes speech recognition performance by properly reducing the impact of environmental mismatch while preserving useful information for training acoustic models. To this end, we introduce the mean and exponentiated variance normalization (MEVN), which is a compromise between the mean normalization (MN) and the mean and variance normalization (MVN), and compare the performance of a DNN-based speech recognition system in noisy and reverberant environments according to the degree of variance normalization. Experimental results reveal that a slight performance improvement is obtained with the MEVN over the MN and the MVN, depending on the degree of variance normalization.
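One way to picture a compromise between MN and MVN is to divide the mean-removed features by the standard deviation raised to an exponent. The abstract does not give the formula, so the exponent form below, the parameter name `degree`, and the small constant `eps` are assumptions made for this sketch, not the authors' definition.

```python
import numpy as np


def mevn(features, degree, eps=1e-8):
    """Mean and exponentiated variance normalization (assumed form).

    features: array of shape (frames, dims)
    degree: assumed exponent on the standard deviation;
        0.0 reduces to mean normalization (MN),
        1.0 reduces to mean and variance normalization (MVN),
        values in between only partly normalize the variance.
    eps: small constant to avoid division by zero
    """
    mean = features.mean(axis=0)
    std = features.std(axis=0)
    return (features - mean) / np.power(std + eps, degree)
```

Sweeping `degree` between 0 and 1 mirrors the comparison "according to the degree of variance normalization" mentioned in the abstract.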

Crack Source Location Technique for Plain Concrete Beam Using Acoustic Emission (음향방출을 이용한 무근콘크리트 보의 균열 발생원 탐사기법)

  • 한상훈;이웅종;조홍동;김동규
    • Journal of the Korea Concrete Institute, v.13 no.2, pp.107-113, 2001
  • This preliminary study was conducted to develop a crack source location technique for plain concrete beams using acoustic emission (AE). Before the main experiment, a virtual AE source location test was carried out on a plain concrete block, in which triangular and rectangular sensor layouts were compared. The test showed that AE source location with the triangular layout was more effective than with the rectangular layout. The source location technique was applied to nine specimens in total (three each at W/C ratios of 40%, 50%, and 60%), with the compressive strength level (W/C ratio) as the experimental variable. Bending loads were applied cyclically to evaluate the degree of concrete damage. Analysis of the AE parameters in the failure experiment shows that both the Kaiser effect and the Felicity effect exist. Analysis of the Felicity ratio (FR) values shows that they can be used to evaluate the degree of concrete damage. AE activity became high at 70% of the failure load regardless of the compressive strength level, so this level can be considered as an index when constructing a failure warning system for field structures. In addition, comparing the actual crack locations with the located sources shows that AE monitoring perceived the cracks before the primary crack appeared to visual observation.
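The Felicity ratio used in this abstract is commonly defined as the load at which AE activity resumes during reloading, divided by the maximum load reached previously. The sketch below applies that common definition to a cyclic loading record; the data layout (one `(peak_load, ae_onset_load)` pair per cycle) and the function name are assumptions for illustration and are not taken from the paper.

```python
def felicity_ratios(cycles):
    """Compute the Felicity ratio for each reloading cycle.

    cycles: list of (peak_load, ae_onset_load) tuples in loading order,
        where ae_onset_load is the load at which AE activity was first
        detected in that cycle (assumed data layout).
    Returns one ratio per cycle after the first.
    """
    ratios = []
    previous_max = None
    for peak_load, ae_onset_load in cycles:
        if previous_max is not None:
            # Onset load relative to the highest load seen so far.
            ratios.append(ae_onset_load / previous_max)
            previous_max = max(previous_max, peak_load)
        else:
            previous_max = peak_load
    return ratios
```

A ratio of 1.0 or more means AE resumed only after the previous maximum load was reached (Kaiser effect), while a ratio below 1.0 means AE resumed earlier (Felicity effect), which is usually read as a sign of accumulated damage.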

Speech Recognition Using Linear Discriminant Analysis and Common Vector Extraction (선형 판별분석과 공통벡터 추출방법을 이용한 음성인식)

  • 남명우;노승용
    • The Journal of the Acoustical Society of Korea, v.20 no.4, pp.35-41, 2001
  • This paper describes Linear Discriminant Analysis and common vector extraction for speech recognition. A voice signal contains psychological and physiological properties of the speaker as well as dialect differences, acoustical environment effects, and phase differences. For these reasons, the same word spoken by different speakers can sound very different. This property of speech signals makes it very difficult to extract the common properties of a speech class (word or phoneme). Linear algebra methods such as the KLT (Karhunen-Loeve Transformation) are generally used to extract common properties from speech signals, but this paper uses the common vector extraction suggested by M. Bilginer et al. Their method extracts an optimized common vector from the speech signals used for training, and it achieves 100% recognition accuracy on the training data used for common vector extraction. Despite these characteristics, the method has some drawbacks: a large number of speech signals cannot be used for training, and the discriminant information among common vectors is not defined. This paper suggests an advanced method that reduces the error rate by maximizing the discriminant information among common vectors. A novel method to normalize the size of the common vector is also added. The results show improved algorithm performance and a recognition accuracy 2% higher than that of the conventional method.

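The idea of a common vector for one speech class can be sketched as removing from a training vector its projection onto the subspace spanned by the differences between training vectors. The abstract does not spell out the procedure, so this is a sketch of the common vector approach as it is generally described, under the assumptions that there are fewer training vectors than feature dimensions and that the difference vectors are linearly independent. The function name is hypothetical.

```python
import numpy as np


def common_vector(samples):
    """Extract the common vector of one class (illustrative sketch).

    samples: array of shape (num_samples, dim) with num_samples <= dim.
    Assumes the difference vectors are linearly independent.
    """
    samples = np.asarray(samples, dtype=float)

    # Difference vectors relative to the first sample, stored as columns.
    differences = (samples[1:] - samples[0]).T

    # Orthonormal basis of the difference subspace (reduced QR).
    basis, _ = np.linalg.qr(differences)

    # Remove the component that lies in the difference subspace.
    reference = samples[0]
    return reference - basis @ (basis.T @ reference)
```

Under these assumptions every training vector of the class yields the same common vector, which fits the 100% accuracy on training data reported in the abstract, and the requirement of fewer samples than dimensions fits the stated limit on the amount of training data.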