• Title/Summary/Keyword: Acoustic mismatch

Bit Error Characteristics of Passive Phase Conjugation Underwater Acoustic Communication Due to a Drifting Source

  • Lin Chun-Dan;Ro Yong Ju;Rouseff Daniel;Yoon Jong Rak
    • The Journal of the Acoustical Society of Korea / v.24 no.2E / pp.61-66 / 2005
  • Experimental work in underwater acoustic communications using passive phase conjugation has shown that the demodulation error depends on the relative drift rate between the source and receiver [Rouseff et al., IEEE J. Oceanic Eng. 26, 821-831 (2001)]. The observed effect arises from the mismatch between the initial impulse response and the response after the source or receiver has changed location. In the present work, the effect of a drifting source is analyzed by numerical simulation and compared with the experimental results. The communications bit error rate is quantified as a function of drift rate, drift direction, and source-receiver range.

Compromised feature normalization method for deep neural network based speech recognition (심층신경망 기반의 음성인식을 위한 절충된 특징 정규화 방식)

  • Kim, Min Sik;Kim, Hyung Soon
    • Phonetics and Speech Sciences / v.12 no.3 / pp.65-71 / 2020
  • Feature normalization is a method to reduce the effect of environmental mismatch between the training and test conditions by normalizing the statistical characteristics of acoustic feature parameters. It yields excellent performance improvements in traditional Gaussian mixture model-hidden Markov model (GMM-HMM)-based speech recognition systems. However, in a deep neural network (DNN)-based speech recognition system, minimizing the effects of environmental mismatch does not necessarily lead to the best performance improvement. In this paper, we attribute this phenomenon to information loss caused by excessive feature normalization. We investigate whether there is a feature normalization method that maximizes speech recognition performance by properly reducing the impact of environmental mismatch while preserving information useful for training acoustic models. To this end, we introduce mean and exponentiated variance normalization (MEVN), a compromise between mean normalization (MN) and mean and variance normalization (MVN), and compare the performance of a DNN-based speech recognition system in noisy and reverberant environments according to the degree of variance normalization. Experimental results reveal that MEVN yields a slight performance improvement over MN and MVN, depending on the degree of variance normalization.
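
The idea of a compromise between MN and MVN can be sketched as follows. The abstract does not give the MEVN formula, so this sketch assumes one plausible form: each feature dimension is mean-subtracted and divided by the standard deviation raised to an exponent `alpha`, where `alpha = 0` reduces to MN and `alpha = 1` to MVN. The function name `mevn` and the parameter `alpha` are hypothetical.

```python
import math

def mevn(frames, alpha):
    """Assumed MEVN form: (x - mean) / std**alpha per feature dimension.

    frames: list of feature vectors (one list of floats per frame).
    alpha:  degree of variance normalization (0 -> MN, 1 -> MVN).
    """
    n = len(frames)
    dims = len(frames[0])
    out = [[0.0] * dims for _ in range(n)]
    for d in range(dims):
        col = [f[d] for f in frames]
        mean = sum(col) / n
        var = sum((x - mean) ** 2 for x in col) / n
        # Guard against a constant dimension (zero variance).
        std = math.sqrt(var) if var > 0 else 1.0
        scale = std ** alpha
        for t in range(n):
            out[t][d] = (col[t] - mean) / scale
    return out

# Two frames, two feature dimensions with very different spreads.
frames = [[1.0, 10.0], [3.0, 30.0]]
mn_like = mevn(frames, 0.0)    # mean removed, variance untouched
mvn_like = mevn(frames, 1.0)   # mean removed, unit variance
halfway = mevn(frames, 0.5)    # partial variance normalization
```

Intermediate values of `alpha` keep part of the original scale difference between dimensions, which is one way to read "the degree of variance normalization" in the abstract.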

Comparison of score-penalty method and matched-field processing method for acoustic source depth estimation (음원 심도 추정을 위한 스코어-패널티 기법과 정합장 처리 기법의 비교)

  • Keunhwa Lee;Wooyoung Hong;Jungyong Park;Su-Uk Son;Ho Seuk Bae;Joung-Soo Park
    • The Journal of the Acoustical Society of Korea / v.43 no.3 / pp.314-323 / 2024
  • Recently, a score-penalty method has been used for the passive acoustic tracking of marine mammals. The interesting aspect of this technique lies in its loss function, which has a penalty term representing the mismatch between the measured signal and the modeled signal, whereas traditional time-domain matched-field processing rewards the match between them. In this study, we apply the score-penalty method to the depth estimation of a passive target with a known source waveform. Assuming deep-ocean environments with uncertainties in the sound speed profile, we evaluate the score-penalty method and compare it with the time-domain matched-field processing method. We show that the score-penalty method is more accurate than the time-domain matched-field processing method in ocean environments with weak mismatch of the sound speed profile, and that it is more efficient. However, in ocean environments with strong mismatch of the sound speed profile, the score-penalty method also fails to estimate the target depth, as does the time-domain matched-field processing method.

Sound manipulation: Theory and Applications (음장 제어의 이론과 그 적용)

  • Kim, Yang-Hann
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 2008.04a / pp.468-471 / 2008
  • Sound manipulation is the control of a sound field using multiple sound sources for a given purpose. In linear acoustics, a sound field can be constructed by superimposing several fundamental sound fields, such as plane-wave and spherical sound fields. That is how we manipulate a sound field. In this paper, we introduce the theory of sound manipulation and its applications through examples of generating fundamental sound fields: a circle, a ring-shaped sound field, and a plane-wave field.

Class-Based Histogram Equalization for Robust Speech Recognition

  • Suh, Young-Joo;Kim, Hoi-Rin
    • ETRI Journal / v.28 no.4 / pp.502-505 / 2006
  • A new class-based histogram equalization method is proposed for robust speech recognition. The proposed method aims not only to compensate for the acoustic mismatch between training and test environments, but also to reduce the discrepancy between the phonetic distributions of training and test speech data. The algorithm utilizes multiple class-specific reference and test cumulative distribution functions, classifies the noisy test features into their corresponding classes, and equalizes the features using their corresponding class-specific reference and test distributions. Experiments on the Aurora 2 database demonstrated the effectiveness of the proposed method, which reduced relative errors by 18.74%, 17.52%, and 23.45% over the conventional histogram equalization method and by 59.43%, 66.00%, and 50.50% over mel-cepstral-based features for test sets A, B, and C, respectively.
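
As background for the equalization step, here is a minimal sketch of conventional (single-class) histogram equalization for one feature dimension: each test value is passed through the empirical test CDF and then through the inverse of the empirical reference CDF. It is not the class-based method of the paper; a class-based variant would presumably apply the same mapping separately within each class, which the abstract does not detail. The function name and arguments are hypothetical.

```python
import bisect

def histogram_equalize(test_values, reference_values):
    """Map test values onto the reference distribution (one dimension)."""
    ref_sorted = sorted(reference_values)
    test_sorted = sorted(test_values)
    n_test, n_ref = len(test_sorted), len(ref_sorted)
    out = []
    for x in test_values:
        # Empirical test CDF at x: rank-based estimate kept inside (0, 1).
        rank = bisect.bisect_right(test_sorted, x)
        cdf = (rank - 0.5) / n_test
        # Inverse reference CDF: take the reference sample at that quantile.
        idx = min(n_ref - 1, int(cdf * n_ref))
        out.append(ref_sorted[idx])
    return out

# Test features shifted by a constant offset relative to the reference data.
reference = [0.0, 1.0, 2.0, 3.0]
noisy = [12.0, 10.0, 11.0, 13.0]
equalized = histogram_equalize(noisy, reference)
```

Because the mapping depends only on ranks, a monotonic distortion of the test features (such as the constant offset above) is undone when the two data sets cover the same underlying content.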

Robust Speech Recognition by Utilizing Class Histogram Equalization (클래스 히스토그램 등화 기법에 의한 강인한 음성 인식)

  • Suh, Yung-Joo;Kim, Hor-Rin;Lee, Yun-Keun
    • MALSORI / no.60 / pp.145-164 / 2006
  • This paper proposes class histogram equalization (CHEQ) to compensate noisy acoustic features for robust speech recognition. CHEQ aims to compensate for the acoustic mismatch between training and test speech recognition environments as well as to reduce the limitations of conventional histogram equalization (HEQ). In contrast to HEQ, CHEQ adopts multiple class-specific distribution functions for the training and test environments and equalizes the features using their class-specific training and test distributions. Depending on the class-information extraction method, CHEQ is further classified into two forms: hard-CHEQ, based on vector quantization, and soft-CHEQ, using the Gaussian mixture model. Experiments on the Aurora 2 database confirmed the effectiveness of CHEQ, which produced a relative word error reduction of 61.17% over the baseline mel-cepstral features and of 19.62% over conventional HEQ.

Ultrasonic Test Criterion for the Explosively Welded Fe-Naval Brass Bonding Quality (초음파법에 의한 폭발접합 이종금속 접합품질 판정레벨 설정에 관한 연구)

  • 장영권;백영남
    • Journal of Welding and Joining / v.19 no.1 / pp.40-48 / 2001
  • An ultrasonic test method, a form of nondestructive testing, is applied to assess the quality of the clad interface. According to the reference codes and standards, both the Korean Industrial Standard (KS) and the American Society for Testing and Materials (ASTM) standard, ultrasonic examination procedures use the pulse-echo, A-scan, back reflection signal drop method and/or a side-drilled reference hole to establish the acceptance criteria for clad material tests. However, the variety of bonding materials and sizes makes it difficult to produce the reference blocks, and thus the criteria. To overcome these practical difficulties, a new ultrasonic testing criterion is suggested. In this method, the theoretical interface reflection signal amplitude level, which is based on the acoustic impedance mismatch at the clad interface, is calculated and proposed as the acceptance criterion for explosive clad ultrasonic inspection, with the back reflection signal set to 100% FSH (Full Screen Height). The applicability of the suggested criterion to explosively clad Fe-Naval Brass with different bonding qualities is compared against the existing KS and ASTM specifications and verified using SEM (Scanning Electron Microscope) micrographs. The results obtained by the suggested method are more conservative than those obtained according to the KS B 0234 and ASTM A 578 specifications. The suggested method could be applicable to any other combination of explosive clad ultrasonic inspection.
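
For orientation, the textbook relation behind "acoustic impedance mismatch" at a flat, perfectly bonded interface under normal incidence is shown below. The abstract does not give the authors' actual calculation, so this is only the standard pressure reflection coefficient, with medium 1 taken as the layer the pulse arrives from and medium 2 as the layer beyond the interface; attenuation, beam spread, and the conversion to % FSH are not included.

```latex
R = \frac{Z_2 - Z_1}{Z_2 + Z_1}, \qquad Z_i = \rho_i c_i
```

Here $\rho_i$ is the density and $c_i$ the longitudinal sound speed of medium $i$. A small $|R|$ means the two metals are acoustically similar, so a well-bonded interface returns only a weak echo, whereas a disbond (metal against air or a gap) gives $|R|$ close to 1.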

Speech Recognition based on Environment Adaptation using SNR Mapping (SNR 매핑을 이용한 환경적응 기반 음성인식)

  • Chung, Yong-Joo
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.9 no.5 / pp.543-548 / 2014
  • The multiple-model based speech recognition (MMSR) framework is known to be very successful in speech recognition. Because it uses multiple hidden Markov models (HMMs) that correspond to various noise types and signal-to-noise ratio (SNR) values, the selected acoustic model can closely match the noisy test speech. However, since the number of HMM sets is limited in practical use, acoustic mismatch remains a problem. In this study, we experimentally determined the optimal SNR mapping between the noisy test speech and the HMM set to mitigate the mismatch between them. Performance improved when the SNR mapping was employed instead of the SNR estimated from the noisy test speech. When we applied the proposed method to the MMSR, experimental results on the Aurora 2 database showed relative word error rate reductions of 6.3% and 9.4% compared with a conventional MMSR and multi-condition training (MTR), respectively.

Performance of Carrier Frequency Offset Compensation using CAZAC Code in Time and Spatial Variant Underwater Acoustic Channel (시·공간 변동 수중음향 채널에서 CAZAC 코드를 적용한 반송파 주파수 옵셋 보상 기법의 성능평가)

  • Park, Jihyun;Bae, Minja;Kim, Jongju;Yoon, Jong Rak
    • Journal of the Korea Institute of Information and Communication Engineering / v.20 no.7 / pp.1229-1236 / 2016
  • In an underwater acoustic multipath channel, the performance of underwater acoustic (UWA) communication systems is affected by dynamic variation of the boundaries and by the high temporal and spatial variability of the channel conditions. Time and spatial variations of the UWA channel induce a carrier frequency offset (CFO), since the phase and frequency of the received signal no longer match those of the transmitted signal. As a result, the performance of a phase shift keying UWA communication system is degraded. In this study, we analyze the performance of CFO estimation and compensation using a phase code in a time- and space-varying channel. A constant amplitude zero autocorrelation (CAZAC) signal is applied as the phase code signal, and its performance is evaluated in a water tank. The bit error rate of a quadrature phase shift keying (QPSK) system with the phase code is about 4 to 10 times better than that without the phase code.
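
The two properties named by CAZAC can be checked numerically. The abstract does not say which CAZAC family was used; the sketch below assumes a Zadoff-Chu sequence, one well-known example, with a hypothetical root of 1 and length of 13. It verifies that every sample has unit magnitude and that the periodic autocorrelation vanishes at all nonzero lags.

```python
import cmath
import math

def zadoff_chu(root, length):
    """Zadoff-Chu sequence of odd length; root must be coprime with length."""
    if length % 2 == 0 or math.gcd(root, length) != 1:
        raise ValueError("length must be odd and coprime with root")
    return [cmath.exp(-1j * math.pi * root * n * (n + 1) / length)
            for n in range(length)]

def circular_autocorrelation(seq, lag):
    """Normalized periodic autocorrelation of seq at the given lag."""
    n = len(seq)
    return sum(seq[k] * seq[(k + lag) % n].conjugate() for k in range(n)) / n

seq = zadoff_chu(root=1, length=13)

# Constant amplitude: every sample lies on the unit circle.
max_amplitude_error = max(abs(abs(s) - 1.0) for s in seq)

# Zero autocorrelation: only lag 0 is nonzero.
peak = abs(circular_autocorrelation(seq, 0))
max_sidelobe = max(abs(circular_autocorrelation(seq, lag))
                   for lag in range(1, len(seq)))
```

A sharp autocorrelation peak of this kind is what makes such a sequence useful as a known reference for timing and phase tracking, from which a frequency offset can be estimated.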

A Phase-related Feature Extraction Method for Robust Speaker Verification (열악한 환경에 강인한 화자인증을 위한 위상 기반 특징 추출 기법)

  • Kwon, Chul-Hong
    • Journal of the Korea Institute of Information and Communication Engineering / v.14 no.3 / pp.613-620 / 2010
  • Additive noise and channel distortion strongly degrade the performance of speaker verification systems, as they distort the speech features. This distortion causes a mismatch between the training and recognition conditions, such that acoustic models trained on clean speech do not accurately model noisy and channel-distorted speech. This paper presents a phase-related feature extraction method to improve the robustness of speaker verification systems. The instantaneous frequency is computed from the phase of the speech signal, and features are obtained from the histogram of the instantaneous frequency. Experimental results show that the proposed technique offers significant improvements over standard techniques in both clean and adverse testing environments.
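
The two steps named in the abstract, instantaneous frequency from phase and a histogram of that frequency, can be sketched as follows. The abstract does not specify how the phase is obtained or how the histogram is turned into features, so this sketch assumes the input is already an analytic (complex) signal, for example from a Hilbert transform, and simply returns normalized bin counts. All function names, the bin layout, and the synthetic 1250 Hz test tone are hypothetical.

```python
import cmath
import math

def instantaneous_frequency(analytic, sample_rate):
    """Instantaneous frequency (Hz) from phase differences of adjacent samples."""
    freqs = []
    for prev, cur in zip(analytic, analytic[1:]):
        # Phase increment, wrapped to (-pi, pi] by cmath.phase.
        dphi = cmath.phase(cur * prev.conjugate())
        freqs.append(dphi * sample_rate / (2 * math.pi))
    return freqs

def normalized_histogram(values, n_bins, lo, hi):
    """Fraction of values falling in each of n_bins equal bins over [lo, hi)."""
    counts = [0] * n_bins
    width = (hi - lo) / n_bins
    for v in values:
        if lo <= v < hi:
            counts[min(n_bins - 1, int((v - lo) / width))] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * n_bins

# Synthetic analytic signal: a pure 1250 Hz tone sampled at 8 kHz.
fs = 8000.0
tone = [cmath.exp(2j * math.pi * 1250.0 * n / fs) for n in range(64)]
inst_freq = instantaneous_frequency(tone, fs)
features = normalized_histogram(inst_freq, n_bins=8, lo=0.0, hi=4000.0)
```

For the pure tone, all of the histogram mass lands in the single bin covering 1000 to 1500 Hz; real speech would spread it across bins in a speaker- and channel-dependent way.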