• Title/Summary/Keyword: 비정상 잡음환경

Search Result 50, Processing Time 0.039 seconds

Adaptive Threshold for Speech Enhancement in Nonstationary Noisy Environments (비정상 잡음환경에서 음질향상을 위한 적응 임계 치 알고리즘)

  • Lee, Soo-Jeong;Kim, Sun-Hyob
    • The Journal of the Acoustical Society of Korea
    • /
    • v.27 no.7
    • /
    • pp.386-393
    • /
    • 2008
  • This paper proposes a new approach for speech enhancement in highly nonstationary noisy environments. The spectral subtraction (SS) is a well known technique for speech enhancement in stationary noisy environments. However, in real world, noise is mostly nonstationary. The proposed method uses an auto control parameter for an adaptive threshold to work well in highly nonstationary noisy environments. Especially, the auto control parameter is affected by a linear function associated with an a posteriori signal to noise ratio (SNR) according to the increase or the decrease of the noise level. The proposed algorithm is combined with spectral subtraction (SS) using a hangover scheme (HO) for speech enhancement. The performances of the proposed method are evaluated ITU-T P.835 signal distortion (SIG) and the segment signal to-noise ratio (SNR) in various and highly nonstationary noisy environments and is superior to that of conventional spectral subtraction (SS) using a hangover (HO) and SS using a minimum statistics (MS) methods.

Noise-Biased Compensation of Minimum Statistics Method using a Nonlinear Function and A Priori Speech Absence Probability for Speech Enhancement (음질향상을 위해 비선형 함수와 사전 음성부재확률을 이용한 최소통계법의 잡음전력편의 보상방법)

  • Lee, Soo-Jeong;Lee, Gang-Seong;Kim, Sun-Hyob
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.1
    • /
    • pp.77-83
    • /
    • 2009
  • This paper proposes a new noise-biased compensation of minimum statistics(MS) method using a nonlinear function and a priori speech absence probability(SAP) for speech enhancement in non-stationary noisy environments. The minimum statistics(MS) method is well known technique for noise power estimation in non-stationary noisy environments. It tends to bias the noise estimate below that of true noise level. The proposed method is combined with an adaptive parameter based on a sigmoid function and a priori speech absence probability (SAP) for biased compensation. Specifically. we apply the adaptive parameter according to the a posteriori SNR. In addition, when the a priori SAP equals unity, the adaptive biased compensation factor separately increases ${\delta}_{max}$ each frequency bin, and vice versa. We evaluate the estimation of noise power capability in highly non-stationary and various noise environments, the improvement in the segmental signal-to-noise ratio (SNR), and the Itakura-Saito Distortion Measure (ISDM) integrated into a spectral subtraction (SS). The results shows that our proposed method is superior to the conventional MS approach.

Robust Speech Enhancement Based on Soft Decision Employing Spectral Deviation (스펙트럼 변이를 이용한 Soft Decision 기반의 음성향상 기법)

  • Choi, Jae-Hun;Chang, Joon-Hyuk;Kim, Nam-Soo
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.47 no.5
    • /
    • pp.222-228
    • /
    • 2010
  • In this paper, we propose a new approach to noise estimation incorporating spectral deviation with soft decision scheme to enhance the intelligibility of the degraded speech signal in non-stationary noisy environments. Since the conventional noise estimation technique based on soft decision scheme estimates and updates the noise power spectrum using a fixed smoothing parameter which was assumed in stationary noisy environments, it is difficult to obtain the robust estimates of noise power spectrum in non-stationary noisy environments that spectral characteristics of noise signal such as restaurant constantly change. In this paper, once we first classify the stationary noise and non-stationary noise environments based on the analysis of spectral deviation of noise signal, we adaptively estimate and update the noise power spectrum according to the classified noise types. The performances of the proposed algorithm are evaluated by ITU-T P. 862 perceptual evaluation of speech quality (PESQ) under various ambient noise environments and show better performances compared with the conventional method.

CASA Based Approach to Estimate Acoustic Transfer Function Ratios (CASA 기반의 마이크간 전달함수 비 추정 알고리즘)

  • Shin, Minkyu;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea
    • /
    • v.33 no.1
    • /
    • pp.54-59
    • /
    • 2014
  • Identification of RTF (Relative Transfer Function) between sensors is essential to multichannel speech enhancement system. In this paper, we present an approach for estimating the relative transfer function of speech signal. This method adapts a CASA (Computational Auditory Scene Analysis) technique to the conventional OM-LSA (Optimally-Modified Log-Spectral Amplitude) based approach. Evaluation of the proposed approach is performed under simulated stationary and nonstationary WGN (White Gaussian Noise). Experimental results confirm advantages of the proposed approach.

Intelligent Adaptive Active Noise Control in Non-stationary Noise Environments (비정상 잡음환경에서의 지능형 적응 능동소음제어)

  • Mu, Xiangbin;Ko, JinSeok;Rheem, JaeYeol
    • The Journal of the Acoustical Society of Korea
    • /
    • v.32 no.5
    • /
    • pp.408-414
    • /
    • 2013
  • The famous filtered-x least mean square (FxLMS) algorithm for active noise control (ANC) systems may become unstable in non-stationary noise environment. To solve this problem, Sun's algorithm and Akhtar's algorithm are developed based on modifying the reference signal in update of FxLMS algorithm, but these two algorithms have dissatisfactory stability in dealing with sustaining impulsive noise. In proposed algorithm, probability estimation and zero-crossing rate (ZCR) control are used to improve the stability and performance, at the same time, an optimal parameter selection based on fuzzy system is utilized. Computer simulation results prove the proposed algorithm has faster convergence and better stability in non-stationary noise environment.

Speech enhancement based on reinforcement learning (강화학습 기반의 음성향상기법)

  • Park, Tae-Jun;Chang, Joon-Hyuk
    • Proceedings of the Korea Information Processing Society Conference
    • /
    • 2018.05a
    • /
    • pp.335-337
    • /
    • 2018
  • 음성향상기법은 음성에 포함된 잡음이나 잔향을 제거하는 기술로써 마이크로폰으로 입력된 음성신호는 잡음이나 잔향에 의해 왜곡되어지므로 음성인식, 음성통신 등의 음성신호처리 기술의 핵심 기술이다. 이전에는 음성신호와 잡음신호 사이의 통계적 정보를 이용하는 통계모델 기반의 음성향상기법이 주로 사용되었으나 통계 모델 기반의 음성향상기술은 정상 잡음 환경과는 달리 비정상 잡음 환경에서 성능이 크게 저하되는 문제점을 가지고 있었다. 최근 머신러닝 기법인 심화신경망 (DNN, deep neural network)이 도입되어 음성 향상 기법에서 우수한 성능을 내고 있다. 심화신경망을 이용한 음성 향상 기법은 다수의 은닉 층과 은닉 노드들을 통하여 잡음이 존재하는 음성 신호와 잡음이 존재하지 않는 깨끗한 음성 신호 사이의 비선형적인 관계를 잘 모델링하였다. 이러한 심화신경망 기반의 음성향상기법을 향상 시킬 수 있는 방법 중 하나인 강화학습을 적용하여 기존 심화신경망 대비 성능을 향상시켰다. 강화학습이란 대표적으로 구글의 알파고에 적용된 기술로써 특정 state에서 최고의 reward를 받기 위해 어떠한 policy를 통한 action을 취해서 다음 state로 나아갈지를 매우 많은 경우에 대해 학습을 통해 최적의 action을 선택할 수 있도록 학습하는 방법을 말한다. 본 논문에서는 composite measure를 기반으로 reward를 설계하여 기존 PESQ (Perceptual Evaluation of Speech Quality) 기반의 reward를 설계한 기술 대비 음성인식 성능을 높였다.

Background Noise Classification in Noisy Speech of Short Time Duration Using Improved Speech Parameter (개량된 음성매개변수를 사용한 지속시간이 짧은 잡음음성 중의 배경잡음 분류)

  • Choi, Jae-Seung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.20 no.9
    • /
    • pp.1673-1678
    • /
    • 2016
  • In the area of the speech recognition processing, background noises are caused the incorrect response to the speech input, therefore the speech recognition rates are decreased by the background noises. Accordingly, a more high level noise processing techniques are required since these kinds of noise countermeasures are not simple. Therefore, this paper proposes an algorithm to distinguish between the stationary background noises or non-stationary background noises and the speech signal having short time duration in the noisy environments. The proposed algorithm uses the characteristic parameter of the improved speech signal as an important measure in order to distinguish different types of the background noises and the speech signals. Next, this algorithm estimates various kinds of the background noises using a multi-layer perceptron neural network. In this experiment, it was experimentally clear the estimation of the background noises and the speech signals.

Whitening Method for Performance Improvement of the Matched Filter in the Non-White Noise Environment (비백색 잡음 환경에서 정합필터 성능개선을 위한 백색화 기법)

  • Kim Jeong-Goo
    • Proceedings of the Korea Society for Industrial Systems Conference
    • /
    • 2006.05a
    • /
    • pp.111-114
    • /
    • 2006
  • 비백색잡음(non-white noise)인 잔향(reverberation)이 신호탐지(signal detection)의 주 방해신호인 천해 능동소나(active sonar) 환경에서의 표적탐지는 선백색화기(pre-whitening filter)를 사용하여 수신신호를 백색화한 후 백색잡음에서 최적 탐지기(optimum detector)인 정합필터를 사용한다. 그러나 이 방법은 잔향이 비정상(non-stationary) 특성을 가지기 때문에 구현이 매우 힘들다. 기존의 연구에 따르면 이러한 잔향은 지역적 정상상태(local stationary)라고 가정할 수 있다. 본 논문에서는 먼저 잔향신호의 지역적 정상상태의 범위를 추정(estimation)하고, 이 추정을 바탕으로 천해와 같은 비백색 잔향신호 환경에서 선백색화 블럭 정규화 정합필터(pre-whitening block normalized matched filter)의 성능을 개선할 수 있는 선백색화 기법을 제안하였다. 제안된 잔향신호의 백색화 기법은 표적신호 전 후의 잔향신호를 사용하여 처리블록(processing block)을 백색화하기 때문에 기존의 백색화 기법보다 우수한 성능을 보였다. 제안된 백색화 기법을 이용한 탐지기의 성능을 평가하기 위해 우리나라 인근해역에서 실측된 데이터를 이용하여 컴퓨터 모의실험을 수행하였다. 모의실험 결과 제안된 기법을 사용한 탐지기는 기존의 백색화 기법을 사용한 탐지기보다 우수한 탐지성능을 보였다.

  • PDF

Estimation and Weighting of Sub-band Reliability for Multi-band Speech Recognition (다중대역 음성인식을 위한 부대역 신뢰도의 추정 및 가중)

  • 조훈영;지상문;오영환
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.6
    • /
    • pp.552-558
    • /
    • 2002
  • Recently, based on the human speech recognition (HSR) model of Fletcher, the multi-band speech recognition has been intensively studied by many researchers. As a new automatic speech recognition (ASR) technique, the multi-band speech recognition splits the frequency domain into several sub-bands and recognizes each sub-band independently. The likelihood scores of sub-bands are weighted according to reliabilities of sub-bands and re-combined to make a final decision. This approach is known to be robust under noisy environments. When the noise is stationary a sub-band SNR can be estimated using the noise information in non-speech interval. However, if the noise is non-stationary it is not feasible to obtain the sub-band SNR. This paper proposes the inverse sub-band distance (ISD) weighting, where a distance of each sub-band is calculated by a stochastic matching of input feature vectors and hidden Markov models. The inverse distance is used as a sub-band weight. Experiments on 1500∼1800㎐ band-limited white noise and classical guitar sound revealed that the proposed method could represent the sub-band reliability effectively and improve the performance under both stationary and non-stationary band-limited noise environments.

A Land and Maritime Unified Tourism Information Guide System Based on Robust Speech Recognition in Ship Noise Environments (선박 잡음 환경에서의 강건한 음성 인식 기반 육해상 통합 관광 정보 안내 시스템)

  • Jeon, Kwang Myung;Lee, Jang Won;Park, Ji Hun;Lee, Seong Ro;Lee, Yeonwoo;Maeng, Se Young;Kim, Hong Kook
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.38C no.2
    • /
    • pp.189-195
    • /
    • 2013
  • In this paper, a land and maritime unified tourism information guide system is proposed which employs robust speech recognition in ship noise environments. Most of conventional front-ends for speech recognition have used a Wiener filter to compensate for stationary noise such as car or babble noises. However, such the conventional front-ends have limitation in reducing non-stationary noise that are occurred inside the ship on voyage. To overcome such a limitation, the proposed system incorporates nonlinear multi-band spectral subtraction to provide highly accurate tourism route recognition. It is shown from the experiment that compared to a conventional system the proposed system achieves relative improvement of a tourism route recognition rate by 5.54% under a noise condition of 10 dB signal-to-noise ratio (SNR).