• Title/Abstract/Keywords: Speech Enhancement

340 search results (processing time: 0.029 s)

Adaptive Noise Canceller for Speech Enhancement Using 2-D Binary Mask

  • 이기현;이정현;조진호;김명남
    • 한국멀티미디어학회논문지 / Vol. 19, No. 7 / pp. 1127-1136 / 2016
  • Speech enhancement algorithms play an important role in numerous speech signal processing applications. Over the last few decades, many speech enhancement algorithms have been studied, based on spectral subtraction, the Wiener filter, the subspace method, and others. They perform well in general, but their performance can deteriorate under specific noise types or in low-SNR environments. In this paper, a new speech enhancement algorithm based on an adaptive noise canceller is proposed. The proposed algorithm improves the performance of adaptive noise cancelling by using a 2-D binary mask. Objective experimental measures confirm that the proposed algorithm is useful and outperforms recently proposed speech enhancement algorithms.

Gain Compensation Method for Codebook-Based Speech Enhancement

  • 정승모;김무영
    • 전자공학회논문지 / Vol. 51, No. 9 / pp. 165-170 / 2014
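  • Speech enhancement techniques that remove ambient noise are receiving increasing attention as preprocessors for speech recognition. Among the various speech enhancement techniques, codebook-based speech enhancement operates efficiently even in nonstationary noise environments. However, in conventional codebook-based speech enhancement, a mismatch arises between the input signal and the speech and noise codevectors, so inaccurate gains are estimated. To compensate for the inaccurate gains, this paper proposes using a long-term noise estimation algorithm to compute a signal-to-noise-ratio-based Normalized Weighting Factor (NWF) for every frame and applying it as a compensation to the conventional gain. The proposed codebook-based speech enhancement technique showed improved performance compared with the conventional codebook-based technique.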

Performance Analysis of a Class of Single Channel Speech Enhancement Algorithms for Automatic Speech Recognition

  • 송명석;이창헌;이석필;강홍구
    • The Journal of the Acoustical Society of Korea / Vol. 29, No. 2E / pp. 86-99 / 2010
  • This paper analyzes the performance of various single channel speech enhancement algorithms when they are applied to automatic speech recognition (ASR) systems as a preprocessor. The functional modules of speech enhancement systems are first divided into four major modules: a gain estimator, a noise power spectrum estimator, an a priori signal-to-noise ratio (SNR) estimator, and a speech absence probability (SAP) estimator. We investigate the relationship between speech recognition accuracy and the role of each module. Simulation results show that the Wiener filter outperforms other gain functions such as the minimum mean square error short-time spectral amplitude (MMSE-STSA) and minimum mean square error log-spectral amplitude (MMSE-LSA) estimators when a perfect noise estimator is applied. When the performance of the noise estimator degrades, however, MMSE methods that include the decision-directed module for estimating the a priori SNR and the SAP estimation module help to improve the performance of the enhancement algorithm for speech recognition systems.
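To make two of the modules named in this abstract concrete, the following is a minimal sketch of a Wiener gain function driven by a decision-directed a priori SNR estimate for a single frequency bin. It is not the paper's implementation: the function names, the smoothing factor `alpha = 0.98`, and the assumption that the noise power is already known are all illustrative choices.

```python
def wiener_gain(xi):
    """Wiener gain for a given a priori SNR xi (linear scale)."""
    return xi / (1.0 + xi)


def decision_directed_snr(noisy_power, noise_power, prev_clean_power, alpha=0.98):
    """Decision-directed a priori SNR estimate for one frequency bin.

    noisy_power      : |Y|^2 of the current frame
    noise_power      : noise power estimate (assumed to be given here)
    prev_clean_power : enhanced power |S_hat|^2 of the previous frame
    alpha            : smoothing factor (hypothetical default)
    """
    gamma = noisy_power / noise_power        # a posteriori SNR
    ml_term = max(gamma - 1.0, 0.0)          # instantaneous SNR estimate, floored at 0
    return alpha * prev_clean_power / noise_power + (1.0 - alpha) * ml_term


def enhance_bin(noisy_powers, noise_power, alpha=0.98):
    """Return the Wiener gain applied to each frame of one bin's power sequence."""
    prev_clean = 0.0
    gains = []
    for p in noisy_powers:
        xi = decision_directed_snr(p, noise_power, prev_clean, alpha)
        g = wiener_gain(xi)
        prev_clean = (g ** 2) * p            # enhanced power, fed back to the next frame
        gains.append(g)
    return gains
```

Swapping `wiener_gain` for another gain function while keeping the same a priori SNR estimator is the kind of module-level comparison the abstract describes.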

Two-Microphone Generalized Sidelobe Canceller with Post-Filter Based Speech Enhancement in Composite Noise

  • Park, Jinsoo;Kim, Wooil;Han, David K.;Ko, Hanseok
    • ETRI Journal / Vol. 38, No. 2 / pp. 366-375 / 2016
  • This paper describes an algorithm to suppress composite noise in a two-microphone speech enhancement system for robust hands-free speech communication. The proposed algorithm has four stages. The first stage estimates the power spectral density of the residual stationary noise, which is based on the detection of nonstationary signal-dominant time-frequency bins (TFBs) at the generalized sidelobe canceller output. Second, speech-dominant TFBs are identified among the previously detected nonstationary signal-dominant TFBs, and power spectral densities of speech and residual nonstationary noise are estimated. In the final stage, the bin-wise output signal-to-noise ratio is obtained with these power estimates and a Wiener post-filter is constructed to attenuate the residual noise. Compared to the conventional beamforming and post-filter algorithms, the proposed speech enhancement algorithm shows significant performance improvement in terms of perceptual evaluation of speech quality.

Filtering of a Dissonant Frequency for Speech Enhancement

  • Kang, Sang-Ki;Baek, Seong-Joon;Lee, Ki-Yong;Sun, Koeng-Mo
    • The Journal of the Acoustical Society of Korea / Vol. 22, No. 3E / pp. 110-112 / 2003
  • There have been numerous studies on the enhancement of noisy speech signals. In this paper, we propose a completely new speech enhancement scheme: filtering of a dissonant frequency (especially F# in each octave of the tempered scale) based on the fundamental frequency, developed in the frequency domain. To evaluate the performance of the proposed enhancement scheme, subjective tests (MOS tests) were conducted. The subjective test results indicate that the proposed method provides a significant audible improvement, especially for speech contaminated by colored noise and for speech spoken in a husky voice. Therefore, when the filter is employed as a pre-filter for speech enhancement, the output speech quality and intelligibility are greatly enhanced.

Adaptive Wavelet Based Speech Enhancement with Robust VAD in Non-stationary Noise Environment

  • Sungwook Chang;Sungil Jung;Younghun Kwon;Yang, Sung-il
    • The Journal of the Acoustical Society of Korea / Vol. 22, No. 4E / pp. 161-166 / 2003
  • We present an adaptive wavelet packet based speech enhancement method with robust voice activity detection (VAD) for non-stationary noise environments. The proposed method consists of two main procedures. The first is a VAD based on the adaptive wavelet packet transform; the second is a speech enhancement procedure based on the proposed VAD method. The proposed VAD method shows remarkable performance even at low SNRs and in non-stationary noise environments. Subjective evaluation also shows that the proposed speech enhancement method performs better with wavelet bases than with the Fourier basis.

Data Augmentation for DNN-based Speech Enhancement

  • 이승관;이상민
    • 한국멀티미디어학회논문지 / Vol. 22, No. 7 / pp. 749-758 / 2019
  • This paper proposes a data augmentation algorithm to improve the performance of DNN (Deep Neural Network) based speech enhancement. Many deep learning studies explore algorithms that maximize performance with a limited amount of data. The most commonly used is data augmentation, a technique that artificially increases the amount of data. For effective data augmentation, we used a formant enhancement method that assigns different weights to the formant frequencies. The DNN model trained with the proposed data augmentation algorithm was evaluated in various noise environments. Its speech enhancement performance was compared with that of the DNN model trained with conventional data augmentation and with that of the DNN model trained without data augmentation. The proposed data augmentation algorithm showed higher speech enhancement performance than the other approaches.

A User friendly Remote Speech Input Unit in Spontaneous Speech Translation System

  • 이광석;김흥준;송진국;추연규
    • 한국정보통신학회: Conference Proceedings / 한국해양정보통신학회 2008 Spring General Conference A / pp. 784-788 / 2008
  • In this research, we propose a remote speech input unit, a new method of user-friendly speech input for speech recognition systems. We focused on hands-free operation and microphone independence as the aspects of user friendliness in speech recognition applications. Our module adopts two algorithms: automatic speech detection and speech enhancement based on microphone-array beamforming. In the performance evaluation of speech detection, the within-200 ms accuracy with respect to manually detected positions is about 97 percent in a noise environment with an SNR of 25 dB. The microphone-array-based speech enhancement using the delay-and-sum beamforming algorithm shows a maximum SNR gain of about 6 dB over a single microphone and an error reduction rate of more than 12% in speech recognition.
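The delay-and-sum beamforming mentioned in this abstract can be sketched as follows. This is a simplified illustration, not the authors' module: it assumes the per-microphone delays are already known and are whole numbers of samples, and the function name `delay_and_sum` is hypothetical.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average.

    channels : list of equal-length sample lists, one per microphone
    delays   : non-negative integer delay (in samples) with which the source
               reaches each microphone relative to the reference microphone
    """
    n = len(channels[0])
    out = []
    for t in range(n):
        acc = 0.0
        for x, d in zip(channels, delays):
            idx = t + d                       # advance the channel to undo its delay
            acc += x[idx] if idx < n else 0.0 # samples past the end count as silence
        out.append(acc / len(channels))
    return out


# Usage: the same source reaches the second microphone two samples later.
source = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0]
mic0 = source
mic1 = [0.0, 0.0, 1.0, 2.0, 3.0, 4.0]
aligned = delay_and_sum([mic0, mic1], [0, 2])
```

After alignment the speech components add coherently while uncorrelated noise at each microphone averages down, which is the source of the SNR gain over a single microphone.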


Speech Enhancement based on Smoothed Global Soft Decision

  • 조규행;박윤식;장준혁
    • 대한전자공학회논문지SP / Vol. 44, No. 6 / pp. 118-123 / 2007
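  • This paper proposes an improved Global Soft Decision (GSD) technique for speech enhancement in noisy environments. In research on statistical-model-based speech enhancement, GSD is known to be weak in the tail regions of speech; to remedy this, a new speech enhancement technique based on the Smoothed Global Likelihood Ratio (SGLR) is applied to GSD. The proposed method was compared with previous work through MOS tests in various noise environments and showed superior performance.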

A New Speech Enhancement Method Using Adaptive Digital Filter

  • 임용훈;김완구;차일환;윤대희
    • 전자공학회논문지B / Vol. 30B, No. 10 / pp. 35-41 / 1993
  • In this paper, a new speech enhancement method for speech signals corrupted by environmental noise is proposed. Two signals are obtained, one from a microphone and one from an accelerometer attached to the neck. Since the two signals are generated from the same source signal, they are closely correlated, and environmental noise has no effect on the accelerometer signal. The speech enhancement system identifies the optimum linear system between the two signals on the basis of the dependence between them. The enhanced speech is obtained by filtering the noise-free accelerometer signal. Since the characteristics of the speech signal and the environmental noise change with time, an adaptive filtering system has to be used to characterize the time-varying system. Simulation results show a 7 dB enhancement when the speech signal level is 0 dB relative to white noise.
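One common way to identify a linear system between two correlated signals adaptively is the LMS algorithm. The abstract does not state which adaptive algorithm the authors used, so the sketch below is an assumption chosen for illustration; the function name, the tap count, the step size `mu`, and the synthetic two-tap "true" system are all hypothetical.

```python
import random


def lms_filter(reference, desired, num_taps=4, mu=0.05):
    """Adapt an FIR filter so that filtering `reference` approximates `desired`.

    reference : input sequence (stands in for the accelerometer signal)
    desired   : target sequence (stands in for the microphone signal)
    Returns (output, weights).
    """
    w = [0.0] * num_taps
    buf = [0.0] * num_taps                   # recent reference samples, newest first
    output = []
    for x, d in zip(reference, desired):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))
        e = d - y                            # estimation error drives the update
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]
        output.append(y)
    return output, w


# Usage with synthetic data: the "microphone" signal is a two-tap filtered
# version of the "accelerometer" signal, with no additive noise.
rng = random.Random(0)
accel = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
mic = [0.5 * accel[n] - 0.3 * (accel[n - 1] if n > 0 else 0.0)
       for n in range(len(accel))]
enhanced, weights = lms_filter(accel, mic)
```

In this noise-free toy case the weights should approach the two-tap system used to generate `mic`; with a noisy microphone signal the filter output would instead track the component of the microphone signal that is correlated with the accelerometer signal.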
