• Title/Summary/Keyword: speech enhancement

Two-Microphone Binary Mask Speech Enhancement in Diffuse and Directional Noise Fields

  • Abdipour, Roohollah;Akbari, Ahmad;Rahmani, Mohsen
    • ETRI Journal / v.36 no.5 / pp.772-782 / 2014
  • Two-microphone binary mask speech enhancement (2mBMSE) has been of particular interest in recent literature and has shown promising results. Current 2mBMSE systems rely on spatial cues of speech and noise sources. Although these cues are helpful for directional noise sources, they lose their efficiency in diffuse noise fields. We propose a new system that is effective in both directional and diffuse noise conditions. The system exploits two features. The first determines whether a given time-frequency (T-F) unit of the input spectrum is dominated by a diffuse or directional source. A diffuse signal is certainly a noise signal, but a directional signal could correspond to a noise or speech source. The second feature discriminates between T-F units dominated by speech or directional noise signals. Speech enhancement is performed using a binary mask, calculated based on the proposed features. In both directional and diffuse noise fields, the proposed system segregates speech T-F units with hit rates above 85%. It outperforms previous solutions in terms of signal-to-noise ratio and perceptual evaluation of speech quality improvement, especially in diffuse noise conditions.
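The binary-mask step and the hit-rate figure in the abstract above can be illustrated with a minimal sketch. It does not implement the two features proposed in the paper. Instead, it assumes the speech and noise energies of every T-F unit are known and keeps a unit when its local SNR exceeds a threshold (an ideal-binary-mask style rule). The function names and the 0 dB default threshold are illustrative assumptions.

```python
import math


def binary_mask(speech_energy, noise_energy, threshold_db=0.0):
    """Return a 0/1 mask per T-F unit: 1 where the local SNR exceeds threshold_db.

    Both inputs are 2-D lists (frames x frequency bins) of non-negative energies.
    Assumption: the energies are known, which is not the case in the paper.
    """
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            if s <= 0.0:
                row.append(0)          # no speech energy: discard the unit
            elif n <= 0.0:
                row.append(1)          # speech with no noise: keep the unit
            else:
                snr_db = 10.0 * math.log10(s / n)
                row.append(1 if snr_db > threshold_db else 0)
        mask.append(row)
    return mask


def apply_mask(mixture, mask):
    """Multiply each T-F unit of the mixture by its 0/1 mask value."""
    return [[x * m for x, m in zip(x_row, m_row)]
            for x_row, m_row in zip(mixture, mask)]


def hit_rate(estimated, reference):
    """Fraction of speech-dominated units in the reference mask that the
    estimated mask also labels as speech."""
    hits = 0
    total = 0
    for e_row, r_row in zip(estimated, reference):
        for e, r in zip(e_row, r_row):
            if r == 1:
                total += 1
                hits += e
    return hits / total if total else 0.0
```

In a real system the mask would be estimated from the two microphone signals and compared against such a reference mask to obtain the hit rate.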

An Enhanced Clarity of Husky Voice by Dissonant Frequency Filtering

  • Kang, Sang-Ki;Baek, Seong-Joon
    • Speech Sciences / v.12 no.4 / pp.71-76 / 2005
  • There have been numerous studies on the enhancement of noisy speech signals. In this paper, we propose a new speech enhancement method that combines filtering of a dissonant frequency with a noise suppression algorithm. The simulation results indicate that the proposed method provides a significant gain in voice clarity. Therefore, if the proposed enhancement scheme is used as a pre-filter, the perceptual clarity of husky voice is greatly enhanced.

Speech Enhancement Using Receding Horizon FIR Filtering

  • Kim, Pyung-Soo;Kwon, Wook-Hyu;Kwon, Oh-Kyu
    • Transactions on Control, Automation and Systems Engineering / v.2 no.1 / pp.7-12 / 2000
  • A new speech enhancement algorithm for speech corrupted by slowly varying additive colored noise is suggested, based on a state-space signal model. Owing to its FIR structure and the unimportance of long-term past information, the receding horizon (RH) FIR filter, known to be a best linear unbiased estimation (BLUE) filter, is used to obtain a noise-suppressed speech signal. As a special case of the colored noise problem, the suggested approach is generalized to perform the single blind signal separation of two speech signals. It is shown that the exact speech signal is obtained when an incoming speech signal is noise-free.

Minima Controlled Speech Presence Uncertainty Tracking Method for Speech Enhancement (음성 향상을 위한 최소값 제어 음성 존재 부정확성의 추적기법)

  • Lee, Woo-Jung;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.28 no.7 / pp.668-673 / 2009
  • In this paper, we propose a minima controlled speech presence uncertainty tracking method to improve speech enhancement. The conventional speech presence uncertainty tracking method estimates distinct values of the a priori speech absence probability for different frames and channels. This estimation is inherently based on the a posteriori SNR and is used in estimating the speech absence probability (SAP). We propose a novel estimation of these distinct a priori speech absence probability values, based on minima controlled speech presence uncertainty tracking, for different frames and channels. The estimate is then applied to the calculation of the speech absence probability for speech enhancement. The performance of the proposed enhancement algorithm is evaluated by ITU-T P.862 perceptual evaluation of speech quality (PESQ) under various noise environments. We show that the proposed algorithm yields better results than the conventional speech presence uncertainty tracking method.

Speech Enhancement Using Microphone Array with MMSE-STSA Estimator Based Post-Processing (MMSE-STSA 추정치에 기반한 후처리를 갖는 마이크로폰 배열을 이용한 음성 개선)

  • Kwon Hong Seok;Son Jong Mok;Bae Keun Sung
    • Proceedings of the KSPS conference / 2002.11a / pp.187-190 / 2002
  • In this paper, a speech enhancement system using a microphone array with MMSE-STSA (Minimum Mean Square Error-Short Time Spectral Amplitude) estimator based post-processing is proposed. Speech enhancement is first carried out by conventional delay-and-sum beamforming (DSB). A new MMSE-STSA estimator is then obtained by refining the MMSE-STSA estimators from each microphone, and it is applied to the output of the conventional DSB to obtain additional speech enhancement. Computer simulations with white and pink noise show that the proposed system is superior to other approaches.
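The delay-and-sum front end mentioned in this abstract can be sketched as follows. This is only the conventional DSB stage, not the MMSE-STSA post-processing the paper proposes. The sketch assumes the per-channel delays are already known and are whole numbers of samples; the function name and argument layout are hypothetical.

```python
def delay_and_sum(channels, delays):
    """Time-align each microphone channel and average the result.

    channels: list of equal-length sample lists, one per microphone.
    delays:   delays[i] is the integer number of samples by which channel i
              lags the reference (assumed known in this sketch).
    """
    if len(channels) != len(delays):
        raise ValueError("one delay per channel is required")
    n = len(channels[0])
    out = [0.0] * n
    for signal, delay in zip(channels, delays):
        for t in range(n):
            src = t + delay            # advance the lagging channel
            if 0 <= src < n:           # samples shifted past the edge are dropped
                out[t] += signal[src]
    return [value / len(channels) for value in out]
```

After alignment the speech component adds coherently across channels, while uncorrelated noise is averaged down, which is why a post-filter is typically applied to the beamformer output rather than to the raw channels.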

Critical Banded Wavelet Packet-Based Spectral Subtractions for Speech Enhancement (음성신호개선을 위한 임계대역 웨이블렛 패킷 기반의 스펙트럼 차감법)

  • Chang, Sung-Wook;Yang, Sung-Il
    • The Journal of the Acoustical Society of Korea / v.23 no.4E / pp.125-133 / 2004
  • In this paper, we propose a critical banded wavelet packet-based spectral subtraction for speech enhancement. Critical banded wavelet packet, which reflects the human auditory system, may lead to minimization of intelligibility loss and quality improvement of the enhanced speech in the spectral domain, when combined with an appropriate spectral subtraction gain function. The proposed method shows better performance than the conventional one in comparative assessments. We also show that, for effective evaluation of enhanced speech, it is essential to consider the characteristics of speech quality measures.
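A spectral subtraction gain function of the kind referred to above can be sketched in a few lines. This is a generic power-subtraction gain with a spectral floor, not the critical banded wavelet packet method of the paper. The per-band power inputs, the `over_subtraction` factor, and the `floor` value are illustrative assumptions.

```python
import math


def subtraction_gain(noisy_power, noise_power, over_subtraction=1.0, floor=0.01):
    """Gain for one band: sqrt(max(1 - a * N / |Y|^2, floor)).

    noisy_power: power of the noisy signal in the band.
    noise_power: estimated noise power in the band.
    The floor limits how far a band can be attenuated.
    """
    if noisy_power <= 0.0:
        return 0.0
    ratio = 1.0 - over_subtraction * noise_power / noisy_power
    return math.sqrt(max(ratio, floor))


def enhance_bands(noisy_amplitudes, noise_powers, over_subtraction=1.0, floor=0.01):
    """Scale each band amplitude by its subtraction gain."""
    return [
        a * subtraction_gain(a * a, n, over_subtraction, floor)
        for a, n in zip(noisy_amplitudes, noise_powers)
    ]
```

A band whose power is well above the noise estimate passes almost unchanged, while a band at or below the noise estimate is attenuated down to the floor.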

Enhancement of Excitation in Low-bit-rate Speech Coders (저 전송률 음성 부호화기를 위한 여기 신호 개선 알고리즘에 관한 연구)

  • 이미숙;김홍국;최승호;김도영
    • Proceedings of the IEEK Conference / 2003.11a / pp.57-60 / 2003
  • In this paper, we propose a new excitation enhancement technique to improve the speech quality of low-bit-rate speech coders. The proposed technique is based on a harmonic model and is employed only in the decoding process of speech coders, without any additional bits. We develop the procedures for harmonic model parameter estimation and harmonic generation, and apply the technique to a current state-of-the-art low-bit-rate speech coder, ITU-T G.729 Annex D. Its performance is measured using the ITU-T P.862 PESQ score and compared with that of the phase dispersion filter and the long-term postfilter applied to the decoded excitation. It is shown that the proposed excitation enhancement technique can improve the quality of decoded speech and provides better quality for male speech than the other techniques.

A Speech Enhancement Algorithm based on Human Psychoacoustic Property (심리음향 특성을 이용한 음성 향상 알고리즘)

  • Jeon, Yu-Yong;Lee, Sang-Min
    • The Transactions of The Korean Institute of Electrical Engineers / v.59 no.6 / pp.1120-1125 / 2010
  • In speech systems such as hearing aids and speech communication devices, speech quality is degraded by environmental noise. In this study, to enhance speech quality degraded by environmental noise, we propose an algorithm that reduces the noise and reinforces the speech. The minima controlled recursive averaging (MCRA) algorithm is used to estimate the noise spectrum, and a spectral weighting factor is used to reduce the noise. The partial masking effect, one of the properties of human hearing, is introduced to reinforce the speech. We then compare the waveform, spectrogram, Perceptual Evaluation of Speech Quality (PESQ) score, and segmental signal-to-noise ratio (segSNR) of the original speech, noisy speech, noise-reduced speech, and speech enhanced by the proposed method. As a result, the speech enhanced by the proposed method is reinforced in the high-frequency region degraded by noise, and its PESQ and segSNR are improved, indicating that speech quality is enhanced.
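The segSNR measure used in this evaluation can be sketched as the mean of per-frame SNR values in dB. The abstract does not give the frame length or the clamping limits, so the non-overlapping frames and the -10 dB to 35 dB limits below are assumptions that follow common practice; the function name is hypothetical.

```python
import math


def segmental_snr(clean, processed, frame_len, lo_db=-10.0, hi_db=35.0):
    """Mean per-frame SNR in dB between a clean and a processed signal.

    Frames are non-overlapping; frames with no clean-signal energy are skipped,
    and each frame SNR is clamped to [lo_db, hi_db] (assumed limits).
    """
    n = min(len(clean), len(processed))
    values = []
    for start in range(0, n - frame_len + 1, frame_len):
        signal_energy = 0.0
        error_energy = 0.0
        for i in range(start, start + frame_len):
            signal_energy += clean[i] ** 2
            error_energy += (clean[i] - processed[i]) ** 2
        if signal_energy == 0.0:
            continue                   # silent frame: no meaningful SNR
        if error_energy == 0.0:
            snr_db = hi_db             # perfect reconstruction: use the upper limit
        else:
            snr_db = 10.0 * math.log10(signal_energy / error_energy)
        values.append(min(max(snr_db, lo_db), hi_db))
    return sum(values) / len(values) if values else 0.0
```

Because each frame contributes equally, low-energy speech segments weigh as much as loud ones, which is the main difference from a global SNR.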

Speech Enhancement Using Phase-Dependent A Priori SNR Estimator in Log-Mel Spectral Domain

  • Lee, Yun-Kyung;Park, Jeon Gue;Lee, Yun Keun;Kwon, Oh-Wook
    • ETRI Journal / v.36 no.5 / pp.721-729 / 2014
  • We propose a novel phase-based method for single-channel speech enhancement to extract and enhance the desired signals in noisy environments by utilizing the phase information. In the method, a phase-dependent a priori signal-to-noise ratio (SNR) is estimated in the log-mel spectral domain to utilize both the magnitude and phase information of input speech signals. The phase-dependent estimator is incorporated into the conventional magnitude-based decision-directed approach that recursively computes the a priori SNR from noisy speech. Additionally, we reduce the performance degradation owing to the one-frame delay of the estimated phase-dependent a priori SNR by using a minimum mean square error (MMSE)-based and maximum a posteriori (MAP)-based estimator. In our speech enhancement experiments, the proposed phase-dependent a priori SNR estimator is shown to improve the output SNR by 2.6 dB for both the MMSE-based and MAP-based estimator cases as compared to a conventional magnitude-based estimator.
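The conventional magnitude-based decision-directed approach that this abstract builds on can be sketched for a single frequency bin. This is not the phase-dependent log-mel estimator of the paper. The sketch assumes a constant, known noise power, uses a Wiener gain in place of the MMSE or MAP estimators, and initializes the first frame with the maximum-likelihood estimate; these choices and the function name are assumptions.

```python
def decision_directed_snr(noisy_power, noise_power, alpha=0.98):
    """Recursively estimate the a priori SNR for one frequency bin.

    noisy_power: list of |Y|^2 values, one per frame.
    noise_power: noise power for this bin (assumed constant and positive).
    alpha:       smoothing factor weighting the previous frame's clean estimate.
    Returns a list of (a_priori_snr, gain) tuples, one per frame.
    """
    results = []
    prev_clean_power = None
    for y2 in noisy_power:
        a_posteriori = y2 / noise_power
        ml_estimate = max(a_posteriori - 1.0, 0.0)
        if prev_clean_power is None:
            xi = ml_estimate           # first frame: no previous estimate (assumption)
        else:
            xi = alpha * prev_clean_power / noise_power + (1.0 - alpha) * ml_estimate
        gain = xi / (1.0 + xi)         # Wiener gain, used here for simplicity
        prev_clean_power = (gain ** 2) * y2
        results.append((xi, gain))
    return results
```

The estimate for each frame depends on the clean-speech estimate of the previous frame, which is the source of the one-frame delay that the abstract mentions.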

Method for Spectral Enhancement by Binary Mask for Speech Recognition Enhancement Under Noise Environment (잡음환경에서 음성인식 성능향상을 위한 바이너리 마스크를 이용한 스펙트럼 향상 방법)

  • Choi, Gab-Keun;Kim, Soon-Hyob
    • The Journal of the Acoustical Society of Korea / v.29 no.7 / pp.468-474 / 2010
  • The major obstacle to practical use of speech recognition is distortion caused by ambient and channel noise. In general, ambient noise degrades performance and restricts where recognition can be used. Speech recognition based on DSR (Distributed Speech Recognition) has the same problem. Various noise cancelling algorithms have been applied to solve it, but spectral loss and residual noise caused by incorrect noise estimation in low-SNR environments lower the recognition rate. This paper proposes a method for speech enhancement that uses MMSE-STSA for noise cancelling and an ideal binary mask to compensate for the damaged spectrum. In experiments in noisy environments (SNR from 15 dB down to 0 dB), the proposed method showed better spectral results and recognition performance.