• Title/Summary/Keyword: speech presence probability (음성 존재 확률)

Search Result 32

An Optimally-Modified Multichannel Wiener Filter Using Speech Presence Probability (음성존재확률을 이용한 최적 변형 다채널 위너 필터)

  • Jeong, Sangbae;Kim, Youngil
    • Smart Media Journal / v.7 no.3 / pp.9-15 / 2018
  • This paper proposes an optimal gain modification method for the multichannel Wiener filter (MWF) using speech presence probabilities. Conventional gain modification methods for MWFs rely on relatively heuristic approaches and therefore increase speech distortion while reducing residual noise. In contrast, the proposed method, derived by solving the unconstrained minimization problem of a probability-involved cost function, reduces residual noise and signal distortion simultaneously. An evaluation of the filtered waveforms and spectrograms verifies that the proposed method yields an improved SNR with less signal distortion than the conventional MWF.

Minima Controlled Speech Presence Uncertainty Tracking Method for Speech Enhancement (음성 향상을 위한 최소값 제어 음성 존재 부정확성의 추적기법)

  • Lee, Woo-Jung;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.28 no.7 / pp.668-673 / 2009
  • In this paper, we propose a minima controlled speech presence uncertainty tracking method to improve speech enhancement. The conventional speech presence uncertainty tracking method estimates distinct values of the a priori speech absence probability for different frames and channels. This estimation is inherently based on the a posteriori SNR and is used in estimating the speech absence probability (SAP). We propose a novel estimation of these distinct a priori speech absence probability values, based on minima controlled speech presence uncertainty tracking, for different frames and channels. The estimate is then applied to the calculation of the speech absence probability for speech enhancement. The performance of the proposed enhancement algorithm is evaluated by ITU-T P.862 perceptual evaluation of speech quality (PESQ) under various noise environments. We show that the proposed algorithm yields better results than the conventional speech presence uncertainty tracking method.

Statistical Model-Based Voice Activity Detection Using Spatial Cues for Dual-Channel Noisy Speech Recognition (이중채널 잡음음성인식을 위한 공간정보를 이용한 통계모델 기반 음성구간 검출)

  • Shin, Min-Hwa;Park, Ji-Hun;Kim, Hong-Kook
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 2010.07a / pp.150-151 / 2010
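  • This paper proposes a statistical model-based voice activity detection (VAD) method for dual-channel speech recognition in noisy environments. The proposed method uses spatial cues obtained from the multichannel input signal to build probability models of speech presence and speech absence, and performs VAD with these models. The spatial cues are the inter-channel time difference and the inter-channel level difference, and the speech presence and absence probabilities are represented by probability models based on Gaussian kernel densities. Speech segments are detected by estimating, for each time frame, the ratio of the speech absence probability to the speech presence probability. To evaluate the proposed method, speech recognition performance is measured using only the detected segments as input. In the experiments, the proposed statistical model-based VAD using spatial cues achieved relative recognition error rate reductions of 15.6% and 15.4% compared with statistical model-based VAD using frequency-domain energy and VAD based on frequency spectral density, respectively.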

Determinant-based two-channel noise reduction method using speech presence probability (음성존재확률을 이용한 행렬식 기반 2채널 잡음제거기법)

  • Park, Jinuk;Hong, Jungpyo
    • Journal of the Korea Institute of Information and Communication Engineering / v.26 no.5 / pp.649-655 / 2022
  • In this paper, a determinant-based two-channel noise reduction method that utilizes the speech presence probability (SPP) is proposed. The proposed method improves on the noise reduction performance of the conventional determinant-based two-channel noise reduction method in [7] by applying the SPP to the Wiener filter gain. Consequently, the proposed method adaptively controls the amount of noise reduction depending on the SPP. For performance evaluation, the segmental signal-to-noise ratio (SNR), the perceptual evaluation of speech quality, the short-time objective intelligibility, and the log spectral distance were measured in simulated noisy environments covering various types of noise, reverberation, SNR, and the direction and number of noise sources. The experimental results showed that determinant-based methods outperform phase difference-based methods in most cases. In particular, the proposed method achieved the best noise reduction performance while maintaining minimum speech distortion.
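
The idea of letting the SPP control the amount of noise reduction can be illustrated with a generic soft-gain rule. The sketch below is not the method of the paper above: it assumes a single-channel Wiener gain xi / (1 + xi) computed from an a priori SNR `xi`, and blends it with a gain floor `g_min` according to the SPP. The function name, the blending rule, and the default floor are all illustrative assumptions.

```python
def spp_weighted_gain(xi, spp, g_min=0.1):
    """Blend a Wiener gain with a gain floor according to the SPP.

    xi    : a priori SNR estimate for one time-frequency bin (linear scale)
    spp   : speech presence probability in [0, 1]
    g_min : assumed gain floor applied when speech is judged absent
    """
    wiener = xi / (1.0 + xi)  # classical Wiener gain
    # High SPP keeps the Wiener gain; low SPP falls back to the floor,
    # so more attenuation is applied where speech is unlikely.
    return spp * wiener + (1.0 - spp) * g_min
```

With these assumptions, a bin with `spp` near 1 is filtered by the plain Wiener gain, while a bin with `spp` near 0 is attenuated down to `g_min`.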

Speech Enhancement based on Minima Controlled Recursive Averaging Technique Incorporating Second-order Conditional Maximum a posteriori Criterion (2차 조건 사후 최대 확률 기반 최소값 제어 재귀평균기법을 이용한 음성향상)

  • Kum, Jong-Mo;Chang, Joon-Hyuk
    • Journal of the Institute of Electronics Engineers of Korea SP / v.46 no.4 / pp.132-138 / 2009
  • In this paper, we propose a novel approach, based on the second-order conditional maximum a posteriori (CMAP) criterion, to improve the performance of minima controlled recursive averaging (MCRA). An investigation of the MCRA scheme shows that it cannot fully account for the inter-frame correlation of voice activity, since the noise power estimate is adjusted by a speech presence probability that depends on the observation of the current frame. To address this limitation, the proposed MCRA approach incorporates the second-order CMAP criterion, in which the noise power estimate is obtained using the speech presence probability conditioned on both the current observation and the speech activity decisions in the previous two frames. Experimental results show that the proposed MCRA technique based on the second-order CMAP yields better results than the conventional MCRA method.

Speech Enhancement Based on Minima Controlled Recursive Averaging Technique Incorporating Conditional MAP (조건 사후 최대 확률 기반 최소값 제어 재귀평균기법을 이용한 음성향상)

  • Kum, Jong-Mo;Park, Yun-Sik;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.27 no.5 / pp.256-261 / 2008
  • In this paper, we propose a novel approach, based on the conditional maximum a posteriori criterion, to improve the performance of minima controlled recursive averaging (MCRA). A crucial component of a practical speech enhancement system is the estimation of the noise power spectrum, and one state-of-the-art approach is the MCRA technique. The noise estimate in the MCRA technique is obtained by averaging past spectral power values with a smoothing parameter that is adjusted by the signal presence probability in frequency subbands. We improve the MCRA by using a speech presence probability that is the a posteriori probability conditioned on both the current observation and the speech presence or absence of the previous frame. Using ITU-T P.862 perceptual evaluation of speech quality (PESQ) and subjective evaluation of speech quality as performance criteria, we show that the proposed algorithm yields better results than the conventional MCRA-based scheme.
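
The recursive averaging step described in this abstract can be sketched as follows. This is a minimal illustration of the standard MCRA-style update, in which the smoothing parameter moves toward 1 as the speech presence probability grows. It does not include the conditional MAP extension proposed in the paper. The function name and the base smoothing constant `alpha_d=0.95` are assumptions chosen for illustration.

```python
def mcra_noise_update(noise_prev, noisy_power, spp, alpha_d=0.95):
    """One recursive-averaging step of the noise power estimate, per bin.

    noise_prev  : previous noise power estimates, one per frequency bin
    noisy_power : current noisy power spectrum |Y|^2, one per frequency bin
    spp         : speech presence probability per bin, in [0, 1]
    alpha_d     : assumed base smoothing constant
    """
    updated = []
    for lam, y2, p in zip(noise_prev, noisy_power, spp):
        # Time-varying smoothing parameter: equals alpha_d when speech is
        # absent (p = 0) and 1 when speech is certainly present (p = 1).
        alpha = alpha_d + (1.0 - alpha_d) * p
        updated.append(alpha * lam + (1.0 - alpha) * y2)
    return updated
```

When `spp` is 1 in a bin, the estimate is frozen at its previous value; when `spp` is 0, the estimate tracks the noisy power at the rate set by `alpha_d`.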

Global Soft Decision Based on Improved Speech Presence Uncertainty Tracking Method Incorporating Spectral Gradient (스펙트럼 변이 기반의 향상된 음성 존재 불확실성 추적 기법을 이용한 Global Soft Decision)

  • Kim, Jong-Woong;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.32 no.3 / pp.279-285 / 2013
  • In this paper, we propose a novel speech enhancement method that improves the conventional global soft decision by applying a spectral gradient method to the ratio of the a priori speech absence and presence probabilities (q). The conventional global soft decision scheme uses a fixed value of q in accordance with the assumed hypothesis, whereas the proposed algorithm improves the speech absence probability by adaptively varying q according to the speech presence or absence in the previous two frames and the conditions on the spectral gradient value. Experimental results show that the proposed improved global soft decision method based on the spectral gradient yields better results than the conventional global soft decision technique in terms of ITU-T P.862 perceptual evaluation of speech quality (PESQ).

Statistical Model-Based Voice Activity Detection Using the Second-Order Conditional Maximum a Posteriori Criterion with Adapted Threshold (적응형 문턱값을 가지는 2차 조건 사후 최대 확률을 이용한 통계적 모델 기반의 음성 검출기)

  • Kim, Sang-Kyun;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.29 no.1 / pp.76-81 / 2010
  • In this paper, we propose a novel approach, based on the second-order conditional maximum a posteriori (CMAP) criterion, to improve the performance of statistical model-based voice activity detection (VAD). In our approach, the VAD decision rule is expressed as the geometric mean of likelihood ratios (LRs) compared against a threshold that is adapted according to the speech presence probability conditioned on both the current observation and the speech activity decisions in the previous two frames. Experimental results show that the proposed approach yields better results than the statistical model-based VAD and the CMAP-based VAD using the LR test.
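
A decision rule of this kind can be sketched as below. The per-bin likelihood ratio uses the common complex Gaussian statistical model, LR = exp(gamma * xi / (1 + xi)) / (1 + xi), with a posteriori SNR `gamma` and a priori SNR `xi`. The log of the geometric mean of the LRs is the arithmetic mean of the log LRs. The threshold table keyed by the two previous decisions only illustrates the idea of an adapted threshold: its values are placeholders, not figures from the paper, and all names are hypothetical.

```python
import math

# Placeholder thresholds indexed by the decisions (1 = speech, 0 = noise)
# in the two previous frames; these values are illustrative only.
EXAMPLE_THRESHOLDS = {(0, 0): 1.0, (0, 1): 0.5, (1, 0): 0.5, (1, 1): 0.0}


def log_likelihood_ratio(gamma, xi):
    """Log LR of one bin under the Gaussian model."""
    return gamma * xi / (1.0 + xi) - math.log1p(xi)


def vad_decision(gammas, xis, prev_decisions, thresholds=EXAMPLE_THRESHOLDS):
    """Return 1 (speech) or 0 (noise) for the current frame.

    gammas, xis    : per-bin a posteriori and a priori SNRs (linear scale)
    prev_decisions : tuple (decision two frames ago, decision one frame ago)
    """
    llrs = [log_likelihood_ratio(g, x) for g, x in zip(gammas, xis)]
    mean_llr = sum(llrs) / len(llrs)  # log of the geometric mean of the LRs
    return 1 if mean_llr > thresholds[prev_decisions] else 0
```

In this sketch, a frame that follows two speech frames is tested against a lower threshold than a frame that follows two noise frames, which is one simple way to let past decisions bias the current one.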

Improved speech enhancement of multi-channel Wiener filter using adjustment of principal subspace vector (다채널 위너 필터의 주성분 부공간 벡터 보정을 통한 잡음 제거 성능 개선)

  • Kim, Gibak
    • The Journal of the Acoustical Society of Korea / v.39 no.5 / pp.490-496 / 2020
  • We present a method to improve the performance of the multi-channel Wiener filter in noisy environments. When building a subspace-based multi-channel Wiener filter for a single target source, the target speech component can be effectively estimated in the principal subspace of the speech correlation matrix. The speech correlation matrix can be estimated by subtracting the noise correlation matrix from the signal correlation matrix, based on the assumption that the cross-correlation between speech and interfering noise is negligible compared with the speech correlation. However, this assumption is not valid in the presence of strong interfering noise, and a significant error can be induced in the principal subspace as a result. In this paper, we propose to adjust the principal subspace vector using the speech presence probability and the steering vector for the desired speech source. The multi-channel speech presence probability is derived in the principal subspace and applied to adjust the principal subspace vector. Simulation results show that the proposed method improves the performance of the multi-channel Wiener filter in noisy environments.

Improved Global-Soft Decision Incorporating Second-Order Conditional MAP for Speech Enhancement (음성향상을 위한 2차 조건 사후 최대 확률기법 기반 Global Soft Decision)

  • Kum, Jong-Mo;Chang, Joon-Hyuk
    • The Journal of Korean Institute of Communications and Information Sciences / v.34 no.6C / pp.588-592 / 2009
  • In this paper, we propose a novel method, based on the second-order conditional maximum a posteriori (CMAP) criterion, to improve the performance of the global soft decision. The conventional global soft decision scheme has the disadvantage that the speech absence probability, adjusted by a fixed parameter, is sensitive to varying noise environments. In the proposed approach using the second-order CMAP, the speech absence probability is more flexible because it exploits not only the current observation but also the speech activity decisions in the previous two frames. Experimental results show that the proposed improved global soft decision method based on the second-order CMAP yields better results than the conventional global soft decision technique in terms of ITU-T P.862 perceptual evaluation of speech quality (PESQ).