• Title/Summary/Keyword: 음성검출기

Search Result 137, Processing Time 0.043 seconds

Design of a Statistical Model Based Voice Activity Detector (통계적 모델에 근거한 음성 검출기의 설계)

  • 손종서
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1998.08a
    • /
    • pp.465-469
    • /
    • 1998
  • 가변 전송율 음성 부호화기를 위한 음성 검출기를 통계적 모델을 적용하여 설계한다. 제안된 음성 검출기는 음성 파라미터를 decision-directed 방식으로 추정함으로써 LRT를 이용하여 동작 특성이 우수한 판정 규칙을 유도한다. 또한 음성 발생 사건들을 1차의 Markov process 로 모델링 함으로써 과거의 관찰들을 현재 프레임의 음성 검출 과정에서 고려할 수 있는 행오버 알고리즘을 개발한다. 개발된 음성 검출기는 고려된 실험환경에서 ITU-T 표준인 G.729 Annex B 음성 검출기보다 맹 우수한 성능을 나타내었다.

  • PDF

Statistical Model-Based Voice Activity Detection Using the Second-Order Conditional Maximum a Posteriori Criterion with Adapted Threshold (적응형 문턱값을 가지는 2차 조건 사후 최대 확률을 이용한 통계적 모델 기반의 음성 검출기)

  • Kim, Sang-Kyun;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.29 no.1
    • /
    • pp.76-81
    • /
    • 2010
  • In this paper, we propose a novel approach to improve the performance of a statistical model-based voice activity detection (VAD) which is based on the second-order conditional maximum a posteriori (CMAP). In our approach, the VAD decision rule is expressed as the geometric mean of likelihood ratios (LRs) based on adapted threshold according to the speech presence probability conditioned on both the current observation and the speech activity decisions in the pervious two frames. Experimental results show that the proposed approach yields better results compared to the statistical model-based and the CMAP-based VAD using the LR test.

Voice Activity Detection based on DBN using the Likelihood Ratio (우도비를 이용한 DBN 기반의 음성 검출기)

  • Kim, S.K.;Lee, S.M.
    • Journal of rehabilitation welfare engineering & assistive technology
    • /
    • v.8 no.3
    • /
    • pp.145-150
    • /
    • 2014
  • In this paper, we propose a novel scheme to improve the performance of a voice activity detection(VAD) which is based on the deep belief networks(DBN) with the likelihood ratio(LR). The proposed algorithm applies the DBN learning method which is trained in order to minimize the probability of detection error instead of the conventional decision rule using geometric mean. Experimental results show that the proposed algorithm yields better results compared to the conventional VAD algorithm in various noise environments.

  • PDF

Voice Activity Detection Based on SVM Classifier Using Likelihood Ratio Feature Vector (우도비 특징 벡터를 이용한 SVM 기반의 음성 검출기)

  • Jo, Q-Haing;Kang, Sang-Ki;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.26 no.8
    • /
    • pp.397-402
    • /
    • 2007
  • In this paper, we apply a support vector machine(SVM) that incorporates an optimized nonlinear decision rule over different sets of feature vectors to improve the performance of statistical model-based voice activity detection(VAD). Conventional method performs VAD through setting up statistical models for each case of speech absence and presence assumption and comparing the geometric mean of the likelihood ratio (LR) for the individual frequency band extracted from input signal with the given threshold. We propose a novel VAD technique based on SVM by treating the LRs computed in each frequency bin as the elements of feature vector to minimize classification error probability instead of the conventional decision rule using geometric mean. As a result of experiments, the performance of SVM-based VAD using the proposed feature has shown better results compared with those of reported VADs in various noise environments.

Voice Activity Detection Based on Discriminative Weight Training with Feedback (궤환구조를 가지는 변별적 가중치 학습에 기반한 음성검출기)

  • Kang, Sang-Ick;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.27 no.8
    • /
    • pp.443-449
    • /
    • 2008
  • One of the key issues in practical speech processing is to achieve robust Voice Activity Deteciton (VAD) against the background noise. Most of the statistical model-based approaches have tried to employ equally weighted likelihood ratios (LRs), which, however, deviates from the real observation. Furthermore voice activities in the adjacent frames have strong correlation. In other words, the current frame is highly correlated with previous frame. In this paper, we propose the effective VAD approach based on a minimum classification error (MCE) method which is different from the previous works in that different weights are assigned to both the likelihood ratio on the current frame and the decision statistics of the previous frame.

Voice Activity Detection Algorithm base on Radial Basis Function Networks with Dual Threshold (Radial Basis Function Networks를 이용한 이중 임계값 방식의 음성구간 검출기)

  • Kim Hong lk;Park Sung Kwon
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.12C
    • /
    • pp.1660-1668
    • /
    • 2004
  • This paper proposes a Voice Activity Detection (VAD) algorithm based on Radial Basis Function (RBF) network using dual threshold. The k-means clustering and Least Mean Square (LMS) algorithm are used to upade the RBF network to the underlying speech condition. The inputs for RBF are the three parameters in a Code Exited Linear Prediction (CELP) coder, which works stably under various background noise levels. Dual hangover threshold applies in BRF-VAD for reducing error, because threshold value has trade off effect in VAD decision. The experimental result show that the proposed VAD algorithm achieves better performance than G.729 Annex B at any noise level.

A New Statistical Voice Activity Detector Based on UMP Test (UMP 테스트에 근거한 새로운 통계적 음성검출기)

  • Jang, Keun-Won;Chang, Joon-Hyuk;Kim, Dong-Kook
    • The Journal of the Acoustical Society of Korea
    • /
    • v.26 no.1
    • /
    • pp.16-24
    • /
    • 2007
  • Voice activity detectors (VADs) are important in wireless communication and speech signal processing. In the conventional VAD methods. an expression for the likelihood ratio test (LRT) based on statistical models is derived. Then, speech or noise is decided by comparing the value of the expression with a threshold. We propose a new method with the modified decision rule based on the Gaussian distribution and the uniformly most power (UMP) test. This method requires the distribution of the absolute value of the incoming speech signal. Then we can obtain the final decision through the relation between the Rayleigh distributions. This VAD method can detect speech without a priori signal-to-noise ratio (SNR) which is required in the conventional VAD algorithms. Additionally, in the various VAD performance tests, the proposed VAD method is shown to be more effective than the traditional scheme.

Robust End Point Detection for Robot Speech Recognition Using Double Talk Detection (음성인식 로봇을 위한 동시통화검출 기반의 강인한 음성 끝점 검출)

  • Moon, Sung-Kyu;Park, Jin-Soo;Ko, Han-Seok
    • The Journal of the Acoustical Society of Korea
    • /
    • v.31 no.3
    • /
    • pp.161-169
    • /
    • 2012
  • This paper presents a robust speech end-point detector using double talk detection in echoic conditioned speech recognition robot. The proposed method consists of combining conventional end-point detector result and double talk detector result. We have tested the proposed method in isolated word recognition system under echoic conditioned environment. As a result, the proposed algorithm shows superior performance of 30 % to the available techniques in the points of speech recognition rates.

Discriminative Weight Training for a Statistical Model-Based Voice Activity Detection (통계적 모델 기반의 음성 검출기를 위한 변별적 가중치 학습)

  • Kang, Sang-Ick;Jo, Q-Haing;Park, Seung-Seop;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.26 no.5
    • /
    • pp.194-198
    • /
    • 2007
  • In this paper, we apply a discriminative weight training to a statistical model-based voice activity detection(VAD). In our approach, the VAD decision rule is expressed as the geometric mean of optimally weighted likelihood ratios(LRs) based on a minimum classification error(MCE) method which is different from the previous works in that different weights are assigned to each frequency bin which is considered more realistic. According to the experimental results, the proposed approach is found to be effective for the statistical model-based VAD using the LR test.

Statistical Voice Activity Defector Based on Signal Subspace Model (신호 준공간 모델에 기반한 통계적 음성 검출기)

  • Ryu, Kwang-Chun;Kim, Dong-Kook
    • The Journal of the Acoustical Society of Korea
    • /
    • v.27 no.7
    • /
    • pp.372-378
    • /
    • 2008
  • Voice activity detectors (VAD) are important in wireless communication and speech signal processing, In the conventional VAD methods, an expression for the likelihood ratio test (LRT) based on statistical models is derived in discrete Fourier transform (DFT) domain, Then, speech or noise is decided by comparing the value of the expression with a threshold, This paper presents a new statistical VAD method based on a signal subspace approach, The probabilistic principal component analysis (PPCA) is employed to obtain a signal subspace model that incorporates probabilistic model of noisy signal to the signal subspace method, The proposed approach provides a novel decision rule based on LRT in the signal subspace domain, Experimental results show that the proposed signal subspace model based VAD method outperforms those based on the widely used Gaussian distribution in DFT domain.