• Title/Summary/Keyword: Voice activity detection (VAD)

A Comparative Study of Voice Activity Detection Algorithms in Adverse Environments (잡음 환경에서의 음성 검출 알고리즘 비교 연구)

  • Yang Kyong-Chul;Yook Dong-Suk
    • Proceedings of the KSPS conference / 2006.05a / pp.45-48 / 2006
  • As speech recognition systems are used in many emerging applications, robust performance under extremely noisy conditions becomes more important. Voice activity detection (VAD) is regarded as one of the important factors for robust speech recognition. In this paper, we investigate conventional VAD algorithms and analyze the strengths and weaknesses of each.

VAD By Neural Network Under Wireless Communication Systems (Neural Network을 이용한 무선 통신시스템에서의 VAD)

  • Lee Hosun;Kim Sukyung;Park Sung-Kwon
    • The Journal of Korean Institute of Communications and Information Sciences / v.30 no.12C / pp.1262-1267 / 2005
  • An elliptical basis function (EBF) neural network operates stably under high-level background noise and makes nonlinear processing possible. It can be adapted to real-time VAD with a simple design. This paper introduces a VAD implementation using EBF networks, and the experimental results show that the EBF VAD outperforms G.729 Annex B and RBF neural networks. The best error rates achieved by the EBF networks improved by more than 70% for speech and 50% for silence relative to those achieved by the G.729 Annex B and RBF networks.

A Single Channel Voice Activity Detection for Noisy Environments Using Wavelet Packet Decomposition and Teager Energy (웨이블렛 패킷 변환과 Teager 에너지를 이용한 잡음 환경에서의 단일 채널 음성 판별)

  • Koo, Boneung
    • The Journal of the Acoustical Society of Korea / v.33 no.2 / pp.139-145 / 2014
  • In this paper, a feature parameter is obtained by applying the Teager energy to the wavelet packet decomposition (WPD) coefficients. The threshold value is obtained from the means and standard deviations of nonspeech frames. Experimental results using the TIMIT speech and NOISEX-92 noise databases show that the proposed algorithm is superior to the typical VAD algorithm. Receiver operating characteristic (ROC) curves are used to compare the performance of the VADs for SNR values ranging from 10 to -10 dB.
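
The feature and threshold described in this abstract can be illustrated with a minimal sketch. It assumes the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], applied to coefficients that have already been produced by a wavelet packet decomposition (not implemented here). The function names, the mean-absolute aggregation per frame, and the multiplier `k` are illustrative assumptions and are not taken from the paper.

```python
from statistics import mean, pstdev


def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1] for interior samples."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def frame_feature(coeffs):
    """Mean absolute Teager energy of one frame of coefficients.

    Assumption: `coeffs` holds WPD coefficients of a single frame (at least 3 values);
    averaging the absolute values is a hypothetical choice of aggregation.
    """
    te = teager_energy(coeffs)
    return sum(abs(v) for v in te) / len(te)


def noise_threshold(nonspeech_features, k=3.0):
    """Threshold = mean + k * standard deviation of features from nonspeech frames.

    Assumption: the multiplier k is hypothetical; the abstract only states that
    means and standard deviations of nonspeech frames are used.
    """
    return mean(nonspeech_features) + k * pstdev(nonspeech_features)


def is_speech(coeffs, threshold):
    """Declare a frame as speech when its feature exceeds the noise threshold."""
    return frame_feature(coeffs) > threshold
```

In use, `noise_threshold` would be computed once from frames known to contain only noise (for example, the first few frames of a recording), and `is_speech` applied to each subsequent frame.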

Discriminative Weight Training for a Statistical Model-Based Voice Activity Detection (통계적 모델 기반의 음성 검출기를 위한 변별적 가중치 학습)

  • Kang, Sang-Ick;Jo, Q-Haing;Park, Seung-Seop;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.26 no.5 / pp.194-198 / 2007
  • In this paper, we apply discriminative weight training to statistical model-based voice activity detection (VAD). In our approach, the VAD decision rule is expressed as the geometric mean of optimally weighted likelihood ratios (LRs), with the weights obtained by a minimum classification error (MCE) method. This differs from previous work in that a different weight is assigned to each frequency bin, which is considered more realistic. According to the experimental results, the proposed approach is effective for statistical model-based VAD using the LR test.
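
The decision rule described in this abstract can be sketched as a weighted geometric mean of per-bin likelihood ratios, evaluated in the log domain. This is a minimal illustration under stated assumptions: the per-bin LRs are taken as given positive numbers, the weights are assumed to sum to one, and the MCE training that would produce the weights is not implemented. The function names are hypothetical.

```python
import math


def weighted_log_lr(lrs, weights=None):
    """Log of the weighted geometric mean of per-bin likelihood ratios.

    log(prod_k lr_k ** w_k) = sum_k w_k * log(lr_k).
    With weights=None, uniform weights 1/K are used, which gives the plain
    (unweighted) geometric mean.
    """
    if weights is None:
        weights = [1.0 / len(lrs)] * len(lrs)
    if len(weights) != len(lrs):
        raise ValueError("weights and lrs must have the same length")
    return sum(w * math.log(lr) for w, lr in zip(weights, lrs))


def vad_decision(lrs, threshold, weights=None):
    """Return True (speech) when the weighted geometric mean exceeds the threshold."""
    return weighted_log_lr(lrs, weights) > math.log(threshold)
```

With uniform weights, `vad_decision` compares the ordinary geometric mean of the LRs against the threshold; passing per-bin weights lets more reliable frequency bins contribute more to the decision.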

An Efficient Voice Activity Detection Method using Bi-Level HMM (Bi-Level HMM을 이용한 효율적인 음성구간 검출 방법)

  • Jang, Guang-Woo;Jeong, Mun-Ho
    • The Journal of the Korea Institute of Electronic Communication Sciences / v.10 no.8 / pp.901-906 / 2015
  • We present a method for voice activity detection (VAD) using a bi-level HMM. Conventional methods require additional post-processing or rule-based delayed frames. To cope with this problem, we apply to VAD a bi-level HMM, which has a state layer inserted into a typical HMM, and use the posterior ratio of voice states to detect voice periods. Using Mel-frequency cepstral coefficients (MFCCs) as observation vectors, we performed experiments on voice data at different SNRs and achieved satisfactory results compared with well-known methods.

Voice Activity Detection Using Modified Power Spectral Deviation Based on Teager Energy (Teager Energy 기반의 수정된 파워 스펙트럼 편차를 이용한 음성 검출)

  • Song, J.H.;Song, Y.R.;Shim, H.M.;Lee, S.M.
    • Journal of Rehabilitation Welfare Engineering & Assistive Technology / v.8 no.1 / pp.41-46 / 2014
  • In this paper, we propose a novel voice activity detection (VAD) algorithm using feature vectors based on Teager energy (TE). Specifically, the power spectral deviation (PSD), which is used as the VAD feature in the IS-127 noise suppression algorithm, is obtained after the input signal is transformed by the Teager energy operator. In addition, a TE-based likelihood ratio is derived in each frame to modify the PSD for further VAD. The performance of the proposed VAD algorithm is evaluated by objective tests (total error rate, receiver operating characteristics, perceptual evaluation of speech quality) under various environments, and the proposed method yields better results than conventional VAD algorithms in non-stationary noise environments below 5 dB SNR (total error rate: 2.6% decrease; PESQ score: 0.053 improvement).

A Study on a Robust Voice Activity Detector Under the Noise Environment in the G.723.1 Vocoder (G.723.1 보코더에서 잡음환경에 강인한 음성활동구간 검출기에 관한 연구)

  • 이희원;장경아;배명진
    • The Journal of the Acoustical Society of Korea / v.21 no.2 / pp.173-181 / 2002
  • A serious problem in voice activity detection (VAD) is generally the detection of speech regions in noisy environments. This paper therefore proposes a new method that uses energy and LSP variation. In terms of processing time and speech quality, the proposed algorithm reduces processing time through accurate detection of inactive periods, and there is almost no difference in the subjective quality test. In terms of bit rate, measured by counting the frames with VAD=1, the results show a marked reduction in bit rate when the SNR of the noisy speech is low (about 5∼10 dB).

A Gain Control Algorithm of Low Computational Complexity based on Voice Activity Detection (음성 검출 기반의 저연산 이득 제어 알고리즘)

  • Kim, Sang-Kuyn;Cho, Woo-Hyeong;Jeong, Min-A;Kwon, Jang-Woo;Lee, Sangmin
    • The Journal of Korean Institute of Communications and Information Sciences / v.40 no.5 / pp.924-930 / 2015
  • In this paper, we propose a novel low-complexity approach to improve the speech quality of small acoustic equipment in noisy environments. The conventional gain control algorithm suppresses the noise of the input signal, after which the wide dynamic range compression (WDRC) stage amplifies the undesired signal. The proposed algorithm controls the gain of hearing aids according to the speech presence probability, using the output of a voice activity detector (VAD). The performance of the proposed scheme is evaluated under various noise conditions using objective measurements, and it yields superior results compared with the conventional algorithm.

Voice Activity Detection Based on SVM Classifier Using Likelihood Ratio Feature Vector (우도비 특징 벡터를 이용한 SVM 기반의 음성 검출기)

  • Jo, Q-Haing;Kang, Sang-Ki;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea / v.26 no.8 / pp.397-402 / 2007
  • In this paper, we apply a support vector machine (SVM) that incorporates an optimized nonlinear decision rule over different sets of feature vectors to improve the performance of statistical model-based voice activity detection (VAD). The conventional method performs VAD by setting up statistical models for the speech-absence and speech-presence hypotheses and comparing the geometric mean of the likelihood ratios (LRs) of the individual frequency bands extracted from the input signal with a given threshold. We propose a novel SVM-based VAD technique that treats the LRs computed in each frequency bin as the elements of a feature vector, minimizing the classification error probability instead of using the conventional geometric-mean decision rule. In experiments, the SVM-based VAD using the proposed feature showed better results than previously reported VADs in various noise environments.

Statistical Voice Activity Detector Based on Signal Subspace Model (신호 준공간 모델에 기반한 통계적 음성 검출기)

  • Ryu, Kwang-Chun;Kim, Dong-Kook
    • The Journal of the Acoustical Society of Korea / v.27 no.7 / pp.372-378 / 2008
  • Voice activity detectors (VADs) are important in wireless communication and speech signal processing. In conventional VAD methods, an expression for the likelihood ratio test (LRT) based on statistical models is derived in the discrete Fourier transform (DFT) domain. Speech or noise is then decided by comparing the value of the expression with a threshold. This paper presents a new statistical VAD method based on a signal subspace approach. Probabilistic principal component analysis (PPCA) is employed to obtain a signal subspace model that incorporates a probabilistic model of the noisy signal into the signal subspace method. The proposed approach provides a novel decision rule based on the LRT in the signal subspace domain. Experimental results show that the proposed signal subspace model-based VAD method outperforms those based on the widely used Gaussian distribution in the DFT domain.