Search | Korea Science

Voice Activity Detection Based on Discriminative Weight Training with Feedback (궤환구조를 가지는 변별적 가중치 학습에 기반한 음성검출기)

Kang, Sang-Ick;Chang, Joon-Hyuk
- The Journal of the Acoustical Society of Korea / v.27 no.8 / pp.443-449 / 2008
One of the key issues in practical speech processing is achieving voice activity detection (VAD) that is robust against background noise. Most statistical model-based approaches employ equally weighted likelihood ratios (LRs), which, however, deviate from real observations. Furthermore, voice activity in adjacent frames is strongly correlated; in other words, the current frame is highly correlated with the previous frame. In this paper, we propose an effective VAD approach based on the minimum classification error (MCE) method. It differs from previous work in that different weights are assigned to the likelihood ratio of the current frame and to the decision statistic of the previous frame.
https://doi.org/10.7776/ASK.2008.27.8.443
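
The feedback idea in the abstract above can be illustrated with a minimal sketch. It assumes the decision statistic is a linear combination of the current frame's log likelihood ratio and the previous frame's statistic; the abstract does not give the exact form. The function name, the weight values, and the threshold are hypothetical, and the MCE training of the weights is not reproduced.

```python
def vad_with_feedback(log_lrs, w_current=0.7, w_previous=0.3, threshold=0.0):
    """Return one speech/non-speech decision per frame.

    log_lrs: per-frame log likelihood ratios, assumed already computed.
    w_current, w_previous, threshold: hypothetical values for illustration;
    in the paper the weights are trained with MCE, which is not shown here.
    """
    decisions = []
    previous_statistic = 0.0  # assumed neutral value before the first frame
    for log_lr in log_lrs:
        # Weighted sum of the current LR and the fed-back previous statistic.
        statistic = w_current * log_lr + w_previous * previous_statistic
        decisions.append(statistic > threshold)
        previous_statistic = statistic
    return decisions
```

With these illustrative weights, a weakly negative frame that follows a strongly positive one is still labelled speech, which is the kind of inter-frame correlation the abstract refers to.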

Robust End Point Detection for Robot Speech Recognition Using Double Talk Detection (음성인식 로봇을 위한 동시통화검출 기반의 강인한 음성 끝점 검출)

Moon, Sung-Kyu;Park, Jin-Soo;Ko, Han-Seok
- The Journal of the Acoustical Society of Korea / v.31 no.3 / pp.161-169 / 2012
This paper presents a robust speech end-point detector that uses double-talk detection for a speech recognition robot operating under echoic conditions. The proposed method combines the result of a conventional end-point detector with that of a double-talk detector. We tested the proposed method in an isolated-word recognition system in an echoic environment. The proposed algorithm outperforms the available techniques by 30% in terms of speech recognition rate.
https://doi.org/10.7776/ASK.2012.31.3.161

VAD By Neural Network Under Wireless Communication Systems (Neural Network을 이용한 무선 통신시스템에서의 VAD)

Lee Hosun;Kim Sukyung;Park Sung-Kwon
- The Journal of Korean Institute of Communications and Information Sciences / v.30 no.12C / pp.1262-1267 / 2005
An elliptical basis function (EBF) neural network works stably under high-level background noise and makes nonlinear processing possible. It can be adapted to real-time VAD with a simple design. This paper introduces a VAD implementation using EBF, and the experimental results show that the EBF VAD outperforms G.729 Annex B and RBF neural networks. The best error rates achieved by the EBF networks improved by more than 70% in speech and 50% in silence relative to those achieved by G.729 Annex B and RBF networks, respectively.

Discriminative Weight Training for a Statistical Model-Based Voice Activity Detection (통계적 모델 기반의 음성 검출기를 위한 변별적 가중치 학습)

Kang, Sang-Ick;Jo, Q-Haing;Park, Seung-Seop;Chang, Joon-Hyuk
- The Journal of the Acoustical Society of Korea / v.26 no.5 / pp.194-198 / 2007
In this paper, we apply discriminative weight training to statistical model-based voice activity detection (VAD). In our approach, the VAD decision rule is expressed as the geometric mean of optimally weighted likelihood ratios (LRs) based on the minimum classification error (MCE) method. This differs from previous work in that a different weight is assigned to each frequency bin, which is considered more realistic. According to the experimental results, the proposed approach is effective for statistical model-based VAD using the LR test.
https://doi.org/10.7776/ASK.2007.26.5.194
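
A weighted geometric mean of per-bin likelihood ratios, as described in the abstract above, could be computed as in the following sketch. The function names and the default threshold are hypothetical, the weights are simply normalised to sum to 1 (an assumption of this sketch), and the MCE procedure that would produce the weights is not shown.

```python
import math


def weighted_geometric_mean_lr(lrs, weights):
    """Weighted geometric mean of per-bin likelihood ratios.

    Computed in the log domain as exp(sum_k w_k * log(LR_k)).
    The weights are normalised to sum to 1, an assumption of this sketch.
    """
    if len(lrs) != len(weights):
        raise ValueError("one weight is needed per frequency bin")
    total = sum(weights)
    log_mean = sum(w / total * math.log(lr) for w, lr in zip(weights, lrs))
    return math.exp(log_mean)


def is_speech(lrs, weights, threshold=1.0):
    """Declare speech when the weighted mean LR exceeds a hypothetical threshold."""
    return weighted_geometric_mean_lr(lrs, weights) > threshold
```

Setting all weights equal recovers the conventional, equally weighted rule, so the two can be compared on the same likelihood ratios.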

Boll's Spectral Subtraction Algorithm by New Voice Activity Detection (새로운 음성 활동 검출법에 의한 Boll의 스펙트럼 차감 알고리즘)

류종훈;김대경;박장식;손경식
- Journal of Korea Multimedia Society / v.4 no.1 / pp.46-55 / 2001
In this paper, a new voice activity detection method is proposed that estimates the SNR of speech enhanced with extended spectral subtraction (ESS). Voice activity detection is performed by placing a second Wiener filter after the Wiener filter used in ESS, in order to estimate the speech and noise power of the first Wiener filter's output signal. The proposed method has a low computational load and performs well under severe input SNR. Boll's spectral subtraction algorithm with the proposed voice activity detection was compared to ESS under several noise environments with different time-frequency distributions. During both speech and non-speech activity, Boll's spectral subtraction algorithm with the proposed voice activity detection performs better than ESS.
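
To show how a VAD decision feeds spectral subtraction, here is a simplified sketch of magnitude subtraction in which the noise estimate is updated only on frames the VAD marks as non-speech. It is not the paper's Wiener-filter-based detector, and it omits the magnitude averaging and residual-noise steps of Boll's full algorithm. The function name and the smoothing constant are hypothetical.

```python
def spectral_subtraction(frames, is_speech_flags, smoothing=0.9):
    """Subtract a running noise magnitude estimate from each frame.

    frames: magnitude spectra (lists of floats), one list per frame.
    is_speech_flags: VAD output per frame; noise is updated only when False.
    smoothing: hypothetical recursive-averaging constant for the noise estimate.
    """
    noise = None
    enhanced = []
    for magnitudes, is_speech in zip(frames, is_speech_flags):
        if not is_speech:
            if noise is None:
                noise = list(magnitudes)  # first noise-only frame seeds the estimate
            else:
                noise = [smoothing * n + (1.0 - smoothing) * m
                         for n, m in zip(noise, magnitudes)]
        estimate = noise if noise is not None else [0.0] * len(magnitudes)
        # Half-wave rectification: negative differences are clipped to zero.
        enhanced.append([max(m - n, 0.0) for m, n in zip(magnitudes, estimate)])
    return enhanced
```

A missed speech frame would corrupt the noise estimate, which is why the accuracy of the detector matters for the enhancement result.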

A New Statistical Voice Activity Detector Based on UMP Test (UMP 테스트에 근거한 새로운 통계적 음성검출기)

Jang, Keun-Won;Chang, Joon-Hyuk;Kim, Dong-Kook
- The Journal of the Acoustical Society of Korea / v.26 no.1 / pp.16-24 / 2007
Voice activity detectors (VADs) are important in wireless communication and speech signal processing. In conventional VAD methods, an expression for the likelihood ratio test (LRT) based on statistical models is derived. Speech or noise is then decided by comparing the value of the expression with a threshold. We propose a new method with a modified decision rule based on the Gaussian distribution and the uniformly most powerful (UMP) test. This method requires the distribution of the absolute value of the incoming speech signal. The final decision is then obtained through the relation between the Rayleigh distributions. This VAD method can detect speech without the a priori signal-to-noise ratio (SNR) that conventional VAD algorithms require. Additionally, in various VAD performance tests, the proposed VAD method is shown to be more effective than the traditional scheme.
https://doi.org/10.7776/ASK.2007.26.1.016
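
For context, the conventional LRT that the abstract above contrasts with can be sketched as follows. This assumes the widely used complex-Gaussian model, in which the per-bin likelihood ratio depends on the a posteriori SNR (gamma) and the a priori SNR (xi); the abstract does not state which form the authors used. The function names and threshold are hypothetical, and this is not the proposed UMP-based rule.

```python
import math


def log_lr_per_bin(gamma, xi):
    """Log likelihood ratio for one frequency bin under the Gaussian model.

    gamma: a posteriori SNR (noisy power divided by noise variance).
    xi: a priori SNR (speech variance divided by noise variance).
    """
    return gamma * xi / (1.0 + xi) - math.log(1.0 + xi)


def conventional_lrt_vad(gammas, xis, threshold=0.5):
    """Average the per-bin log LRs and compare with a hypothetical threshold."""
    log_lrs = [log_lr_per_bin(g, x) for g, x in zip(gammas, xis)]
    return sum(log_lrs) / len(log_lrs) > threshold
```

The dependence on `xi` is the a priori SNR requirement that, according to the abstract, the proposed method avoids.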

Speech Enhancement using RNN Phoneme based VAD (음소기반의 순환 신경망 음성 검출기를 이용한 음성 향상)

Lee, Kang;Kang, Sang-Ick;Kwon, Jang-woo;Lee, Samgmin
- Journal of the Institute of Electronics and Information Engineers / v.54 no.5 / pp.85-89 / 2017
In this paper, we apply high-performance hardware and a machine learning algorithm to build an advanced VAD algorithm for speech enhancement. Since speech is made of a series of phonemes, a recurrent neural network (RNN), which considers previous data, is a suitable way to build a speech model. It is impossible to learn every noise in the real world, so our algorithm is built on phoneme-based training. We detect voice-present frames in a noisy speech signal and enhance the speech signal. The phoneme-based RNN model shows improved performance on speech signals, which have high correlation between frames. To verify the performance of the proposed algorithm, we compare the VAD result with label data, and the speech enhancement result with a previous speech enhancement algorithm in various noise environments.
https://doi.org/10.5573/ieie.2017.54.5.85

Dimension Reduction Method of Feature Vector for Real-Time Adaptation of Voice Activity Detection (음성 구간 검출기의 실시간 적응화를 위한 특징 벡터의 차원 축소 방법)

Kim Pyoung-Hwan;Han Hag-Yong;Kim Chang-Keun;Koh Si-Young;Hur Kang-In
- Proceedings of the Acoustical Society of Korea Conference / spring / pp.53-56 / 2004
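This paper studies voice activity detection through dimension reduction of feature vectors in noisy environments. Speech/non-speech classification uses a classification-based method with statistical models. For real-time adaptation of the detector, we propose a dimension reduction method for likelihood-based feature vectors. The method takes as new features the nonlinear likelihood values given by the Gaussian probability density functions of the speech and non-speech classes. The speech/non-speech decision uses the likelihood ratio test (LRT), and performance is compared with the reduction obtained by linear discriminant analysis (LDA). Experimental results confirm that the feature vector reduced to two dimensions by the proposed method gives results comparable to those in the high-dimensional space.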
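
The likelihood-based reduction described in the abstract above might look like the following sketch, which maps a feature vector to two values, one per class. It assumes a single diagonal-covariance Gaussian per class and uses log likelihoods for numerical convenience; neither choice is confirmed by the abstract. All function names and the threshold are hypothetical.

```python
import math


def diag_gaussian_log_pdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def reduce_to_likelihoods(x, speech_model, nonspeech_model):
    """Map a D-dimensional feature vector to a 2-D likelihood feature.

    Each model is a (mean, variance) pair of equal-length lists.
    """
    return (
        diag_gaussian_log_pdf(x, *speech_model),
        diag_gaussian_log_pdf(x, *nonspeech_model),
    )


def lrt_decision(reduced, threshold=0.0):
    """Likelihood ratio test on the 2-D feature, in the log domain."""
    log_speech, log_nonspeech = reduced
    return log_speech - log_nonspeech > threshold
```

Whatever the input dimension, the reduced feature always has two components, which is what keeps the per-frame cost low for real-time adaptation.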

Voice Activity Detection Based on Non-negative Matrix Factorization (비음수 행렬 인수분해 기반의 음성검출 알고리즘)

Kang, Sang-Ick;Chang, Joon-Hyuk
- The Journal of Korean Institute of Communications and Information Sciences / v.35 no.8C / pp.661-666 / 2010
In this paper, we apply a likelihood ratio test (LRT) to non-negative matrix factorization (NMF) based voice activity detection (VAD) to find the optimal threshold. In our approach, the NMF-based VAD is expressed as the Euclidean distance between the noise basis vector and the input basis vector, both extracted through NMF. The optimal threshold for each noise environment depends on the distribution of NMF results in the noise region, which is estimated by a statistical model-based VAD. According to the experimental results, the proposed approach is effective for statistical model-based VAD using the LRT.
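
The distance-based decision in the abstract above reduces to a short comparison once the basis vectors are available. The sketch below assumes the NMF step has already produced both vectors and takes the threshold as a given parameter; the LRT-based threshold selection from the paper is not reproduced. The function names are hypothetical.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nmf_distance_vad(input_basis, noise_basis, threshold):
    """Flag speech when the input basis lies far from the noise basis.

    input_basis, noise_basis: basis vectors assumed to come from NMF.
    threshold: environment-dependent value, supplied by the caller here.
    """
    return euclidean_distance(input_basis, noise_basis) > threshold
```

Because the threshold is passed in, the same function can be reused across noise environments with a different value for each.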

A Study on a Robust Voice Activity Detector Under the Noise Environment in the G.723.1 Vocoder (G.723.1 보코더에서 잡음환경에 강인한 음성활동구간 검출기에 관한 연구)

이희원;장경아;배명진
- The Journal of the Acoustical Society of Korea / v.21 no.2 / pp.173-181 / 2002
In general, one of the serious problems in voice activity detection (VAD) is detecting speech regions in a noisy environment. This paper therefore proposes a new method using energy and LSP variation. In terms of processing time and speech quality, the processing time of the proposed algorithm is reduced owing to accurate detection of inactive periods, and there is almost no difference in the subjective quality test. In terms of bit rate, the proposed algorithm counts the number of VAD=1 decisions, and the result shows a marked reduction in bit rate when the SNR of the noisy speech is low (about 5 to 10 dB).
