Search | Korea Science

Robust Entropy Based Voice Activity Detection Using Parameter Reconstruction in Noisy Environment

Han, Hag-Yong;Lee, Kwang-Seok;Koh, Si-Young;Hur, Kang-In
Journal of information and communication convergence engineering, v.1 no.4, pp.205-208, 2003
Voice activity detection is an important problem in speech recognition and speech communication. This paper introduces a new feature parameter, reconstructed from the spectral entropy of information theory, for robust voice activity detection in noisy environments, then analyzes it and compares its performance with the energy method of voice activity detection. In experiments, we confirmed that spectral entropy and its reconstructed parameter are superior to the energy method for robust voice activity detection in various noise environments.
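As a rough illustration of the spectral entropy feature named in this abstract, the sketch below computes the Shannon entropy of a frame's normalized power spectrum and flags low-entropy frames as speech. It is not the authors' reconstructed parameter, which the abstract does not specify. The function names, the frame length, the threshold, and the use of a naive DFT are all assumptions made for the example.

```python
import cmath
import math
import random


def power_spectrum(frame):
    """Naive DFT power spectrum (bins 0..N/2) of a real-valued frame."""
    n = len(frame)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def spectral_entropy(frame):
    """Shannon entropy (in bits) of the normalized power spectrum of one frame."""
    power = power_spectrum(frame)
    total = sum(power)
    if total == 0.0:
        # Assumption: treat an all-zero frame as maximally flat (non-speech).
        return math.log2(len(power))
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def detect_voice(frames, threshold):
    """Flag frames whose entropy is below a hypothetical, hand-picked threshold."""
    return [spectral_entropy(frame) < threshold for frame in frames]


# Toy input: white noise (flat spectrum) versus a pure tone standing in for a voiced frame.
random.seed(0)
frame_len = 64
noise_frame = [random.gauss(0.0, 1.0) for _ in range(frame_len)]
tone_frame = [math.sin(2 * math.pi * 4 * t / frame_len) for t in range(frame_len)]
print(detect_voice([noise_frame, tone_frame], threshold=2.0))
```

The idea is that noise spreads power across many bins and so has high entropy, while voiced speech concentrates power in a few harmonics and has low entropy. A practical detector would smooth the entropy over time and adapt the threshold to the noise level.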

Voice Activity Detection Method Using Psycho-Acoustic Model Based on Speech Energy Maximization in Noisy Environments (잡음 환경에서 심리음향모델 기반 음성 에너지 최대화를 이용한 음성 검출 방법)

Choi, Gab-Keun;Kim, Soon-Hyob
The Journal of the Acoustical Society of Korea, v.28 no.5, pp.447-453, 2009
This paper introduces a method for detecting voice and its exact end point at low SNR by maximizing voice energy. Conventional VAD (Voice Activity Detection) algorithms estimate the noise level, so they tend to detect the end point inaccurately. Moreover, because they use a relatively long analysis range to reflect temporal changes in the noise, their computing load is too high for practical application. This paper introduces the SEM-VAD (Speech Energy Maximization-Voice Activity Detection) method, which uses psycho-acoustic Bark-scale filter banks to maximize voice energy within frames. Stable threshold values are obtained in various noise environments (SNR 15 dB, 10 dB, 5 dB, 0 dB). In a voice detection test in a noisy car environment, the PHR (Pause Hit Rate) was 100% at every noise level, and the FAR (False Alarm Rate) was 0% at SNR 15 dB and 10 dB, 5.6% at SNR 5 dB, and 9.5% at SNR 0 dB.
https://doi.org/10.7776/ASK.2009.28.5.447

Acoustic screening test for laryngeal cancer (음성을 이용한 후두암의 집단선별검사)

박헌수
Korean Journal of Bronchoesophagology, v.7 no.2, pp.161-167, 2001
Background and Objectives: Total laryngectomy is often required for advanced cases of laryngeal cancer, but this operation causes many inconveniences in basic daily life. Early diagnosis of laryngeal cancer is very important to prevent this disastrous condition. From this point of view, a mass screening test for early detection of laryngeal cancer is necessary. A screening test using voice has many advantages, such as being simple and less invasive. Voice collection by an Automatic Response System (ARS) is a comfortable and easy way to obtain acoustic samples. The author therefore tried to obtain acoustic parameters that can differentiate normal, benign, and malignant laryngeal conditions, and also checked the usefulness of these parameters in a neural network system. Materials and Methods: The author evaluated the voices of 17 laryngeal cancer patients and 45 benign laryngeal disease patients who visited the Department of Otolaryngology, Pusan National University Hospital, from May 1998 to April 2001, and of 15 normal controls. From thirty-three parameters analysed by the Multi-Dimensional Voice Program (MDVP), the author chose six parameters (Jitt, vFo, Shim, vAm, NHR, SPI) thought to be relevant to voice collected by ARS. A two-step neural network was used to test the usefulness of the six parameters. Results: The detection rate of normal voice by ARS voice analysis was 78.5%, and the detection rate of abnormal voice was 97.1%. Among abnormal voices, the detection rates of benign laryngeal diseases and laryngeal cancers were 82.4% and 70.6%, respectively. Conclusion: The author concluded that the six parameters and Matlab-based neural network software may be effective in developing an acoustic screening system for laryngeal cancer, and that further study is necessary to develop new acoustic parameters.

Dynamic code allocation using voice activity detection in DS-CDMA cellular system (DS-CDMA 셀룰러 시스템에서의 음성검출을 사용한 동적코드할당방식)

유명수;양영님;고종하;이정규
The Journal of Korean Institute of Communications and Information Sciences, v.22 no.6, pp.1302-1310, 1997
In this paper, we propose a dynamic code allocation strategy using voice activity detection and evaluate its performance in a DS-CDMA system. The proposed method allocates codes to mobile terminals according to the residual capacity computed from the SIR at the base station. In a cell with hot-spot traffic loading, we find that the proposed method performs better than a fixed code assignment strategy using voice activity detection. We also find that the proposed method provides a large improvement in blocking probability over the dynamic code assignment strategy without voice activity detection.

Reconstruction Effect of the Spectral Entropy for the Voice Activity Detection (음성 활동 구간 검출을 위한 스펙트랄 엔트로피의 재구성 효과)

Kwon Ho-Min;Han Hag-Yong;Lee Kwang-Seok;Koh Si-Young;Hur Kang-In
Proceedings of the Acoustical Society of Korea Conference, spring, pp.25-28, 2002
Voice activity detection is an important problem in speech recognition and communication. This paper introduces a feature parameter, reconstructed from the spectral entropy of information theory, for robust voice activity detection in noisy environments, then analyzes it and compares its performance with the energy method of voice activity detection. In experiments, we confirmed that the spectral entropy is a better feature parameter than the energy method for robust voice activity detection in various noise environments.

Boll's Spectral Subtraction Algorithm by New Voice Activity Detection (새로운 음성 활동 검출법에 의한 Boll의 스펙트럼 차감 알고리즘)

류종훈;김대경;박장식;손경식
Journal of Korea Multimedia Society, v.4 no.1, pp.46-55, 2001
In this paper, a new voice activity detection method that estimates the SNR of speech enhanced with extended spectral subtraction (ESS) is proposed. Voice activity detection is performed by placing a second Wiener filter after the Wiener filter used in the ESS, to estimate the speech and noise power of the first Wiener filter's output signal. The proposed voice activity detection method has a low computational load and performs well under severe input SNR. Boll's spectral subtraction algorithm with the proposed voice activity detection was compared to ESS under several noise environments with different time-frequency distributions. During both speech and non-speech activity, the performance of Boll's spectral subtraction algorithm with the proposed voice activity detection is superior to that of ESS.
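To make the role of voice activity detection in spectral subtraction concrete, here is a minimal sketch of the basic magnitude-subtraction step. The noise magnitude is averaged over frames that a VAD has labelled as non-speech, subtracted from each bin of the noisy frame, clipped at zero, and recombined with the noisy phase. It leaves out the other stages of Boll's algorithm, and it shows neither the ESS nor the Wiener-filter VAD proposed in the paper. All function names and the naive DFT are assumptions made for the example.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
            for t in range(n)]


def estimate_noise_magnitude(noise_frames):
    """Average magnitude spectrum over frames labelled non-speech by a VAD."""
    spectra = [[abs(value) for value in dft(frame)] for frame in noise_frames]
    return [sum(column) / len(spectra) for column in zip(*spectra)]


def spectral_subtract(noisy_frame, noise_magnitude):
    """Subtract the noise magnitude per bin, clip at zero, keep the noisy phase."""
    cleaned = []
    for value, noise_mag in zip(dft(noisy_frame), noise_magnitude):
        magnitude = max(abs(value) - noise_mag, 0.0)  # half-wave rectification
        cleaned.append(cmath.rect(magnitude, cmath.phase(value)))
    return idft(cleaned)


# Toy input: a tone with a constant offset standing in for stationary noise.
n = 32
tone = [math.sin(2 * math.pi * 3 * t / n) for t in range(n)]
noise_estimate = estimate_noise_magnitude([[0.5] * n, [0.5] * n])
restored = spectral_subtract([sample + 0.5 for sample in tone], noise_estimate)
print(max(abs(a - b) for a, b in zip(restored, tone)))
```

The quality of the noise estimate depends directly on which frames the VAD marks as non-speech, which is why the detector's accuracy at low SNR matters for the enhancement result.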

Voice Activity Detection Based on Entropy in Noisy Car Environment (차량 잡음 환경에서 엔트로피 기반의 음성 구간 검출)

Roh, Yong-Wan;Lee, Kue-Bum;Lee, Woo-Seok;Hong, Kwang-Seok
Journal of the Institute of Convergence Signal Processing, v.9 no.2, pp.121-128, 2008
Accurate voice activity detection has a great impact on the performance of speech applications, including speech recognition, speech coding, and speech communication. In this paper, we propose voice activity detection methods that can adapt to various car noise situations during driving. Existing voice activity detection has used methods such as time energy, frequency energy, zero crossing rate, and spectral entropy, which share the weakness that performance declines rapidly in noisy environments. Building on the existing spectral entropy approach to VAD, we propose voice activity detection methods using MFB (Mel-frequency filter bank) spectral entropy, gradient FFT (Fast Fourier Transform) spectral entropy, and gradient MFB spectral entropy. The MFB is the FFT spectrum weighted by the Mel scale, a nonlinear scale that reflects the characteristics of human sound perception. The proposed MFB spectral entropy method clearly improves the ability to discriminate between speech and non-speech in various noisy car environments, achieving 93.21% accuracy in experiments. Compared to the spectral entropy method, the proposed voice activity detection gives an average improvement in the correct detection rate of more than 3.2%.
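One possible reading of "MFB spectral entropy" is sketched below. A power spectrum is passed through triangular filters spaced evenly on the Mel scale, and the entropy is taken over the resulting band energies instead of over raw FFT bins. The abstract does not give the filter shapes, the number of filters, or the Mel formula. The common 2595 and 700 constants, the triangular filters, the sample rate, and every function name here are assumptions.

```python
import math


def hz_to_mel(freq_hz):
    """Common Mel-scale approximation (an assumed choice of formula)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_energies(power, sample_rate, num_filters):
    """Weight a power spectrum (bins 0..N/2, at least two bins) by triangular Mel filters."""
    nyquist = sample_rate / 2.0
    top_mel = hz_to_mel(nyquist)
    # Filter edges are equally spaced on the Mel scale, so unevenly spaced in Hz.
    edges_hz = [mel_to_hz(top_mel * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bin_freqs = [nyquist * k / (len(power) - 1) for k in range(len(power))]
    energies = []
    for m in range(1, num_filters + 1):
        low, center, high = edges_hz[m - 1], edges_hz[m], edges_hz[m + 1]
        energy = 0.0
        for freq, value in zip(bin_freqs, power):
            if low < freq <= center:
                energy += value * (freq - low) / (center - low)    # rising slope
            elif center < freq < high:
                energy += value * (high - freq) / (high - center)  # falling slope
        energies.append(energy)
    return energies


def band_entropy(energies):
    """Shannon entropy (in bits) of normalized band energies."""
    total = sum(energies)
    if total == 0.0:
        return math.log2(len(energies))  # assumption: empty input counts as flat
    return -sum((e / total) * math.log2(e / total) for e in energies if e > 0.0)


# Toy input: a flat (noise-like) spectrum versus a single spectral peak.
flat_power = [1.0] * 129
peaked_power = [0.0] * 129
peaked_power[20] = 1.0
for spectrum in (flat_power, peaked_power):
    print(band_entropy(mel_band_energies(spectrum, sample_rate=8000, num_filters=8)))
```

Because the Mel filters are narrow at low frequencies and wide at high frequencies, the band energies emphasize the region where speech carries most of its structure.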

Voice Activity Detection Algorithm using Wavelet Band Entropy Ensemble Analysis in Car Noisy Environments (자동차 잡음 환경에서 웨이브렛 밴드 엔트로피 앙상블 분석을 이용한 음성구간 검출 알고리즘)

Lee, G.H.;Lee, Y.J.;Kim, M.N.
Journal of Korea Multimedia Society, v.16 no.9, pp.1005-1017, 2013
Voice activity detection is a very important process that separates voice activity from a noisy speech signal for speech enhancement. Over the past few years, many studies have been made on voice activity detection, but it still performs poorly in low signal-to-noise ratio environments or with irregular noise such as car noise. This paper proposes a new voice activity detection algorithm using ensemble variance based on wavelet band entropy and a soft thresholding method. We ran experiments in car noise at a range of signal-to-noise ratios to evaluate the proposed algorithm and confirmed its performance.
https://doi.org/10.9717/kmms.2013.16.9.1005

Implement PAMD for discriminate human and ARS (수화자(受話者) 구별을 위한 PAMD 구현)

서봉수
Proceedings of the IEEK Conference, 2003.11a, pp.61-64, 2003
In this paper, we implement PAMD (Positive Answering Machine Detection) to discriminate between a human and an ARS. We use grunt detection, glitch noise detection, and tone detection for PAMD. These distinguish voice signals from ring-back tone and glitch noise, respectively. As a second step, the system judges whether the response comes from a human or an ARS by integrating pattern changes such as the initial response period, the number of voice data, the duration of each voice data period, and glitch noise. The accuracy is about 9375 in ASR and about 98% in Mobile phone.

Development of Voice Activity Detection Algorithm for Elderly Voice based on the Higher Order Differential Energy Operator (고차 미분에너지 기반 노인 음성에서의 음성 구간 검출 알고리즘 연구)

Lee, JiYeoun
Journal of Digital Convergence, v.14 no.11, pp.249-255, 2016
Since elderly voices include a lot of noise caused by physiological changes in respiration, phonation, and resonance, the performance of convergence health-care equipment driven by elderly voice, such as speech recognition, synthesis, and analysis programs, deteriorates. It is therefore necessary to pursue research on operating health-care instruments with elderly voices. In this study, a voice activity detection method using a symmetric higher-order differential energy operator (SHODEO) was developed and compared with the auto-correlation function (ACF) and the average magnitude difference function (AMDF). It was confirmed to perform better than the other methods in voice interval detection. The voice activity detection will be applied to a voice interface for the elderly to improve the accessibility of smart devices.
https://doi.org/10.14400/JDC.2016.14.11.249
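For readers unfamiliar with the two baseline functions named in this abstract, the sketch below gives the textbook definitions of the auto-correlation function (ACF) and the average magnitude difference function (AMDF) for a single frame. It also includes a hypothetical helper that picks the lag with the smallest AMDF. It does not implement SHODEO, which the abstract does not define. The helper name and the lag range are assumptions.

```python
import math


def autocorrelation(frame, lag):
    """Short-time auto-correlation of a frame at the given lag."""
    return sum(frame[t] * frame[t + lag] for t in range(len(frame) - lag))


def amdf(frame, lag):
    """Average magnitude difference between the frame and its lagged copy."""
    count = len(frame) - lag
    return sum(abs(frame[t] - frame[t + lag]) for t in range(count)) / count


def estimate_period(frame, min_lag, max_lag):
    """Hypothetical helper: the lag in [min_lag, max_lag] with the smallest AMDF."""
    return min(range(min_lag, max_lag + 1), key=lambda lag: amdf(frame, lag))


# Toy input: a sinusoid with a period of 8 samples standing in for a voiced frame.
voiced = [math.sin(2 * math.pi * t / 8) for t in range(64)]
print(estimate_period(voiced, 2, 12), autocorrelation(voiced, 8), amdf(voiced, 8))
```

A periodic (voiced) frame produces a strong ACF peak and a deep AMDF valley at its pitch period, whereas noise-like frames do not. This is what makes both functions usable as voicing cues.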
