Search | Korea Science

Applying the Bi-level HMM for Robust Voice-activity Detection

Hwang, Yongwon;Jeong, Mun-Ho;Oh, Sang-Rok;Kim, Il-Hwan
- Journal of Electrical Engineering and Technology / v.12 no.1 / pp.373-377 / 2017
This paper presents a voice-activity detection (VAD) method for sound sequences with various SNRs. In real-time VAD applications, post-processing the VAD output decision to remove burst clippings is not practical. To tackle this problem, we build on the bi-level hidden Markov model, in which a state layer is inserted into a typical hidden Markov model (HMM), and formulate a robust VAD method that requires no additional post-processing. The method uses a forward-inference-ratio test to detect speech endpoints and Mel-frequency cepstral coefficients (MFCC) as features. Our experimental results show that the proposed approach outperforms conventional methods across different SNRs.
https://doi.org/10.5370/JEET.2017.12.1.373

Robust Feature Extraction for Voice Activity Detection in Nonstationary Noisy Environments (음성구간검출을 위한 비정상성 잡음에 강인한 특징 추출)

Hong, Jungpyo;Park, Sangjun;Jeong, Sangbae;Hahn, Minsoo
- Phonetics and Speech Sciences / v.5 no.1 / pp.11-16 / 2013
This paper proposes robust feature extraction for accurate voice activity detection (VAD). VAD is one of the principal modules in speech signal processing applications such as speech coding, speech enhancement, and speech recognition. Noisy environments contain nonstationary noises that cause VAD accuracy to decline drastically, because the fluctuation of features in the noise intervals increases the false alarm rate. To improve VAD performance, this paper proposes harmonic-weighted energy. This feature extraction method focuses on voiced speech intervals and uses weighted harmonic-to-noise ratios to determine how much harmonicity contributes to the frame energy. For performance evaluation, receiver operating characteristic curves and the equal error rate are measured.
https://doi.org/10.13064/KSSS.2013.5.1.011

Research of Improving the Performance of Voice Activity Detector in Vocoder (음성부호화기에서의 VAD 성능 향상 연구)

Min, So-Yeon;Lee, Kwang-Hyoung;Bae, Myung-Jin
- Proceedings of the KAIS Fall Conference / 2007.11a / pp.194-197 / 2007
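The G.723.1 speech codec, developed by the ITU-T international standardization body for Internet telephony and videoconferencing, uses a VAD (Voice Activity Detector) and a CNG (Comfort Noise Generator) to lower the bit rate during noise intervals. Of these, the VAD ultimately decides whether speech is active by comparing the energy level of the current frame. However, to keep its decisions stable, the G.723.1 VAD classifies almost all of the silence intervals inserted between speech-active segments as speech-active. This paper therefore proposes a method that classifies silence intervals more accurately and so reduces the bit rate further than the existing method. In experiments with sentences whose silence intervals were lengthened, the bit rate was reduced by about 47%, and an MOS test showed almost no degradation in speech quality.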

An Study on Development of Water Systems Damage Management Standard Caused by Mt. Baekdu Eruption (백두산 분화로 인한 상수도 시설 피해 관리 기준 설정 연구)

Choi, Jung-Ryel;Kim, Min Gyu;Lee, Gyeng-Bin;Chung, Il-Moon
- The Journal of Engineering Geology / v.28 no.2 / pp.259-266 / 2018
The purpose of this study is to establish management standards for water systems in Korea. The damage factors of the water systems were classified as accumulation, adsorption, and abrasion. According to the thickness of volcanic ash, the management stage of the water systems was derived in four steps: VAD (Volcanic Ash Degree) I (0–1 mm), II (1–3 mm), III (3–5 mm), and IV (over 5 mm). Finally, management standards for water systems are presented, consisting of alarm levels, impacts of volcanic ash, and procedures and an action plan to deal with the damage.
https://doi.org/10.9720/kseg.2018.2.259
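
The four stages listed in this abstract can be read as a simple lookup on ash thickness. The sketch below is only an illustration of that reading, not the study's procedure: the function name `volcanic_ash_degree` is hypothetical, and the handling of the exact boundary values (1, 3, and 5 mm are assigned to the lower stage) is an assumption, since the abstract gives overlapping range endpoints.

```python
def volcanic_ash_degree(thickness_mm):
    """Map volcanic ash thickness in millimetres to a VAD stage (I to IV).

    Boundary handling is an assumption: a thickness exactly on a boundary
    (1, 3 or 5 mm) is placed in the lower stage, because stage IV is
    described as "over 5 mm".
    """
    if thickness_mm < 0:
        raise ValueError("ash thickness cannot be negative")
    if thickness_mm <= 1:
        return "I"
    if thickness_mm <= 3:
        return "II"
    if thickness_mm <= 5:
        return "III"
    return "IV"


# Example thicknesses (made-up values, in mm) and their stages.
stages = [volcanic_ash_degree(t) for t in (0.5, 2.0, 5.0, 5.1)]
```

With these assumptions, the four example thicknesses fall into stages I, II, III, and IV respectively.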

Blood Flow and Pressure Evaluation for a Pulsatile Conduit-Shaped Ventricular Assist Device with Structural Characteristic of Conduit Shape (관형의 구조적 특징을 갖춘 박동형 관형 심실보조장치의 혈류, 혈압 평가)

Kang, Seong-Min;Choi, Seong-Wook
- Transactions of the Korean Society of Mechanical Engineers B / v.35 no.11 / pp.1191-1198 / 2011
The use of a ventricular assist device (VAD) can raise the one-year survival rate without cardiac transplantation from 25% to 52%. However, malfunction of the VAD system causes 6% of VAD patients' deaths, which could possibly be avoided through the development of new VADs in which VAD malfunctions do not affect the patient's heart movement or hemodynamic state. A conventional VAD has an impeller or vane for propelling blood that can allow blood to regurgitate when the propelling force is weaker than the aortic pressure. In this paper, we developed a new pulsatile conduit-shaped VAD that has two valves. This device removes the possibility of blood regurgitation and has a small stationary area even when the pumping force is extremely weak. We estimated the characteristics of the device by measuring the outflow and the pressure of the pump in in-vitro and in-vivo experiments.
https://doi.org/10.3795/KSME-B.2011.35.11.1191

Acquisition Rate and Accuracy According to Wind Vector Calculation Method of Remote Sensing (원격탐사의 바람벡터 산출 방법에 따른 자료 수집률과 정확도)

Yu-Jin Kim;Byung Hyuk Kwon
- The Journal of the Korea institute of electronic communication sciences / v.18 no.5 / pp.965-970 / 2023
Wind profilers and wind lidars produce vertical profiles of wind at high spatiotemporal resolution in the atmospheric boundary layer. The wind lidar derives the wind vector using the DBS (Doppler Beam Swinging) and VAD (Velocity Azimuth Display) methods. The DBS method has the advantage of obtaining a wind profile with a fast scan time. On the other hand, it requires at least two beams including the vertical beam, which decreases the data acquisition rate. The VAD method was improved to produce more wind vectors of the wind profiler as well as the wind lidar, which generally uses 5 beams. A Fourier series was estimated from the radial velocities of the DBS method, and the wind vector was determined by setting the azimuth interval and applying the radial velocities given by the Fourier series to the VAD method. Wind vectors were retrieved at altitudes where the DBS method did not yield a wind, and the results of the two methods were consistent.
https://doi.org/10.13067/JKIECS.2023.18.5.965
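
To make the Velocity Azimuth Display idea in this abstract concrete, the sketch below fits a first-order Fourier series to radial velocities sampled around the azimuth circle and reads the wind components off the coefficients. It is a textbook-style illustration, not the authors' improved method. The function name `vad_wind_fit` is hypothetical, and the geometry is assumed: azimuth measured clockwise from north, a single fixed elevation angle, azimuths evenly spaced over the full circle (at least three), and radial velocity positive away from the instrument.

```python
import math


def vad_wind_fit(azimuths_deg, radial_velocities, elevation_deg):
    """Estimate (u, v, w) from radial velocities on an azimuth circle.

    Assumed model: Vr(az) = w*sin(el) + u*cos(el)*sin(az) + v*cos(el)*cos(az),
    i.e. Vr(az) = a0 + a1*sin(az) + b1*cos(az).
    For evenly spaced azimuths the Fourier coefficients reduce to plain sums.
    """
    n = len(azimuths_deg)
    az = [math.radians(a) for a in azimuths_deg]
    a0 = sum(radial_velocities) / n
    a1 = 2.0 / n * sum(vr * math.sin(a) for vr, a in zip(radial_velocities, az))
    b1 = 2.0 / n * sum(vr * math.cos(a) for vr, a in zip(radial_velocities, az))
    el = math.radians(elevation_deg)
    u = a1 / math.cos(el)  # eastward component
    v = b1 / math.cos(el)  # northward component
    w = a0 / math.sin(el)  # vertical component
    return u, v, w


# Synthetic check: build radial velocities from a known wind, then recover it.
elevation = 75.0
true_u, true_v, true_w = 5.0, -3.0, 0.2
azimuths = [30.0 * k for k in range(12)]
el_rad = math.radians(elevation)
observed = [
    true_w * math.sin(el_rad)
    + math.cos(el_rad)
    * (true_u * math.sin(math.radians(a)) + true_v * math.cos(math.radians(a)))
    for a in azimuths
]
u, v, w = vad_wind_fit(azimuths, observed, elevation)
```

On noise-free synthetic data the fit recovers the input wind to within floating-point error. With unevenly spaced or missing azimuths, a general least-squares fit would be needed instead of the plain sums used here.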

Statistical Model-Based Voice Activity Detection Using Spatial Cues for Dual-Channel Noisy Speech Recognition (이중채널 잡음음성인식을 위한 공간정보를 이용한 통계모델 기반 음성구간 검출)

Shin, Min-Hwa;Park, Ji-Hun;Kim, Hong-Kook;Lee, Yeon-Woo;Lee, Seong-Ro
- Phonetics and Speech Sciences / v.2 no.3 / pp.141-148 / 2010
This paper proposes a voice activity detection (VAD) method for dual-channel noisy speech recognition that employs spatial cues. In the proposed method, a probability model for speech presence/absence is constructed using spatial cues obtained from the dual-channel input signal, and speech activity intervals are detected through this probability model. In particular, the spatial cues are composed of interaural time differences and interaural level differences of the dual-channel speech signals, and the probability model for speech presence/absence is based on a Gaussian kernel density. To evaluate the performance of the proposed VAD method, speech recognition is performed on speech segments that include only the speech intervals detected by the proposed VAD method. The performance of the proposed method is compared with those of several methods: an SNR-based method, a direction of arrival (DOA) based method, and a phase vector based method. The speech recognition experiments show that the proposed method outperforms the conventional methods, providing relative word error rate reductions of 11.68%, 41.92%, and 10.15% compared with the SNR-based, DOA-based, and phase vector based methods, respectively.

A Weighted Feature Voting Approach for Robust and Real-Time Voice Activity Detection

Moattar, Mohammad Hossein;Homayounpour, Mohammad Mehdi
- ETRI Journal / v.33 no.1 / pp.99-109 / 2011
This paper concerns a robust real-time voice activity detection (VAD) approach which is easy to understand and implement. The proposed approach employs several short-term speech/nonspeech discriminating features in a voting paradigm to achieve reliable performance in different environments. This paper mainly focuses on improving the performance of a recently proposed approach which uses spectral peak valley difference (SPVD) as a feature for silence detection. The main aim of this paper is to apply a set of features together with SPVD to improve VAD robustness. The proposed approach uses a weighted voting scheme in order to take the discriminative power of the employed feature set into account. The experiments show that the proposed approach is more robust than the baseline approach from different points of view, including channel distortion and threshold selection. The proposed approach is also compared with some other VAD techniques to further confirm its achievements. Using the proposed weighted voting approach, the average VAD performance is increased to 89.29% for 5 different noise types and 8 SNR levels. The resulting performance is 13.79% higher than the approach based only on SPVD and 2.25% higher than the unweighted voting scheme.
https://doi.org/10.4218/etrij.11.1510.0158
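
The weighted voting scheme described in this abstract can be pictured with a small sketch: each feature casts a speech vote when it exceeds its threshold, and the votes are combined according to per-feature weights. This is a generic illustration under stated assumptions, not the paper's algorithm. The function name `weighted_vote_vad`, the feature names other than SPVD, and all thresholds, weights, and the 0.5 decision ratio are hypothetical.

```python
def weighted_vote_vad(features, thresholds, weights, decision_ratio=0.5):
    """Return True if the weighted share of speech votes exceeds decision_ratio.

    features, thresholds and weights are dicts keyed by feature name.
    A feature votes "speech" when its value is above its threshold.
    """
    total_weight = sum(weights[name] for name in features)
    speech_weight = sum(
        weights[name]
        for name, value in features.items()
        if value > thresholds[name]
    )
    return speech_weight / total_weight > decision_ratio


# Hypothetical per-feature thresholds and weights (illustrative values only).
thresholds = {"spvd": 1.0, "energy": 0.5, "flatness": 0.3}
weights = {"spvd": 3.0, "energy": 2.0, "flatness": 1.0}

# Frame A: SPVD and energy vote speech (weighted share 5/6).
frame_a = {"spvd": 1.4, "energy": 0.9, "flatness": 0.1}
# Frame B: only the low-weight flatness feature votes speech (share 1/6).
frame_b = {"spvd": 0.2, "energy": 0.1, "flatness": 0.4}

is_speech_a = weighted_vote_vad(frame_a, thresholds, weights)
is_speech_b = weighted_vote_vad(frame_b, thresholds, weights)
```

Setting all weights equal reduces this to a plain majority vote, which corresponds to the unweighted scheme the abstract uses as a comparison point.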

A Statistical Model-Based Voice Activity Detection Employing the Conditional MAP Criterion with Spectral Deviation (조건 사후 최대 확률과 음성 스펙트럼 변이 조건을 이용한 통계적 모델 기반의 음성 검출기)

Kim, Sang-Kyun;Chang, Joon-Hyuk
- The Journal of the Acoustical Society of Korea / v.30 no.6 / pp.324-329 / 2011
In this paper, we propose a novel approach to improve the performance of statistical model-based voice activity detection (VAD), based on the conditional maximum a posteriori (CMAP) criterion with spectral deviation. In our approach, the VAD decision rule is expressed as the geometric mean of likelihood ratios (LRs), with a threshold adapted according to the speech presence probability conditioned on both the speech activity decision and the spectral deviation in the previous frame. Experimental results show that the proposed approach yields better results than the CMAP-based VAD using the LR test.
https://doi.org/10.7776/ASK.2011.30.6.324
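
The decision rule mentioned in this abstract, a geometric mean of per-bin likelihood ratios compared against a threshold that depends on the previous frame, can be sketched as follows. This is a simplified illustration and not the authors' detector: the function names and threshold values are hypothetical, the per-bin LRs are taken as given inputs, and the additional conditioning on spectral deviation is omitted.

```python
import math


def log_geometric_mean(likelihood_ratios):
    """Log of the geometric mean of per-frequency-bin likelihood ratios."""
    return sum(math.log(lr) for lr in likelihood_ratios) / len(likelihood_ratios)


def conditional_decision(
    likelihood_ratios,
    previous_was_speech,
    log_threshold_after_speech=-0.5,  # hypothetical value
    log_threshold_after_noise=0.5,    # hypothetical value
):
    """Decide speech/non-speech with a threshold chosen by the previous decision.

    Using a lower threshold after a speech frame makes it easier to stay in
    the speech state, which is the intuition behind conditioning on the
    previous frame.
    """
    if previous_was_speech:
        threshold = log_threshold_after_speech
    else:
        threshold = log_threshold_after_noise
    return log_geometric_mean(likelihood_ratios) > threshold


# Made-up per-bin likelihood ratios whose geometric mean is 1 (log value 0).
lrs = [2.0, 0.5, 1.0]
after_speech = conditional_decision(lrs, previous_was_speech=True)
after_noise = conditional_decision(lrs, previous_was_speech=False)
```

The same borderline frame is labelled speech when the previous frame was speech and non-speech otherwise, which shows how the conditional threshold changes the outcome.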

Voice Activity Detection Algorithm base on Radial Basis Function Networks with Dual Threshold (Radial Basis Function Networks를 이용한 이중 임계값 방식의 음성구간 검출기)

Kim Hong Ik;Park Sung Kwon
- The Journal of Korean Institute of Communications and Information Sciences / v.29 no.12C / pp.1660-1668 / 2004
This paper proposes a Voice Activity Detection (VAD) algorithm based on a Radial Basis Function (RBF) network using a dual threshold. The k-means clustering and Least Mean Square (LMS) algorithms are used to update the RBF network to the underlying speech condition. The inputs to the RBF network are three parameters from a Code Excited Linear Prediction (CELP) coder, which works stably under various background noise levels. A dual hangover threshold is applied in the RBF-VAD to reduce errors, because the threshold value involves a trade-off in the VAD decision. The experimental results show that the proposed VAD algorithm achieves better performance than G.729 Annex B at any noise level.

