Search | Korea Science

An Improved VAD Algorithm Employing Speech Enhancement Preprocessing and Threshold Updating (음성 향상 전처리와 문턱값 갱신을 적용한 향상된 음성검출 방법)

이윤창;안상식
- The Journal of Korean Institute of Communications and Information Sciences / v.28 no.11C / pp.1161-1168 / 2003
In this paper, we propose an improved statistical model-based voice activity detection (VAD) algorithm and a threshold update method. We first improve the signal-to-noise ratio (SNR) with a speech enhancement preprocessing algorithm that combines the power subtraction method with a matched filter, then apply the result to the LLR test optimum decision rule to improve performance even in low-SNR conditions. We also propose an adaptive threshold update method, which has not been addressed in previous papers. Extensive computer simulations demonstrate the performance improvement of the proposed VAD algorithm, with the proposed speech enhancement preprocessing and adaptive threshold update, under various background noise environments. Finally, we verify our results by comparison with ITU-T G.729 Annex B.
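The power subtraction step named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the `floor` parameter, and the toy spectra are assumptions made for illustration, and the matched filter and LLR test are not shown.

```python
def power_subtract(noisy_power, noise_power, floor=0.01):
    """Subtract an estimated noise power spectrum from a noisy one, bin by bin.

    `floor` is a hypothetical parameter: each output bin is kept at no less
    than floor * noisy_power so the result never becomes negative.
    """
    if len(noisy_power) != len(noise_power):
        raise ValueError("spectra must have the same number of bins")
    return [max(y - n, floor * y) for y, n in zip(noisy_power, noise_power)]


# Toy per-bin power values (assumed, not taken from the paper).
noisy = [4.0, 1.0, 9.0]
noise = [1.0, 2.0, 1.0]
enhanced = power_subtract(noisy, noise)
print(enhanced)
```

The second bin shows why a floor is commonly used: the noise estimate exceeds the noisy power there, so plain subtraction would give a negative power.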

Enhancement of speech with time-variant and colored noise

Mine, Katsutoshi;Kitazaki, Masato;Wakabayashi, Katsuyoshi;Morimoto, Yuji
- Institute of Control, Robotics and Systems: Conference Proceedings / 1990.10b / pp.1098-1102 / 1990
We consider a method for enhancing speech signals degraded by additive random noise that is time-variant and/or colored. For such noise, it is effective to exploit the characteristics of both speech and noise. The objective of speech enhancement is to improve the overall quality and articulation of speech degraded by time-variant and/or colored random noise. In the proposed method, a distribution model of the speech spectrum is supplied as information to the noise reduction system. The proposed system improves the SNR by about 10 dB when the input SNR is 0 dB.

Method for Spectral Enhancement by Binary Mask for Speech Recognition Enhancement Under Noise Environment (잡음환경에서 음성인식 성능향상을 위한 바이너리 마스크를 이용한 스펙트럼 향상 방법)

Choi, Gab-Keun;Kim, Soon-Hyob
- The Journal of the Acoustical Society of Korea / v.29 no.7 / pp.468-474 / 2010
The major obstacle to the practical use of speech recognition is distortion caused by ambient and channel noise. In general, ambient noise degrades performance and restricts where speech recognition can be used. DSR (Distributed Speech Recognition) based speech recognition has the same problem. Various noise cancelling algorithms have been applied to solve it, but spectral loss and residual noise caused by incorrect noise estimation in low-SNR environments lower the recognition rate. This paper proposes a speech enhancement method that uses MMSE-STSA for noise cancelling and an ideal binary mask to compensate for the damaged spectrum. In experiments in noisy environments (SNR 15 dB to 0 dB), the proposed method showed better spectral results and recognition performance.
https://doi.org/10.7776/ASK.2010.29.7.468

Performance Improvement of Perceptual Filter Using Noise Energy Control (잡음 에너지 제어를 통한 지각 필터 성능 개선)

Seo Joung-Kook;Cha Hyung-Tai
- The Journal of the Acoustical Society of Korea / v.24 no.1 / pp.43-51 / 2005
In this paper, we propose an algorithm that improves the tone quality of a noisy audio signal by enhancing the performance of a perceptual filter through noise energy control. Most algorithms proposed by other researchers apply a filter using the noise energy acquired from a silent range. In this case, the improvement in tone quality decreases if the noise energy changes with the magnitude or environment variation within a signal frame. The proposed method instead provides a means of finding a good noise estimate by controlling the energy of the estimated noise obtained from a silent range. Unlike other methods, it also enhances tone quality in the low frequency band. To show the performance of the proposed algorithm, input signals with signal-to-noise ratios (SNR) of 5 dB, 10 dB, 15 dB, and 20 dB were tested. With the proposed algorithm, we confirmed the enhancement of tone quality in terms of segmental SNR (SSNR), noise-to-mask ratio (NMR), and mean opinion score (MOS) tests.

Speech Enhancement Using Multiple Kalman Filter (다중칼만필터를 이용한 음성향상)

이기용
- Proceedings of the Acoustical Society of Korea Conference / 1998.08a / pp.225-230 / 1998
In this paper, a Kalman filter approach is developed for enhancing speech signals degraded by statistically independent additive nonstationary noise. The autoregressive hidden Markov model is used to model the statistical characteristics of both the clean speech signal and the nonstationary noise process. In this case, the speech enhancement comprises a weighted sum of conditional mean estimators for the composite states of the speech and noise models, where the weights equal the posterior probabilities of the composite states given the noisy speech. The conditional mean estimators use a smoothing approach based on two Kalman filters with Markovian switching coefficients, where one of the filters propagates in the forward-time direction with one frame. The proposed method is tested on noisy speech signals degraded by Gaussian colored noise or nonstationary noise at various input signal-to-noise ratios. An approximate improvement of 4.7-5.2 dB in SNR is achieved at input SNRs of 10 and 15 dB. In a comparison with conventional methods, the proposed method obtains an improvement of about 0.3 dB in SNR.

1 Channel Speech Enhancement using ROEX Auditory Filter (ROEX 청각 필터를 이용한 단일채널 Speech Enhancement)

김학윤
- Proceedings of the Acoustical Society of Korea Conference / 1998.06e / pp.31-34 / 1998
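Many techniques for restoring speech degraded by background noise have long been studied. Among them, spectral subtraction is the representative method for single-channel speech enhancement. However, a major drawback of existing single-channel speech enhancement techniques is that they produce residual noise, known as musical noise, and distort the target signal. Because of this residual noise, most single-channel speech enhancement techniques reported to date have improved SNR but have struggled to improve intelligibility. This study therefore proposes a technique that suppresses the residual musical noise using a ROEX (Rounded Exponential) auditory filter, which faithfully imitates the perceptual process of the human auditory system.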

A User friendly Remote Speech Input Unit in Spontaneous Speech Translation System

Lee, Kwang-Seok;Kim, Heung-Jun;Song, Jin-Kook;Choo, Yeon-Gyu
- Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.05a / pp.784-788 / 2008
In this research, we propose a remote speech input unit, a new user-friendly method of speech input for speech recognition systems. We focus on hands-free operation and microphone independence as the user-friendliness requirements of speech recognition applications. Our module adopts two algorithms: automatic speech detection and speech enhancement based on microphone-array beamforming. In the performance evaluation of speech detection, the within-200 ms accuracy with respect to manually detected positions is about 97 percent in noise environments with an SNR of 25 dB. The microphone array-based speech enhancement using the delay-and-sum beamforming algorithm shows a maximum SNR gain of about 6 dB over a single microphone and an error reduction rate of more than 12% in speech recognition.
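The delay-and-sum beamforming mentioned in this abstract can be sketched in a few lines. This is only an illustration, not the authors' module: it assumes the per-microphone arrival delays are already known and are whole numbers of samples, and the function name and toy signals are hypothetical.

```python
def delay_and_sum(channels, delays):
    """Align each microphone channel by its integer sample delay and average.

    channels: list of equal-length sample lists, one per microphone.
    delays:   per-channel arrival delay in samples (assumed known, non-negative).
    """
    if len(channels) != len(delays):
        raise ValueError("need one delay per channel")
    n = len(channels[0])
    out = []
    for t in range(n):
        total = 0.0
        for samples, d in zip(channels, delays):
            idx = t + d  # advance the later-arriving channel to line it up
            if idx < n:  # samples past the end are treated as zero
                total += samples[idx]
        out.append(total / len(channels))
    return out


# Toy example: the same pulse reaches mic 2 one sample later than mic 1.
mic1 = [0.0, 1.0, 0.0, 0.0]
mic2 = [0.0, 0.0, 1.0, 0.0]
aligned = delay_and_sum([mic1, mic2], delays=[0, 1])
unaligned = delay_and_sum([mic1, mic2], delays=[0, 0])
print(aligned, unaligned)
```

With the correct delays the pulse adds coherently, while without alignment it is smeared across two samples at half amplitude. Uncorrelated noise does not add coherently, which is the source of the SNR gain over a single microphone.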

Statistical Approach of Measurement of Signal to Noise Ratio in According to Change Pulse Sequence on Brain MRI Meningioma and Cyst Images (뇌 수막종 및 낭종에서 자기공명영상 펄스 시퀀스 변화에 따른 신호대잡음비의 통계적 접근)

Lee, Eul-Kyu;Choi, Kwan-Woo;Jeong, Hoi-Woun;Jang, Seo-Goo;Kim, Ki-Won;Son, Soon-Yong;Min, Jung-Whan;Son, Jin-Hyun
- Journal of radiological science and technology / v.39 no.3 / pp.345-352 / 2016
The purpose of this study was to provide a basis for MRI CAD development by measuring the signal-to-noise ratio (SNR) in regions of interest (ROI) across pulse sequences in contrast-enhanced brain magnetic resonance imaging (MRI). We examined contrast-enhanced brain MRI images of 117 patients, acquired from January 2005 to December 2015 at a university-affiliated hospital in Seoul, Korea. Each patient was diagnosed with one of two brain diseases, meningioma or cyst, and the SNR of each patient's brain MRI image was calculated using Image J. Differences in SNR between the two diseases were tested for statistical significance (p < 0.05) with an ANOVA test in SPSS Statistics 21. We analyzed socio-demographic variables, SNR by sequence and disease, the 95% confidence interval of SNR by sequence, and the difference in mean SNR. For meningioma, the quality of the distributions was in the order T1CE, T2 and T1, FLAIR. For cysts, the order was T2 and T1, T1CE and FLAIR. The SNR of brain MRI sequences would be useful for classifying disease. This study will therefore contribute to the evaluation of brain diseases and serve as a foundation for improving the accuracy of CAD development.
https://doi.org/10.17946/JRST.2016.39.3.07

Performance Analysis of a Class of Single Channel Speech Enhancement Algorithms for Automatic Speech Recognition (자동 음성 인식기를 위한 단채널 음질 향상 알고리즘의 성능 분석)

Song, Myung-Suk;Lee, Chang-Heon;Lee, Seok-Pil;Kang, Hong-Goo
- The Journal of the Acoustical Society of Korea / v.29 no.2E / pp.86-99 / 2010
This paper analyzes the performance of various single-channel speech enhancement algorithms when they are applied to automatic speech recognition (ASR) systems as a preprocessor. The functional modules of speech enhancement systems are first divided into four major modules: a gain estimator, a noise power spectrum estimator, an a priori signal-to-noise ratio (SNR) estimator, and a speech absence probability (SAP) estimator. We investigate the relationship between speech recognition accuracy and the role of each module. Simulation results show that the Wiener filter outperforms other gain functions, such as the minimum mean square error short-time spectral amplitude (MMSE-STSA) and minimum mean square error log spectral amplitude (MMSE-LSA) estimators, when a perfect noise estimator is applied. When the performance of the noise estimator degrades, however, MMSE methods that include the decision-directed module for estimating the a priori SNR and the SAP estimation module help to improve the performance of the enhancement algorithm for speech recognition systems.
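Two of the modules named in this abstract, the Wiener gain function and the decision-directed a priori SNR estimate, have widely used textbook forms, sketched below for a single frequency bin. This is a sketch under assumptions, not the paper's code: the function names, the smoothing value `alpha`, and the toy numbers are chosen for illustration.

```python
def wiener_gain(xi):
    """Wiener gain for one frequency bin, given the a priori SNR xi (linear scale)."""
    return xi / (1.0 + xi)


def decision_directed_xi(prev_clean_power, noisy_power, noise_power, alpha=0.98):
    """Decision-directed a priori SNR estimate for one bin.

    Blends the previous frame's estimated clean-speech power (relative to
    the noise power) with the current a posteriori SNR minus one, clipped
    at zero. `alpha` is the smoothing weight (assumed value).
    """
    gamma = noisy_power / noise_power  # a posteriori SNR
    return (alpha * prev_clean_power / noise_power
            + (1.0 - alpha) * max(gamma - 1.0, 0.0))


# Toy values (assumed): alpha = 0.5 keeps the arithmetic easy to follow.
xi = decision_directed_xi(prev_clean_power=2.0, noisy_power=3.0,
                          noise_power=1.0, alpha=0.5)
gain = wiener_gain(xi)
print(xi, gain)
```

The enhanced spectral amplitude for the bin would then be the noisy amplitude multiplied by `gain`. A larger `alpha` smooths the estimate more heavily across frames.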

Signal Enhancement of a Variable Rate Vocoder with a Hybrid domain SNR Estimator

Park, Hyung Woo
- KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.2 / pp.962-977 / 2019
The human voice is a convenient means of transferring information between people, between people and machines, and between machines. With the development of information and communication technology, voice can be transmitted farther than before. To communicate, the voice is converted to another form, transmitted, and then converted back to sound. In this process, a vocoder performs the conversion and reconversion of voice and sound. The CELP (Code-Excited Linear Prediction) type vocoder, one of the voice codecs, has been adopted as a standard codec because it provides high-quality sound even at a relatively low transmission rate. EVRC (Enhanced Variable Rate CODEC) and QCELP (Qualcomm Code-Excited Linear Prediction), both variable bit rate vocoders, are used in mobile phones in the 3G environment. In real-time vocoder implementations, reduced sound quality is a typical problem. To improve sound quality, it is important to know the size and shape of the noise. Existing sound quality improvement methods either detect and use voice activity or apply statistical methods to a large amount of data. However, they have the disadvantage that noise cannot be detected when the signal is continuous or when the noise changes greatly. This paper focuses on finding a better way to limit the loss of sound quality in low bit rate transmission environments. Based on simulation results, this study proposes a preprocessor that estimates the SNR (Signal to Noise Ratio) using the spectral SNR estimation method. The SNR estimation method adopts IMBE (Improved Multi-Band Excitation) instead of using the SNR of the continuous speech signal. Finally, this application improves the quality of the vocoder by enhancing sound quality adaptively.
https://doi.org/10.3837/tiis.2019.02.026
