Search | Korea Science

A study on combination of loss functions for effective mask-based speech enhancement in noisy environments (잡음 환경에 효과적인 마스크 기반 음성 향상을 위한 손실함수 조합에 관한 연구)

Jung, Jaehee;Kim, Wooil
- The Journal of the Acoustical Society of Korea / v.40 no.3 / pp.234-240 / 2021
This paper improves mask-based speech enhancement for effective speech recognition in noisy environments. In mask-based speech enhancement, the enhanced spectrum is obtained by multiplying the noisy speech spectrum by a mask. The VoiceFilter (VF) model is used for mask estimation, and the Spectrogram Inpainting (SI) technique is used to remove residual noise from the enhanced spectrum. We propose a combined loss to further improve speech enhancement: to remove the residual noise in the speech effectively, the positive part of the triplet loss is used together with the component loss. For the experiments, the TIMIT database is reconstructed using NOISEX92 noise and background music samples under various Signal to Noise Ratio (SNR) conditions. Source to Distortion Ratio (SDR), Perceptual Evaluation of Speech Quality (PESQ), and Short-Time Objective Intelligibility (STOI) are used as performance metrics. When the VF model was trained with the mean squared error and the SI model was trained with the combined loss, SDR, PESQ, and STOI improved by 0.5, 0.06, and 0.002, respectively, compared to the system trained only with the mean squared error.
https://doi.org/10.7776/ASK.2021.40.3.234
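
The masking step described in the abstract above (enhanced spectrum = noisy spectrum multiplied by a mask) and the mean squared error baseline can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `apply_mask` and `mse` and the toy spectrogram values are assumptions, and the combined loss itself is not reproduced because the abstract does not give its exact form.

```python
def apply_mask(noisy_mag, mask):
    """Multiply each time-frequency bin of the noisy magnitude spectrum by its mask value."""
    return [[y * m for y, m in zip(y_row, m_row)]
            for y_row, m_row in zip(noisy_mag, mask)]


def mse(estimate, target):
    """Mean squared error over all time-frequency bins."""
    diffs = [(e - t) ** 2
             for e_row, t_row in zip(estimate, target)
             for e, t in zip(e_row, t_row)]
    return sum(diffs) / len(diffs)


# Toy 2x2 "spectrograms" (rows = frames, columns = frequency bins); values are made up.
noisy = [[1.0, 2.0], [4.0, 0.5]]
mask = [[0.5, 1.0], [0.25, 0.0]]
clean = [[0.5, 2.0], [1.0, 0.0]]

enhanced = apply_mask(noisy, mask)
loss = mse(enhanced, clean)
```

In this toy case the mask is ideal, so `enhanced` equals `clean` and the loss is zero; a trained model would only approximate such a mask.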

Microphone Array Processing in the Wavelet Domain for Speech Enhancement (마이크로폰 배열을 이용한 웨이브렛 도메인에서의 음성신호 개선)

장병욱;권홍석;김시호;배건성
- Proceedings of the IEEK Conference / 2001.09a / pp.513-516 / 2001
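Among speech enhancement techniques that use a microphone array, a method has been proposed [1] that uses a logarithmically spaced linear microphone array and performs Wiener-filter-based postfiltering in the wavelet domain, in order to account simultaneously for the high correlation in the low-frequency region and the spatial aliasing in the high-frequency region. This paper analyzes the problems of that method and presents solutions. In simulations with the proposed algorithm, when the SNR of the speech signal at the microphones was 0 dB and 10 dB, performance improved by about 1.7 dB and 2.5 dB, respectively, over the existing algorithm, and listening tests also confirmed an improvement in sound quality.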

Spectral subtraction based on speech state and masking effect

김우일;강선미;고한석
- Proceedings of the IEEK Conference / 1998.06a / pp.599-602 / 1998
In this paper, a speech enhancement method based on phonemic properties and the masking effect is proposed. It is a modified form of spectral subtraction in which a spectral sharpening process is applied in the unvoiced state, taking the phonemic properties into account. The masking threshold is used to remove the residual noise. In terms of SNR, the proposed spectral subtraction performs similarly to the classical spectral subtraction method. With the proposed scheme, however, the unvoiced sound region shows relatively less signal distortion in the enhanced speech.
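
For reference, the classical spectral subtraction that the abstract above uses as its baseline can be sketched as below. This is a generic textbook form, not the modified method proposed in the paper; the function name `spectral_subtract`, the over-subtraction factor `alpha`, and the spectral floor `floor` are illustrative assumptions.

```python
def spectral_subtract(noisy_power, noise_power, alpha=1.0, floor=0.01):
    """Classical power spectral subtraction for one frame.

    alpha: over-subtraction factor applied to the noise estimate.
    floor: fraction of the noisy power kept as a lower bound, so that no bin
           goes negative after subtraction.
    """
    enhanced = []
    for y, n in zip(noisy_power, noise_power):
        subtracted = y - alpha * n
        enhanced.append(max(subtracted, floor * y))
    return enhanced


# One frame with three frequency bins (made-up power values).
frame = [4.0, 1.0, 0.5]
noise = [1.0, 1.0, 1.0]
enhanced = spectral_subtract(frame, noise)
```

Bins where the noise estimate exceeds the observed power fall back to the floor; such isolated leftovers are one source of the residual noise that methods like the one above try to suppress.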

Binary ASK way for 1Giga bit MODEM (1Giga bit MODEM을 위한 Binary ASK방식)

;;;Sosuke Onodera;Yoichi Sato
- Proceedings of the IEEK Conference / 2003.07a / pp.194-197 / 2003
We propose a Binary ASK system for a 1 Gbit modem. The Binary ASK system has a high-speed shutter transmitter and a receiver with no IF stage that relies only on symbol synchronization. The advantage of the proposed system is that the circuitry is very simple, with no IF processing. The disadvantages are that its line spectrum causes unusual interference to other channels, and that extension to a 4-level system is impossible because of the large SNR degradation.
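
Binary ASK in its simplest form is on-off keying: the carrier is sent for a 1 bit and suppressed for a 0 bit. The sketch below illustrates that idea with per-symbol energy detection, assuming symbol timing is already synchronized. It is a generic illustration, not the modem design from the paper; the function names and the parameters `samples_per_symbol` and `cycles_per_symbol` are hypothetical.

```python
import math


def ask_modulate(bits, samples_per_symbol=8, cycles_per_symbol=2):
    """On-off keying: transmit the carrier for a 1 bit and nothing for a 0 bit."""
    signal = []
    for bit in bits:
        for k in range(samples_per_symbol):
            phase = 2 * math.pi * cycles_per_symbol * k / samples_per_symbol
            signal.append(bit * math.cos(phase))
    return signal


def ask_demodulate(signal, samples_per_symbol=8, threshold=None):
    """Energy detection per symbol, assuming symbol boundaries are known."""
    energies = [sum(x * x for x in signal[i:i + samples_per_symbol])
                for i in range(0, len(signal), samples_per_symbol)]
    if threshold is None:
        # Assumed decision rule: half of the largest observed symbol energy.
        threshold = max(energies) / 2
    return [1 if e > threshold else 0 for e in energies]


bits = [1, 0, 1, 1, 0]
recovered = ask_demodulate(ask_modulate(bits))
```

Because the decision needs only the symbol energy, no intermediate-frequency stage appears in this sketch, which is consistent with the simplicity the abstract claims.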

Speech Enhancement Using Level Adapted Wavelet Packet with Adaptive Noise Estimation

Chang, Sung-Wook;Kwon, Young-Hun;Jung, Sung-Il;Yang, Sung-Il;Lee, Kun-Sang
- The Journal of the Acoustical Society of Korea / v.22 no.2E / pp.87-92 / 2003
In this paper, a new speech enhancement method using a level adapted wavelet packet is presented. First, we propose a level adapted wavelet packet to alleviate a drawback of the conventional node adapted one in noisy environments. Next, we suggest an adaptive noise estimation method at each node of the level adapted wavelet packet tree. Then, for more accurate subtraction of the noise component, we propose a new method for estimating the spectral subtraction weight. Finally, we present a modified spectral subtraction method. The proposed method is evaluated under various noise conditions: speech babble noise, F-16 cockpit noise, factory noise, pink noise, and Volvo car interior noise. For objective evaluation, an SNR test was performed. A spectrogram test and a very simple listening test, as a subjective evaluation, were also performed.
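
Several of the abstracts in this listing report results as SNR. One common way to compute an output SNR in dB from a clean reference and a processed signal is sketched below; the function name `snr_db` and the sample values are assumptions, and individual papers may use segmental or otherwise modified variants.

```python
import math


def snr_db(clean, processed):
    """SNR in dB, treating the difference from the clean reference as noise.

    Raises ZeroDivisionError if the processed signal equals the reference exactly.
    """
    signal_energy = sum(s * s for s in clean)
    noise_energy = sum((s - p) ** 2 for s, p in zip(clean, processed))
    return 10 * math.log10(signal_energy / noise_energy)


clean = [1.0, -1.0, 1.0, -1.0]
processed = [1.1, -0.9, 0.9, -1.1]  # each sample is off by 0.1
value = snr_db(clean, processed)    # energy ratio of about 100, so about 20 dB
```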

Denoising of Speech Signal Using Wavelet Transform (웨이브렛 변환을 이용한 음성신호의 잡음제거)

한미경;배건성
- The Journal of the Acoustical Society of Korea / v.19 no.5 / pp.27-34 / 2000
This paper deals with speech enhancement methods using the wavelet transform. A cycle-spinning scheme and the undecimated wavelet transform are used for denoising speech signals, and their results are compared with those of the conventional wavelet transform. We apply a soft-thresholding technique to remove additive background noise from noisy speech. The symlets 8-tap wavelet and the pyramid algorithm are used for the wavelet transform. Performance is assessed using average SNR, cepstral distance, and an informal subjective listening test. Experimental results demonstrate that both the cycle-spinning denoising (CSD) method and the undecimated wavelet denoising (UWD) method outperform the conventional wavelet denoising (CWD) method in the objective performance measures as well as in the subjective listening test. The two methods also produce fewer of the "clicks" that usually appear in the neighborhood of signal discontinuities.
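
The soft-thresholding rule mentioned in the abstract above shrinks every wavelet coefficient toward zero by a threshold `t` and zeroes anything smaller than `t`. A minimal sketch follows; the function name `soft_threshold` and the sample coefficients are assumptions, and the wavelet transform itself (symlets, cycle spinning, undecimated transform) is not shown.

```python
def soft_threshold(coeffs, t):
    """Soft thresholding: sign(c) * max(|c| - t, 0) for each coefficient c."""
    out = []
    for c in coeffs:
        shrunk = abs(c) - t
        if shrunk <= 0:
            out.append(0.0)          # small coefficients are treated as noise
        elif c > 0:
            out.append(shrunk)       # positive coefficient, reduced by t
        else:
            out.append(-shrunk)      # negative coefficient, reduced by t
    return out


coeffs = [3.0, -0.5, -2.0, 0.2]
denoised = soft_threshold(coeffs, 1.0)
```

After thresholding, the denoised signal is obtained by applying the inverse wavelet transform to the shrunken coefficients.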

Audio Enhancement Algorithm Using Adaptive Perceptual Filter (적응 지각 필터를 이용한 오디오 음질 개선 알고리즘)

엄혜영;한헌수;홍민철;차형태
- The Journal of the Acoustical Society of Korea / v.22 no.8 / pp.687-693 / 2003
In this paper, a new adaptive audio signal enhancement algorithm is proposed. To remove broadband noise from a noisy signal, a filter is designed and applied adaptively to the noisy audio signal. The noisy signal is first transformed to the frequency domain and divided into Bark bands to calculate the excitation energy. A filter that eliminates the noise is then calculated from the excitation energy and the noise energy obtained from a silent region. The filter is adjusted adaptively and applied repeatedly until the threshold point is met. The algorithm also works well when the noise energy changes suddenly. SNR and NMR comparisons and a MOS test are performed to show the effectiveness of the proposed algorithm.
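
Dividing a spectrum into the Bark domain, as the abstract above describes, requires a mapping from frequency in Hz to the Bark scale. The sketch below uses one widely cited approximation (Zwicker and Terhardt) and sums power per integer Bark band. It is an assumption that the paper uses this particular mapping, and the spreading needed to turn band energy into excitation energy is omitted; the function names are illustrative.

```python
import math


def hz_to_bark(freq_hz):
    """Approximate Bark value for a frequency in Hz (Zwicker and Terhardt form)."""
    return (13.0 * math.atan(0.00076 * freq_hz)
            + 3.5 * math.atan((freq_hz / 7500.0) ** 2))


def band_energies(freqs_hz, power):
    """Sum the power of all bins that fall into the same integer Bark band."""
    bands = {}
    for f, p in zip(freqs_hz, power):
        band = int(hz_to_bark(f))
        bands[band] = bands.get(band, 0.0) + p
    return bands


# Three made-up spectral bins: two low-frequency bins share Bark band 0.
energies = band_energies([50.0, 80.0, 1000.0], [1.0, 2.0, 4.0])
```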

Speech Parameter Estimation and Enhancement Using the EM Algorithm (EM 알고리즘을 이용한 음성 파라미터 추정 및 향상)

Lee, Ki-Yong;Kang, Young-Tae;Lee, Byung-Gook
- The Journal of the Acoustical Society of Korea / v.13 no.2E / pp.68-75 / 1994
In many applications of signal processing, we have to deal with densities that are highly non-Gaussian, or that have a Gaussian shape in the middle but deviate strongly in the tails. To counter these deviations, we consider a finite mixture distribution for the speech excitation. We use the EM algorithm to estimate the speech parameters and to enhance the speech. Robust Kalman filtering is used in the enhancement process, and a detection/estimation technique is used for parameter estimation. Experimental results show that the proposed algorithm performs better under adverse input SNR conditions.

Efficient Noise Estimation for Speech Enhancement in Wavelet Packet Transform

Jung, Sung-Il;Yang, Sung-Il
- The Journal of the Acoustical Society of Korea / v.25 no.4E / pp.154-158 / 2006
In this paper, we suggest a noise estimation method for speech enhancement in nonstationary noisy environments. The proposed method consists of two main processes. First, to reduce the influence of varying signals, a best-fitting regression line is used, obtained by applying the least squares method to the coefficient magnitudes in a node of a uniform wavelet packet transform. Next, to update the noise estimate efficiently, a differential forgetting factor and a correlation coefficient per subband are used, where the subbands serve to apply a weight according to the change in the signals. In particular, this method can update the noise estimate using only the noise estimated at the previous frame, without relying on statistical information from long past frames or on explicit nonspeech frames identified by a voice activity detector. In objective assessments, the proposed method performed better than the compared methods (minima controlled recursive averaging and weighted average). Furthermore, the method gave reliable results even at low SNR.
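
The idea of updating a noise estimate from the previous frame's estimate alone can be illustrated with plain recursive averaging. The sketch below uses a single fixed forgetting factor, which is a simplification: the paper's differential forgetting factor and per-subband correlation coefficient are not specified in the abstract and are not reproduced here. The function name `update_noise` and the sample values are assumptions.

```python
def update_noise(prev_noise, frame_mag, forgetting=0.9):
    """Recursive noise update: new = forgetting * previous + (1 - forgetting) * current.

    A forgetting factor near 1 makes the estimate change slowly; a smaller value
    tracks nonstationary noise faster but lets more speech leak into the estimate.
    """
    return [forgetting * n + (1.0 - forgetting) * y
            for n, y in zip(prev_noise, frame_mag)]


noise = [1.0, 2.0]                 # estimate from the previous frame, per subband
frames = [[1.0, 4.0], [1.0, 4.0]]  # made-up magnitudes of two new frames
for frame in frames:
    noise = update_noise(noise, frame, forgetting=0.5)
```

Only the previous estimate is stored between frames, so no history buffer or voice activity decision is needed in this simplified form.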

Implementation of Chip and Algorithm of a Speech Enhancement for an Automatic Speech Recognition Applied to Telematics Device (텔레메틱스 단말용 음성 인식을 위한 음성향상 알고리듬 및 칩 구현)

Kim, Hyoung-Gook
- The Journal of The Korea Institute of Intelligent Transport Systems / v.7 no.5 / pp.90-96 / 2008
This paper presents a single-chip acoustic speech enhancement algorithm for telematics devices. The algorithm consists of two stages: noise reduction and echo cancellation. An adaptive filter based on cross-spectral estimation is used to cancel the echo. The external background noise is eliminated, and the clean speech is estimated, using MMSE log-spectral magnitude estimation. To suit use in consumer electronics, we also design a low-cost, high-speed, and flexible hardware architecture. The performance of the proposed speech enhancement algorithm was measured by both the signal-to-noise ratio (SNR) and the recognition accuracy of an automatic speech recognition (ASR) system, and it yielded better results than the conventional methods.
