Search | Korea Science

Lee, Jun-Yong;Kim, Hyoung-Gook
- The Journal of the Acoustical Society of Korea
- /
- v.34 no.3
- /
- pp.227-233
- /
- 2015
In this paper, we propose music and voice separation using kernel sptectrogram models backfitting based on log-spectral amplitude estimator. The existing method separates sources based on the estimate of a desired objects by training MSE (Mean Square Error) designed Winer filter. We introduce rather clear music and voice signals with application of log-spectral amplitude estimator, instead of adaptation of MSE which has been treated as an existing method. Experimental results reveal that the proposed method shows higher performance than the existing methods.
https://doi.org/10.7776/ASK.2015.34.3.227 인용 PDF KSCI

Song, Eunwoo;Kang, Hong-Goo
- Proceedings of the Korean Society of Broadcast Engineers Conference
- /
- 2011.07a
- /
- pp.122-125
- /
- 2011
In real environment, background noise exists everywhere and degrades the performance of system. To reduce this distortion, a speech enhancement algorithm can be very useful and variety methods have been proposed. In this paper, we propose a postfilter to improve the performance of optimally modified log-spectral amplitude (OM-LSA) estimator. Proposed algorithm uses the formant postfilter to minimize perceptual distortion caused by background noise. We adjust an emphasizing parameter which is varied by spectral flatness and first reflection coefficient. The performance of the proposed algorithm is evaluated by measuring the log-spectral distance (LSD) and the perceptual evaluation of speech quality (PESQ) score. The test results show the improvement of proposed algorithm compared to conventional OM-LSA.
PDF

Shin, Minkyu;Ko, Hanseok
- The Journal of the Acoustical Society of Korea
- /
- v.33 no.1
- /
- pp.54-59
- /
- 2014
Identification of RTF (Relative Transfer Function) between sensors is essential to multichannel speech enhancement system. In this paper, we present an approach for estimating the relative transfer function of speech signal. This method adapts a CASA (Computational Auditory Scene Analysis) technique to the conventional OM-LSA (Optimally-Modified Log-Spectral Amplitude) based approach. Evaluation of the proposed approach is performed under simulated stationary and nonstationary WGN (White Gaussian Noise). Experimental results confirm advantages of the proposed approach.
https://doi.org/10.7776/ASK.2014.33.1.054 인용 PDF KSCI

Song, Myung-Suk;Lee, Chang-Heon;Lee, Seok-Pil;Kang, Hong-Goo
- The Journal of the Acoustical Society of Korea
- /
- v.29 no.2E
- /
- pp.86-99
- /
- 2010
This paper analyzes the performance of various single channel speech enhancement algorithms when they are applied to automatic speech recognition (ASR) systems as a preprocessor. The functional modules of speech enhancement systems are first divided into four major modules such as a gain estimator, a noise power spectrum estimator, a priori signal to noise ratio (SNR) estimator, and a speech absence probability (SAP) estimator. We investigate the relationship between speech recognition accuracy and the roles of each module. Simulation results show that the Wiener filter outperforms other gain functions such as minimum mean square error-short time spectral amplitude (MMSE-STSA) and minimum mean square error-log spectral amplitude (MMSE-LSA) estimators when a perfect noise estimator is applied. When the performance of the noise estimator degrades, however, MMSE methods including the decision directed module to estimate a priori SNR and the SAP estimation module helps to improve the performance of the enhancement algorithm for speech recognition systems.
PDF KSCI