Search | Korea Science

Jung, Seungmo;Kim, Moo Young
- Journal of the Institute of Electronics and Information Engineers / v.51 no.9 / pp.165-170 / 2014
Speech enhancement techniques that remove surrounding noise are important as a preprocessor for speech recognition. Among the various speech enhancement techniques, Codebook-based Speech Enhancement (CBSE) operates efficiently in non-stationary noise environments. However, CBSE can estimate inaccurate gains when a mismatch occurs between the noisy input signal and the trained speech/noise codevectors. In this paper, a Normalized Weighting Factor (NWF) is calculated by a long-term noise estimation algorithm based on the Signal-to-Noise Ratio and is used to compensate the conventional inaccurate gains. The proposed CBSE shows better performance than conventional CBSE.
https://doi.org/10.5573/ieie.2014.51.9.165
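To make the gain-estimation step in the abstract above concrete, here is a minimal sketch of fitting a speech gain and a noise gain for one codevector pair. It is an illustration under stated assumptions, not the paper's method: it models the noisy power spectrum as `g_s * speech_shape + g_n * noise_shape` and uses a plain non-negative least-squares fit in place of whatever distortion measure the authors use. The function name `fit_gains` and its arguments are hypothetical, and the NWF compensation itself is not reproduced because the abstract does not specify it.

```python
def fit_gains(noisy, speech_shape, noise_shape):
    """Fit non-negative gains (g_s, g_n) so that
    g_s * speech_shape + g_n * noise_shape approximates noisy.

    All three arguments are equal-length lists of power values.
    Least squares is an assumption made for this sketch.
    """
    a = sum(s * s for s in speech_shape)
    b = sum(s * n for s, n in zip(speech_shape, noise_shape))
    c = sum(n * n for n in noise_shape)
    d = sum(s * y for s, y in zip(speech_shape, noisy))
    e = sum(n * y for n, y in zip(noise_shape, noisy))

    def error(gs, gn):
        return sum((y - gs * s - gn * n) ** 2
                   for y, s, n in zip(noisy, speech_shape, noise_shape))

    # Boundary candidates: both gains zero, or only one gain active.
    candidates = [(0.0, 0.0)]
    if a > 0:
        candidates.append((max(d / a, 0.0), 0.0))
    if c > 0:
        candidates.append((0.0, max(e / c, 0.0)))

    # Interior candidate: solve the 2x2 normal equations when well conditioned.
    det = a * c - b * b
    if det > 1e-12:
        gs = (d * c - b * e) / det
        gn = (a * e - b * d) / det
        if gs >= 0 and gn >= 0:
            candidates.append((gs, gn))

    return min(candidates, key=lambda g: error(*g))
```

When the trained shapes match the input, the fit recovers the true gains; when they do not, the fitted gains absorb the mismatch, which is the kind of inaccuracy the abstract describes.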

Jung, Seungmo;Kim, Moo Young
- Journal of the Institute of Electronics and Information Engineers / v.52 no.7 / pp.119-124 / 2015
Speech enhancement is required as a preprocessor for noise-robust speech recognition systems. Codebook-based Speech Enhancement (CBSE) is highly robust in non-stationary noise environments compared with conventional noise estimation algorithms. However, because CBSE depends on trained codebook information, its performance degrades severely for codevector combinations that have low correlation with the input signal. To overcome this problem, only the reliable codevector combinations are selected, which excludes the combinations that have low correlation with the input signal. The proposed method improves on conventional CBSE in terms of Log-Spectral Distortion (LSD) and Perceptual Evaluation of Speech Quality (PESQ).
https://doi.org/10.5573/ieie.2015.52.7.119
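The LSD metric named in the abstract above can be sketched as follows. This uses a common definition (root-mean-square difference of the log power spectra in dB per frame, averaged over frames); the paper may use a variant, and the function name, the input layout, and the `floor` parameter are assumptions made for this example.

```python
import math

def log_spectral_distortion(ref_frames, est_frames, floor=1e-12):
    """Mean log-spectral distortion in dB between two sequences of power spectra.

    Each argument is a list of frames; each frame is a list of non-negative
    power values on the same frequency grid. `floor` avoids log10(0).
    """
    if len(ref_frames) != len(est_frames):
        raise ValueError("frame counts differ")
    per_frame = []
    for ref, est in zip(ref_frames, est_frames):
        if len(ref) != len(est):
            raise ValueError("bin counts differ")
        squared_diffs = [
            (10.0 * math.log10(max(r, floor))
             - 10.0 * math.log10(max(e, floor))) ** 2
            for r, e in zip(ref, est)
        ]
        # RMS over frequency bins for this frame.
        per_frame.append(math.sqrt(sum(squared_diffs) / len(squared_diffs)))
    # Average over frames.
    return sum(per_frame) / len(per_frame)
```

A lower value means the enhanced spectrum is closer to the clean reference; identical spectra give 0 dB.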

Lee, Myeong-Seok;Noh, Myung-Hoon;Park, Sung-Joo;Lee, Seok-Pil;Kim, Moo-Young
- The Journal of the Acoustical Society of Korea / v.29 no.3 / pp.200-208 / 2010
In this work, the minimum statistics (MS) algorithm is combined with codebook-driven short-term predictor parameter estimation (CDSTP) to design a speech enhancement algorithm that is robust against various background noise environments. The MS algorithm works well for stationary noise but less well for non-stationary noise. The CDSTP works efficiently for non-stationary noise, but not for noise that was not considered in the training stage. Thus, we propose to combine CDSTP and MS. Compared with using MS or CDSTP alone, the proposed method produces a better perceptual evaluation of speech quality (PESQ) score, and works especially well for background noise that mixes stationary and non-stationary components.
https://doi.org/10.7776/ASK.2010.29.3.200
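The core idea behind minimum statistics, as referenced in the abstract above, is to track the minimum of a smoothed power estimate over a sliding window, on the premise that speech is not present in every frame. The sketch below is a heavily simplified version for a single frequency bin: it uses a fixed smoothing constant and omits the bias compensation and adaptive smoothing found in full MS algorithms. The function name and the `alpha` and `window` values are assumptions chosen for illustration, and the combination with CDSTP is not shown.

```python
from collections import deque

def track_noise_minimum(powers, alpha=0.85, window=8):
    """Simplified minimum-statistics noise tracker for one frequency bin.

    powers: per-frame noisy power values for the bin.
    alpha:  recursive smoothing constant (assumed value).
    window: number of past smoothed frames searched for the minimum.
    Returns one noise-power estimate per frame.
    """
    smoothed = None
    recent = deque(maxlen=window)  # the oldest value drops out automatically
    estimates = []
    for p in powers:
        if smoothed is None:
            smoothed = p
        else:
            smoothed = alpha * smoothed + (1.0 - alpha) * p
        recent.append(smoothed)
        estimates.append(min(recent))
    return estimates
```

Because the estimate only rises once an entire window has passed at the higher level, a short burst leaves it unchanged. The same property makes the tracker slow to follow noise that changes quickly, which is consistent with the limitation for non-stationary noise noted in the abstract.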

Koo, Bon-Kang;Park, Hee-Wan;Ju, Yeon-Jae;Kang, Sang-Won
- The Journal of the Acoustical Society of Korea / v.30 no.4 / pp.190-196 / 2011
Bandwidth extension is a technique to improve speech quality and intelligibility by extending 300-3400 Hz narrowband speech to 50-7000 Hz wideband speech. This paper designs an artificial bandwidth extension (ABE) module embedded in the AMR (adaptive multi-rate) decoder, reducing the LPC/LSP analysis and the algorithm delay of the ABE module. We also introduce a fast search codebook mapping method for ABE, and design a low-power BWE technique based on the AMR decoder. Compared to the traditional DTE (decode then extend) method, the proposed ABE method reduces the computational complexity by 28 % and the algorithm delay by 20 msec. We also introduce a weighted classified codebook mapping method for constructing the spectral envelope of the wideband speech signal.
https://doi.org/10.7776/ASK.2011.30.4.190
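Codebook mapping, as mentioned in the abstract above, can be illustrated in its most basic form: find the narrowband codevector nearest to the input features and return the wideband envelope stored at the same index. This sketch uses an exhaustive search with squared Euclidean distance; it does not reproduce the paper's fast search or its weighted classified variant. The function name and the paired-codebook layout are assumptions made for this example.

```python
def map_envelope(nb_features, nb_codebook, wb_codebook):
    """Map narrowband features to a wideband envelope through paired codebooks.

    nb_codebook[i] and wb_codebook[i] are assumed to have been trained
    jointly, so the same index describes the same speech sound.
    Returns (index, wideband_entry).
    """
    if len(nb_codebook) != len(wb_codebook):
        raise ValueError("codebooks must have the same number of entries")

    def squared_distance(codevector):
        return sum((a - b) ** 2 for a, b in zip(nb_features, codevector))

    # Exhaustive nearest-neighbour search over the narrowband codebook.
    best_index = min(range(len(nb_codebook)),
                     key=lambda i: squared_distance(nb_codebook[i]))
    return best_index, wb_codebook[best_index]
```

The cost of this search grows linearly with codebook size, which is the kind of cost a fast search method aims to reduce.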

Choi, Yoonsang;Li, Yaxing;Kang, Sangwon
- The Journal of the Acoustical Society of Korea / v.36 no.1 / pp.70-77 / 2017
Bandwidth extension is a technique to improve speech quality, intelligibility and naturalness by extending 300 ~ 3,400 Hz narrowband speech to 50 ~ 7,000 Hz wideband speech. In this paper, an Artificial Bandwidth Extension (ABE) module embedded in the Opus audio decoder is designed; it uses narrowband speech information to reduce the computational complexity of LPC (Linear Prediction Coding) and LSF (Line Spectral Frequencies) analysis and the algorithm delay of the ABE module. We propose a spectral envelope extension method using a DBN (Deep Belief Network), a deep learning technique, and the proposed scheme produces a better extended spectrum than the traditional codebook mapping method.
https://doi.org/10.7776/ASK.2017.36.1.070