A Selection Method of Reliable Codevectors using Noise Estimation Algorithm

  • Jung, Seungmo (Department of Information and Communication Engineering, Sejong University)
  • Kim, Moo Young (Department of Information and Communication Engineering, Sejong University)
  • Received: 2015.03.03
  • Reviewed: 2015.06.15
  • Published: 2015.07.25

Abstract

Speech enhancement is required as a preprocessor for noise-robust speech recognition. Codebook-based Speech Enhancement (CBSE) is more robust in nonstationary background-noise environments than conventional noise estimation algorithms. However, because CBSE depends on the trained codebook information, its performance degrades sharply when it uses codevector combinations that have low correlation with the input signal. To overcome this problem, the proposed method removes such low-correlation combinations while combining the trained speech and noise codevectors, so that only reliable combinations are used. The proposed method improves on conventional CBSE in terms of Log-Spectral Distortion (LSD) and Perceptual Evaluation of Speech Quality (PESQ).
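The idea of discarding poorly matching codevector combinations can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not specify the selection criterion, so the sketch assumes that each speech/noise pair is modeled as the sum of two power spectra (gain terms omitted) and that pairs are ranked by LSD against the noisy input spectrum. The function names, the `keep` parameter, and the toy codebooks are all hypothetical.

```python
import math


def log_spectral_distortion(p_ref, p_est, eps=1e-12):
    """Root-mean-square difference, in dB, between two power spectra."""
    if len(p_ref) != len(p_est):
        raise ValueError("spectra must have the same number of bins")
    squared = [
        (10.0 * math.log10(max(a, eps)) - 10.0 * math.log10(max(b, eps))) ** 2
        for a, b in zip(p_ref, p_est)
    ]
    return math.sqrt(sum(squared) / len(squared))


def select_reliable_pairs(noisy, speech_codebook, noise_codebook, keep):
    """Return the `keep` (speech_idx, noise_idx) pairs closest to `noisy`.

    Assumption for illustration: the noisy spectrum is modeled as the
    plain sum of one speech and one noise codevector (no gain terms).
    """
    scored = []
    for i, speech in enumerate(speech_codebook):
        for j, noise in enumerate(noise_codebook):
            model = [s + w for s, w in zip(speech, noise)]
            scored.append((log_spectral_distortion(noisy, model), i, j))
    scored.sort()  # smallest distortion first
    return [(i, j) for _, i, j in scored[:keep]]


# Toy power spectra with three frequency bins (hypothetical values).
speech_codebook = [[4.0, 1.0, 0.5], [0.5, 1.0, 4.0]]
noise_codebook = [[1.0, 1.0, 1.0], [0.1, 0.1, 3.0]]
noisy = [5.0, 2.0, 1.5]  # equals speech_codebook[0] + noise_codebook[0]

pairs = select_reliable_pairs(noisy, speech_codebook, noise_codebook, keep=2)
print(pairs)
```

In this toy setup the pairs built from the second speech codevector are dropped, so a later estimation step would only consider combinations that resemble the input. A real system would also need the gain estimation and the noise estimation step named in the title, neither of which is shown here.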
