Speech Enhancement in Noisy Speech Using Neural Network

신경회로망을 사용한 잡음이 중첩된 음성 강조

  • Choi, Jae-Seung (최재승) (Department of Information and Communication Engineering, Osaka City University, Japan)
  • Published: 2005.09.25

Abstract

Speech recognition in noisy environments requires a system that reduces the noise and enhances the speech. Imitating the human auditory system, which has an excellent spectral analysis mechanism, is an effective approach to speech enhancement. Accordingly, this paper proposes a method that adaptively applies the auditory mechanism known as lateral inhibition. The method first estimates the noise intensity with a neural network, then, for each input frame, adjusts both the lateral inhibition coefficients and the amplitude-component adjustment coefficient according to the estimated noise intensity. Evaluation with a spectral distortion measure confirms that the proposed method is effective for speech degraded not only by white noise but also by colored noise and automobile road noise.
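
The per-frame processing described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the kernel shape, the coefficient values, or the neural network, so the kernel width, the linear mapping in `coefficients_for_noise`, and the `noise_level` input (a stand-in for the neural network's noise-intensity estimate, scaled to [0, 1]) are all assumptions chosen for illustration.

```python
import numpy as np

def lateral_inhibition(spectrum, inhibit, width=2):
    """Apply a center-surround kernel along the frequency axis.

    `inhibit` and `width` are hypothetical parameters, not values from the paper.
    """
    # Kernel: +1 at the center, -inhibit/2 at offsets of +/- `width` bins.
    kernel = np.zeros(2 * width + 1)
    kernel[width] = 1.0
    kernel[0] = kernel[-1] = -inhibit / 2.0
    # Pad with edge values so the output has the same length as the input.
    padded = np.pad(spectrum, width, mode="edge")
    out = np.convolve(padded, kernel, mode="valid")
    # Half-wave rectify, since a negative amplitude has no meaning.
    return np.maximum(out, 0.0)

def coefficients_for_noise(noise_level):
    """Map a noise estimate in [0, 1] to (inhibit, gain).

    Assumed linear mapping; the paper's actual adjustment rule is not
    stated in the abstract.
    """
    level = float(np.clip(noise_level, 0.0, 1.0))
    inhibit = 0.8 * level      # stronger inhibition for noisier frames
    gain = 1.0 - 0.5 * level   # amplitude adjustment coefficient
    return inhibit, gain

def enhance_frame(frame, noise_level):
    """Enhance one time-domain frame given an external noise estimate."""
    spec = np.fft.rfft(frame * np.hanning(len(frame)))
    amp, phase = np.abs(spec), np.angle(spec)
    inhibit, gain = coefficients_for_noise(noise_level)
    amp = gain * lateral_inhibition(amp, inhibit)
    # Recombine the modified amplitude with the original (noisy) phase.
    return np.fft.irfft(amp * np.exp(1j * phase), n=len(frame))
```

In this sketch an isolated spectral peak passes through unchanged while a flat (noise-like) spectrum is attenuated by the factor `1 - inhibit`, which is the sharpening behavior generally attributed to lateral inhibition.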
