Estimation method of noise intensity by neural network for application in speech enhancement

  • Choi Jae-Seung (Department of Information and Communication Engineering, Osaka City University, Japan)
  • Published : 2005.05.01

Abstract

When removing noise from noisy speech, it is desirable to adjust the parameters of the speech processing system according to the noise intensity so that good-quality speech is reproduced. This paper proposes a method for estimating the noise intensity in speech using a three-layer neural network trained on speech degraded to three levels by white noise or road (car driving) noise. Experimental results show that the neural network can estimate the noise intensity. Even when the speakers and speech data differ from the training data, the network estimates the intensity of white noise with an average accuracy of $95\%$ or more.
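
The idea in the abstract, a three-layer network that assigns noisy speech to one of three noise-intensity levels, can be illustrated with a small sketch. The abstract does not state the input features, layer sizes, noise levels, or training settings, so everything below is an assumption made for illustration: synthetic harmonic frames stand in for speech, the three levels are taken to be 20, 10 and 0 dB SNR of white noise, the inputs are log band energies, and the network is a tanh hidden layer with a softmax output trained by plain gradient descent. It is not the paper's implementation and does not reproduce its results.

```python
import numpy as np

rng = np.random.default_rng(0)
FRAME = 256                          # samples per analysis frame (assumed)
SNR_LEVELS_DB = [20.0, 10.0, 0.0]    # three assumed noise grades: light, medium, heavy

def make_frame(snr_db):
    """Synthetic harmonic 'speech-like' frame plus white noise at the given SNR."""
    t = np.arange(FRAME)
    f0 = rng.uniform(0.01, 0.03)     # assumed pitch, in cycles per sample
    clean = sum(np.sin(2 * np.pi * f0 * k * t + rng.uniform(0, 2 * np.pi)) / k
                for k in range(1, 5))
    noise = rng.standard_normal(FRAME)
    # Scale the noise so that 10*log10(P_clean / P_noise) equals snr_db.
    gain = np.sqrt(np.mean(clean ** 2) / (np.mean(noise ** 2) * 10 ** (snr_db / 10)))
    return clean + gain * noise

def features(frame, n_bands=8):
    """Log mean power in n_bands roughly equal-width bands of the spectrum."""
    power = np.abs(np.fft.rfft(frame * np.hanning(FRAME))) ** 2
    return np.log(np.array([b.mean() for b in np.array_split(power, n_bands)]) + 1e-12)

def dataset(n_per_level):
    """Feature matrix and integer labels (0, 1, 2) for the three noise grades."""
    X = np.array([features(make_frame(s)) for s in SNR_LEVELS_DB
                  for _ in range(n_per_level)])
    y = np.repeat(np.arange(len(SNR_LEVELS_DB)), n_per_level)
    return X, y

def forward(X, W1, b1, W2, b2):
    """Input layer -> tanh hidden layer -> softmax output layer."""
    H = np.tanh(X @ W1 + b1)
    Z = H @ W2 + b2
    Z -= Z.max(axis=1, keepdims=True)        # for numerical stability
    P = np.exp(Z)
    P /= P.sum(axis=1, keepdims=True)
    return H, P

def train(X, y, hidden=6, lr=0.05, epochs=3000):
    """Full-batch gradient descent on the cross-entropy loss."""
    n, d = X.shape
    k = int(y.max()) + 1
    W1 = rng.standard_normal((d, hidden)) * 0.5
    b1 = np.zeros(hidden)
    W2 = rng.standard_normal((hidden, k)) * 0.5
    b2 = np.zeros(k)
    Y = np.eye(k)[y]                         # one-hot targets
    for _ in range(epochs):
        H, P = forward(X, W1, b1, W2, b2)
        dZ = (P - Y) / n                     # gradient w.r.t. output logits
        dH = (dZ @ W2.T) * (1 - H ** 2)      # backpropagate through tanh
        W2 -= lr * (H.T @ dZ)
        b2 -= lr * dZ.sum(axis=0)
        W1 -= lr * (X.T @ dH)
        b1 -= lr * dH.sum(axis=0)
    return W1, b1, W2, b2

X_train, y_train = dataset(100)
X_test, y_test = dataset(50)                 # fresh frames, not seen in training
mu, sd = X_train.mean(axis=0), X_train.std(axis=0)
params = train((X_train - mu) / sd, y_train)
_, P_test = forward((X_test - mu) / sd, *params)
accuracy = float(np.mean(P_test.argmax(axis=1) == y_test))
print(f"held-out noise-level accuracy: {accuracy:.2f}")
```

In a speech enhancement pipeline of the kind the abstract describes, the predicted level (`P_test.argmax(axis=1)`) would be used to select the parameters of the noise reduction stage; how that selection is done is not covered here.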
