Efficient Compensation of Spectral Tilt for Speech Recognition in Noisy Environment

  • Cho, Jungho / 조정호 (Dept. of Digital Electronics, Dongseoul University)
  • Received : 2016.10.18
  • Accepted : 2017.02.03
  • Published : 2017.02.28

Abstract

Environmental noise can degrade the performance of a speech recognition system. This paper presents a cepstrum-based feature compensation procedure that makes the recognition system robust to noise. The approach directly compensates the spectral tilt to remove the effects of additive noise. The noise compensation scheme operates in the cepstral domain by calculating the spectral tilt of the log power spectrum. Spectral compensation is applied in combination with SNR-dependent cepstral mean compensation. Experimental results in the presence of white Gaussian noise, subway noise, and car noise show that the proposed compensation method achieves substantial improvements in recognition accuracy at various SNRs.

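As a rough illustration of the quantity the abstract refers to, the sketch below estimates spectral tilt as the slope of a least-squares line fitted to the log power spectrum of one frame, and shows how additive white noise flattens that slope. This is only a minimal sketch under stated assumptions: the line-fit definition of tilt, the synthetic low-pass "speech-like" frame, the frame length, the FFT size, and the function names `log_power_spectrum` and `spectral_tilt` are all illustrative choices, not the paper's actual formulation or compensation rule.

```python
import numpy as np

def log_power_spectrum(frame, n_fft=512):
    """Log power spectrum of one Hamming-windowed frame (bins 0..n_fft/2)."""
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed, n_fft)) ** 2
    return np.log(power + 1e-12)  # small floor avoids log(0)

def spectral_tilt(log_spec):
    """Slope of the least-squares line through the log spectrum (per radian).

    Assumption: tilt is defined here by a straight-line fit; the paper may
    define and compute it differently (for example from cepstral coefficients).
    """
    omega = np.linspace(0.0, np.pi, len(log_spec))
    slope, _intercept = np.polyfit(omega, log_spec, 1)
    return slope

rng = np.random.default_rng(0)
n = 400  # hypothetical frame length in samples

# Hypothetical "speech-like" frame: white noise through a one-pole low-pass
# filter, giving a spectrum that falls off with frequency (negative tilt).
excitation = rng.standard_normal(n)
clean = np.empty(n)
prev = 0.0
for i, x in enumerate(excitation):
    prev = x + 0.9 * prev
    clean[i] = prev

# Additive white Gaussian noise scaled to roughly 0 dB SNR.
noise = rng.standard_normal(n) * np.std(clean)
noisy = clean + noise

tilt_clean = spectral_tilt(log_power_spectrum(clean))
tilt_noisy = spectral_tilt(log_power_spectrum(noisy))
print(f"tilt (clean): {tilt_clean:.3f}")
print(f"tilt (noisy): {tilt_noisy:.3f}")
```

In this toy setting the noisy frame has a flatter (less negative) tilt than the clean one, which is the kind of distortion a tilt compensation step would aim to undo before recognition.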