A Study on Pitch Detection using Cochlear Model on Cochannel Speech

청각 모델을 이용한 Cochannel 음성에서의 피치 추출에 관한 연구

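  • 신대규 (Dept. of Electrical and Computer Engineering, Yonsei University)
  • 신중인 (비젼아이트 Co., Ltd.)
  • 이재혁 (School of Information and Communication Engineering, Kyungdong University)
  • 한두진 (Dept. of Electrical and Computer Engineering, Yonsei University)
  • 박상희 (Dept. of Electrical and Computer Engineering, Yonsei University)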
  • Published : 2000.06.01

Abstract

This paper proposes a new pitch estimation method based on the Robinson cochlear model. The method is useful in noisy environments and is especially effective for cochannel speech, in which the voices of two speakers are present at the same time. For single-speaker speech, the pitch can be extracted directly from the neurogram of the Robinson cochlear model. Because the estimation is performed in the time domain, the exact pitch period can be detected even when the pitch period varies. In noisy and cochannel cases, however, the neurogram contains many spurious peaks, so autocorrelators are applied to the neurogram to make the periodicity evident. If the autocorrelators are computed for all delays, a large amount of computation is required. To overcome this drawback, we propose computing the autocorrelators only for the subset of delays on which the energy is concentrated. The proposed algorithm is applied first to single-speaker speech and then to cochannel speech, and the results are compared with those of the autocorrelation pitch detection method.
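
The idea of evaluating autocorrelators over only part of the delay range can be illustrated with a minimal sketch. This is not the authors' algorithm: it works on a raw synthetic waveform rather than a Robinson cochlear model neurogram, and the restricted delay range is fixed by hand instead of being chosen from where the energy is concentrated. The function names, the sampling rate `fs`, the test signal, and the 60 to 400 Hz search range are all assumptions made for illustration.

```python
import math


def autocorr_at_lag(x, lag):
    """Unnormalized autocorrelation of the sequence x at one lag (in samples)."""
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag))


def estimate_pitch_period(x, lag_min, lag_max):
    """Return the lag in [lag_min, lag_max] with the largest autocorrelation.

    Only lags inside the given range are evaluated, so the cost scales with
    the width of the range rather than with the full signal length.
    """
    best_lag, best_val = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        val = autocorr_at_lag(x, lag)
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


# Hypothetical test signal: 100 Hz fundamental plus a weaker second harmonic,
# sampled at an assumed rate of 8 kHz for 0.1 s.
fs = 8000
f0 = 100.0
x = [
    math.sin(2 * math.pi * f0 * n / fs)
    + 0.5 * math.sin(2 * math.pi * 2 * f0 * n / fs)
    for n in range(800)
]

# Assumed search range: pitch between 60 Hz and 400 Hz.
lag_min = fs // 400
lag_max = fs // 60
period = estimate_pitch_period(x, lag_min, lag_max)
pitch_hz = fs / period
lags_evaluated = lag_max - lag_min + 1
```

In this sketch only `lags_evaluated` delays are examined instead of every delay up to the signal length, which is the kind of saving the abstract refers to. A neurogram-based version would apply the same restricted search to each cochlear channel output rather than to the waveform.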
