A Simple Pitch Tracking Algorithm based on the Energy Operator

에너지 연산자에 기초한 간단한 피치 추적 방법

  • Published : 2004.01.01

Abstract

A new method for estimating the pitch-frequency contour of voiced speech is presented. The method is based on a double application of Kaiser's energy operator[1], which can extract the amplitude and frequency of a sinusoidal waveform. According to the modulation model, a vowel can be represented as a combination of damped sinusoids, each representing a formant, modulated by pitch pulses. The amplitude envelope of each component therefore yields a pitch-like waveform, and the pitch can be obtained by averaging the frequencies of this waveform. The first stage is the same as Gopalan's approach[9], but replacing the LPC-based spectral analysis with a second application of the energy operator makes the algorithm very simple and suitable for on-line processing. Although the estimate is rather coarse, the proposed algorithm can be useful for obtaining a general sketch of the pitch contour on-line.

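A new method for estimating the pitch-frequency contour of voiced speech is presented. The method is based on applying the energy operator[1] twice. Kaiser's energy operator can extract the amplitude and frequency information of a sinusoid. According to the modulation model, voiced speech can be regarded as a combination of formants modulated by the pitch signal, so extracting the amplitude envelope of this waveform yields a waveform resembling the pitch signal. The pitch frequency is then obtained by detecting the average frequency of this waveform. The first part is the same as Gopalan's approach[9], but the later steps, such as LPC spectral analysis, are replaced by one more application of the energy operator, giving a greatly simplified algorithm that can be applied on-line. The estimates are rather coarse, but the method is expected to be useful for obtaining a general sketch of the pitch contour on-line.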
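
The two-stage idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`teager_energy`, `mean_frequency_hz`, `coarse_pitch_hz`), the moving-average smoothing, the DESA-2 style frequency formula, and the synthetic amplitude-modulated test tone are all assumptions made for illustration. In particular, the sketch averages the energy-operator outputs over the frame before forming the frequency ratio, rather than averaging per-sample frequency estimates, and it assumes a single formant-like component with a roughly constant carrier frequency, so that the square root of the operator output tracks the envelope up to a scale factor.

```python
import numpy as np


def teager_energy(x):
    """Discrete Teager-Kaiser operator: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    For x[n] = A*cos(W*n + phi) the output is the constant A^2 * sin(W)^2.
    """
    x = np.asarray(x, dtype=float)
    return x[1:-1] ** 2 - x[:-2] * x[2:]


def mean_frequency_hz(u, fs):
    """Frame-level frequency of a roughly sinusoidal, zero-mean signal.

    Uses the DESA-2 relation W = 0.5 * arccos(1 - psi[y] / (2 * psi[u])),
    with y[n] = u[n+1] - u[n-1]. The operator outputs are averaged over
    the frame before the ratio is taken (an assumption of this sketch).
    """
    u = np.asarray(u, dtype=float)
    y = u[2:] - u[:-2]  # symmetric difference
    ratio = teager_energy(y).mean() / (2.0 * teager_energy(u).mean())
    omega = 0.5 * np.arccos(np.clip(1.0 - ratio, -1.0, 1.0))
    return omega * fs / (2.0 * np.pi)


def coarse_pitch_hz(frame, fs, smooth_len=8):
    """Two applications of the energy operator (hypothetical helper).

    Stage 1: the operator output on the band signal gives a pitch-like
             envelope (up to scale, for a near-constant carrier).
    Stage 2: the operator is applied again to estimate the envelope's
             frequency, taken here as the pitch estimate.
    """
    psi = teager_energy(frame)
    envelope = np.sqrt(np.maximum(psi, 0.0))
    # Moving-average smoothing (assumed) suppresses carrier-rate ripple,
    # to which the second operator pass is very sensitive.
    kernel = np.ones(smooth_len) / smooth_len
    envelope = np.convolve(envelope, kernel, mode="valid")
    envelope = envelope - envelope.mean()  # remove the DC offset
    return mean_frequency_hz(envelope, fs)


# Synthetic check: a 1 kHz "formant" carrier modulated at 120 Hz.
fs = 8000
n = np.arange(int(0.5 * fs))
f0_true, fc = 120.0, 1000.0
frame = (1.0 + 0.5 * np.cos(2 * np.pi * f0_true * n / fs)) * np.cos(
    2 * np.pi * fc * n / fs
)
f0_est = coarse_pitch_hz(frame, fs)
print(f"estimated pitch: {f0_est:.1f} Hz")
```

For real speech, a band-pass filter around one formant would normally precede the first stage, and the estimate would be computed frame by frame to trace a contour; both steps are omitted here for brevity.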

References

  1. Proc. IEEE ICASSP 90 On a simple algorithm to calculate the 'energy' of a signal J.F.Kaiser
  2. Proc. IEEE ICASSP 92 On separating amplitude from frequency modulations using energy operators P.Maragos;J.F.Kaiser;T.F.Quatieri
  3. IEEE Trans. on Signal Processing v.42 no.2 Conditions for positivity of an energy operator A.C.Bovik;P.Maragos
  4. Proc. IEEE ICASSP 01 An improved demodulation algorithm using splines D.Dimitriadis;P.Maragos
  5. IEEE Trans. on Signal Processing v.41 no.10 Energy separation in signal modulations with application to speech analysis P.Maragos;J.F.Kaiser;T.F.Quatieri
  6. Proc. IEEE ICASSP 91 Speech nonlinearity, modulation and energy operators P.Maragos;J.F.Kaiser;T.F.Quatieri
  7. IEEE Signal Processing Letters v.1 no.11 Energy onset times for speaker identification T.F.Quatieri;C.R.Jankowski Jr.;D.A.Reynolds
  8. Record of 28th Asilomar Conference on Signals, Systems and Computers An investigation of estimating pitch period using a non-linear differential operator R.K.Whitman;D.M.Etter
  9. Proc. WCCC-ICSP2000 Pitch estimation using a modulation model of speech K.Gopalan
  10. Proc. IEEE ICASSP 92 Application of the modulation model to speech recognition A.B.Finberg;R.J.Mammone;J.L.Flanagan
  11. Speech Processing and Synthesis Toolboxes D.G.Childers
  12. IEEE Trans. on Signal Processing v.41 AM-FM energy detection and separation in noise using multiband energy operators A.C.Bovik;P.Maragos;T.F.Quatieri
  13. Proc. IEEE ICASSP 84 v.9 A performance comparison of pitch extraction algorithms for noisy speech K.A.Oh;C.K.Un
  14. IEEE Trans. on Acoustics, Speech, and Signal Processing v.ASSP-26 no.4 Real-time harmonic pitch detector S.Seneff
  15. Proc. IEEE ICASSP 90 v.1 Pitch estimation and voicing detection based on a sinusoidal speech model R.J.McAulay;T.F.Quatieri