Pitch Estimation Method in an Integrated Time and Frequency Domain by Applying Linear Interpolation  

Kim, Ki-Chul (Dept. of Information and Communication Engineering, Sejong University)
Park, Sung-Joo (Korea Electronics Technology Institute)
Lee, Seok-Pil (Korea Electronics Technology Institute)
Kim, Moo-Young (Dept. of Information and Communication Engineering, Sejong University)
Abstract
The autocorrelation method is commonly used for pitch estimation. Autocorrelation values computed in the time domain and in the frequency domain have different characteristics and correspond to the pitch period and the fundamental frequency, respectively. We use an integrated autocorrelation method that combines both domains, which removes pitch doubling and pitch halving errors. The pitch period and the fundamental frequency are reciprocals of each other. In particular, fundamental frequency estimation is prone to error because of the limited frequency resolution of the FFT. To reduce these errors, interpolation is applied in the integrated autocorrelation domain, which decreases pitch errors. Moreover, the frequency-domain autocorrelation values are calculated only for the pitch candidates found in the time domain, which reduces computational complexity. With linear interpolation, the required number of FFT coefficients decreases by a factor of 8. As a result, computational complexity is reduced by a factor of 9.5 compared with conventional methods.
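The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the Hann window, the product rule for combining the two autocorrelations, the candidate count, and the search range of 80 to 400 Hz are all assumptions made for illustration. Time-domain autocorrelation peaks supply a few candidate lags; for each candidate only, the magnitude spectrum is correlated with itself at the corresponding fractional bin shift N/lag, and linear interpolation between neighbouring bins stands in for a longer FFT.

```python
import cmath
import math


def time_autocorr(x, lag):
    """Normalized time-domain autocorrelation of frame x at an integer lag."""
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return 0.0
    return sum(x[n] * x[n + lag] for n in range(len(x) - lag)) / energy


def magnitude_spectrum(x):
    """Hann-windowed magnitude spectrum for bins 0..N/2.

    A direct O(N^2) DFT keeps the sketch dependency-free; a real system
    would use an FFT. The Hann window is an assumption of this sketch.
    """
    n = len(x)
    w = [0.5 - 0.5 * math.cos(2.0 * math.pi * t / (n - 1)) for t in range(n)]
    return [
        abs(sum(w[t] * x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def lerp(values, pos):
    """Linearly interpolate a list at fractional index pos (0 <= pos <= len - 1)."""
    i = min(int(pos), len(values) - 2)
    frac = pos - i
    return (1.0 - frac) * values[i] + frac * values[i + 1]


def freq_autocorr(mag, shift):
    """Normalized autocorrelation of the magnitude spectrum at a fractional bin shift."""
    energy = sum(m * m for m in mag)
    if energy == 0.0:
        return 0.0
    last = len(mag) - 1
    total = 0.0
    for k, m in enumerate(mag):
        if k + shift > last:
            break
        total += m * lerp(mag, k + shift)
    return total / energy


def estimate_pitch(x, fs, f_min=80.0, f_max=400.0, n_candidates=3):
    """Return a pitch estimate in Hz; assumes fs / f_min is shorter than the frame."""
    lag_min, lag_max = int(fs / f_max), int(fs / f_min)
    r_t = {lag: time_autocorr(x, lag) for lag in range(lag_min, lag_max + 1)}
    # Candidate lags are local maxima of the time-domain autocorrelation.
    peaks = [lag for lag in range(lag_min + 1, lag_max)
             if r_t[lag] > r_t[lag - 1] and r_t[lag] >= r_t[lag + 1]]
    if not peaks:
        peaks = [max(r_t, key=r_t.get)]
    candidates = sorted(peaks, key=r_t.get, reverse=True)[:n_candidates]
    mag = magnitude_spectrum(x)
    n = len(x)
    # A lag of `lag` samples corresponds to fs / lag Hz, i.e. n / lag bins.
    best = max(candidates, key=lambda lag: r_t[lag] * freq_autocorr(mag, n / lag))
    return fs / best


# Synthetic test frame: five harmonics of 200 Hz sampled at 8 kHz (hypothetical input).
fs = 8000
frame = [sum(math.sin(2.0 * math.pi * h * 200.0 * n / fs) / h for h in range(1, 6))
         for n in range(256)]
f0 = estimate_pitch(frame, fs)
print(f0)
```

In this sketch the time-domain autocorrelation also peaks at twice the true period, which alone could cause a halving error, but the spectrum correlated with itself at half the harmonic spacing scores low, so the product favours the true period. Multiplying the two scores is only one possible way to integrate the domains.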
Keywords
Pitch; Fundamental Frequency; Autocorrelation; Speech Signal Processing