Experiments on Extraction of Non-Parametric Warping Functions for Speaker Normalization  

Shin, Ok-Keun (Division of IT Engineering, Korea Maritime University)
Abstract
In this paper, experiments are conducted to extract a set of non-parametric warping functions in order to examine the characteristics of the warping among speakers' utterances. For this purpose, we made use of MFCC and LP spectra of vowels to choose a reference spectrum for each vowel as well as representative spectra for each speaker. These spectra are compared by DTW to give the warping functions of each speaker. The set of warping functions is then defined by clustering the warping functions of all the speakers. Noting that the male and female warping functions have shapes similar to a piecewise linear function and a power function, respectively, a new hybrid set of warping functions is defined. The effectiveness of the extracted warping functions is evaluated by conducting phone-level recognition experiments, and improvements in accuracy are observed for both sets of warping functions.
Keywords
Speech recognition; Speaker Normalization; Non-parametric; Hybrid; Warping function;
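
The abstract describes aligning a speaker's representative spectrum to a reference spectrum by DTW and notes that the resulting warpings resemble piecewise linear and power functions. The following is a minimal sketch of those two ideas, not the paper's implementation. The function names, the absolute-difference local cost, the breakpoint `f_break = 0.85`, and the parameters `alpha` and `beta` are all assumptions chosen for illustration; frequencies are assumed to be normalized to the range [0, 1].

```python
def dtw_warping_path(ref, spec):
    """Align two 1-D spectra with textbook DTW.

    Returns a list of (ref_index, spec_index) pairs from (0, 0) to the
    last bin of each spectrum. The local cost (absolute difference) is
    an assumption, not taken from the paper.
    """
    n, m = len(ref), len(spec)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning ref[:i] with spec[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(ref[i - 1] - spec[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1],
                                 cost[i - 1][j],
                                 cost[i][j - 1])
    # Backtrack from the end to recover the warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        candidates = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(candidates, key=lambda p: cost[p[0]][p[1]])
        path.append((i - 1, j - 1))
    return path[::-1]


def piecewise_linear_warp(f, alpha, f_break=0.85):
    """Piecewise linear warp of a normalized frequency f in [0, 1].

    Slope alpha below the (hypothetical) breakpoint, then a straight
    segment that ends at (1, 1) so the bandwidth is preserved.
    """
    if f <= f_break:
        return alpha * f
    slope = (1.0 - alpha * f_break) / (1.0 - f_break)
    return alpha * f_break + slope * (f - f_break)


def power_warp(f, beta):
    """Power-function warp of a normalized frequency f in [0, 1]."""
    return f ** beta
```

In this sketch, the path returned by `dtw_warping_path` plays the role of one speaker's non-parametric warping function: each pair maps a frequency bin of the reference to a bin of the speaker's spectrum. Clustering such paths across speakers, as the abstract describes, would be a separate step that is not shown here.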