Formant-broadened CMS Using the Log-spectrum Transformed from the Cepstrum  

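Yu-Jin Kim (Digital Signal Processing Laboratory, Department of Electronic Engineering, Inha University)
Hye-Kyung Jung (Team 2, Electronic Warfare Systems Department, 4th Systems Development Center, Agency for Defense Development; Digital Signal Processing Laboratory, Department of Electronic Engineering, Inha University)
Jae-Ho Chung (Digital Signal Processing Laboratory, Department of Electronic Engineering, Inha University)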
Abstract
In this paper, we propose a channel normalization method that improves on cepstral mean subtraction (CMS), which is widely used to normalize channel variation in speech and speaker recognition. CMS estimates the channel effect by averaging the long-term cepstrum, but it has a weakness: the estimated channel is biased by the formants of voiced speech, which carry useful speech information. The proposed formant-broadened cepstral mean subtraction (FBCMS) rests on two facts: formants are easy to locate in the log spectrum obtained from the cepstrum by a Fourier transform, and they correspond to the dominant poles of the all-pole model commonly used to represent the vocal tract. FBCMS identifies only the poles to be broadened, directly from the log spectrum and without polynomial factorization, and produces a formant-broadened cepstrum by widening the bandwidths of those formant poles. Averaging the formant-broadened cepstral coefficients then gives an effective estimate of the channel cepstrum. We compared FBCMS with CMS and pole-filtered CMS (PFCMS) on four simulated telephone channels. In the channel estimation experiment, we measured the cepstral distance between the actual channel and the estimated channel and found that the mean cepstrum moved closer to the channel cepstrum because its bias toward speech was reduced. In the text-independent speaker identification experiment, the proposed method outperformed conventional CMS and was comparable to PFCMS. These results show that the proposed method normalizes channel variation efficiently while building on conventional CMS.
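The CMS baseline and the bias described in the abstract can be illustrated with a minimal sketch. It covers only conventional CMS, not the authors' FBCMS procedure, whose details are not given here. The function name `cepstral_mean_subtraction`, the synthetic data, and the 12-coefficient frame layout are assumptions made for illustration.

```python
import numpy as np


def cepstral_mean_subtraction(cepstra):
    """Conventional CMS: subtract the long-term mean cepstrum from every frame.

    cepstra: array of shape (num_frames, num_coeffs), one cepstral vector per frame.
    Returns (normalized cepstra, estimated channel cepstrum).
    """
    cepstra = np.asarray(cepstra, dtype=float)
    channel_estimate = cepstra.mean(axis=0)  # long-term average over frames
    return cepstra - channel_estimate, channel_estimate


# Synthetic data (assumption): random "speech" cepstra plus a fixed channel offset.
# A time-invariant linear channel is additive in the cepstral domain.
rng = np.random.default_rng(0)
speech = rng.normal(loc=0.3, scale=1.0, size=(200, 12))
channel = np.linspace(0.5, -0.5, 12)

clean_norm, _ = cepstral_mean_subtraction(speech)
observed_norm, channel_estimate = cepstral_mean_subtraction(speech + channel)

# The channel estimate is off by exactly the mean speech cepstrum.
bias = channel_estimate - channel
```

In this sketch the normalized features are identical with or without the channel, so the fixed offset cancels. The channel estimate itself, however, differs from the true channel by the mean speech cepstrum, which is the bias toward speech that the abstract says FBCMS is designed to reduce.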
Keywords
Speaker recognition; Speech recognition; Channel noise removal; Channel mismatch normalization; Cepstral normalization; CMS