Feature Extraction by Optimizing the Cepstral Resolution of Frequency Sub-bands  

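지상문 (School of Information Science, Kyungsung University)
조훈영 (Division of Computer Science, Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology)
오영환 (Division of Computer Science, Department of Electrical Engineering and Computer Science, Korea Advanced Institute of Science and Technology)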
Abstract
Feature vectors for conventional speech recognition are usually extracted from the full frequency band, so each sub-band contributes equally to the final recognition result. In this paper, feature vectors are extracted independently in each sub-band, and the cepstral resolution of each sub-band feature is controlled to optimize speech recognition. To this end, sub-band cepstral vectors of different dimensions are extracted following the multi-band approach, which extracts a feature vector independently for each sub-band. Speech recognition rates and clustering quality are proposed as the criteria for finding the optimal combination of sub-band vector dimensions. In connected-digit recognition experiments on the TIDIGITS database, the proposed method achieved a string accuracy of 99.125%, a percent correct of 99.775%, and a percent accuracy of 99.705%, which correspond to relative error rate reductions of 38%, 32%, and 37%, respectively, compared with the baseline full-band feature vector.
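The idea of giving each sub-band its own cepstral resolution can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not taken from the paper: the input is taken to be log filter-bank energies, the cepstrum is computed with an unnormalized DCT-II, and the names `subband_cepstra`, `band_edges`, and `dims`, the 24-channel filter bank, the two-band split, and the dimensions (7, 5) are all hypothetical.

```python
import numpy as np


def dct2_matrix(n_out, n_in):
    """First n_out rows of an unnormalized DCT-II basis over n_in inputs."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    return np.cos(np.pi * k * (n + 0.5) / n_in)


def subband_cepstra(log_energies, band_edges, dims):
    """Concatenate per-band cepstra, each truncated to its own dimension.

    log_energies: array of shape (frames, channels), log filter-bank energies.
    band_edges:   (start, stop) channel index pairs, one per sub-band
                  (hypothetical split, not the paper's).
    dims:         number of cepstral coefficients kept in each sub-band.
    """
    parts = []
    for (start, stop), dim in zip(band_edges, dims):
        band = log_energies[:, start:stop]
        n_channels = band.shape[1]
        if dim > n_channels:
            raise ValueError("dimension exceeds the channels in the sub-band")
        # Each sub-band is transformed independently of the others.
        parts.append(band @ dct2_matrix(dim, n_channels).T)
    return np.concatenate(parts, axis=1)


# Illustrative input: 100 frames of 24-channel log energies (random data).
rng = np.random.default_rng(0)
log_energies = rng.normal(size=(100, 24))

# Assumed configuration: more cepstral detail in the lower band.
features = subband_cepstra(log_energies, [(0, 12), (12, 24)], [7, 5])
print(features.shape)
```

Searching for the best combination would then amount to repeating this extraction over candidate values of `dims` and scoring each candidate with a criterion such as recognition rate or clustering quality, as the abstract describes.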
Keywords
Cepstral resolution; Sub-band cepstral vector; Multi-band approach; Optimal combination of sub-band vector dimensions