http://dx.doi.org/10.13067/JKIECS.2011.6.1.034

Classification of Consonants by SOM and LVQ  

Lee, Chai-Bong (Department of Electronic Engineering, Dongseo University)
Lee, Chang-Young (Department of Systems Management Engineering, Dongseo University)
Publication Information
The Journal of the Korea Institute of Electronic Communication Sciences / v.6, no.1, 2011, pp. 34-42
Abstract
As a step toward the practical realization of a phonetic typewriter, this paper concentrates on the classification of consonants. Many consonants do not show periodic behavior in the time domain, so the validity of Fourier analysis for them is questionable. Vector quantization (VQ) via LBG clustering is therefore performed first, to check whether feature vectors based on mel-frequency cepstral coefficients (MFCC) and linear prediction cepstral coefficients (LPCC) are meaningful for consonants at all. The VQ experiments did not yield a clear-cut conclusion as to the validity of Fourier analysis for consonants. For classification, two kinds of neural networks are employed: the self-organizing map (SOM) and learning vector quantization (LVQ). Results from SOM revealed that some pairs of phonemes are not resolved. Although LVQ is inherently free from this difficulty, its classification accuracy was found to be low. This suggests that, as far as consonant classification by LVQ is concerned, feature vectors of types other than MFCC should be used alongside it. However, the MFCC/LVQ combination was not found to be inferior to phoneme classification by a language-model-based approach. In all of our experiments, LPCC performed worse than MFCC.
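To make the LVQ step concrete, the following is a minimal sketch of the basic LVQ1 update rule in pure Python. It is not the authors' implementation: the abstract does not say which LVQ variant was used, and the two-dimensional toy vectors, the class labels "s" and "k", the initial prototypes, and the learning-rate schedule are all hypothetical stand-ins for real MFCC feature vectors and consonant classes.

```python
import random

def nearest(protos, x):
    """Index of the prototype closest to x (squared Euclidean distance)."""
    return min(
        range(len(protos)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(protos[i][0], x)),
    )

def train_lvq1(samples, labels, prototypes, lr=0.1, epochs=20, seed=0):
    """LVQ1: move the winning prototype toward a sample of its own class,
    and away from a sample of a different class."""
    rng = random.Random(seed)
    protos = [(list(w), c) for w, c in prototypes]  # copy, keep class labels
    order = list(range(len(samples)))
    for epoch in range(epochs):
        rate = lr * (1.0 - epoch / epochs)  # linearly decaying rate (assumption)
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            w, c = protos[nearest(protos, x)]
            sign = 1.0 if c == y else -1.0
            for d in range(len(w)):
                w[d] += sign * rate * (x[d] - w[d])
    return protos

def classify(protos, x):
    """Label of the prototype nearest to x."""
    return protos[nearest(protos, x)][1]

# Hypothetical toy data: two well-separated clusters standing in for two consonants.
samples = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2], [1.0, 0.9], [0.9, 1.1], [1.1, 1.0]]
labels = ["s", "s", "s", "k", "k", "k"]
initial = [([0.3, 0.3], "s"), ([0.7, 0.7], "k")]

trained = train_lvq1(samples, labels, initial)
print(classify(trained, [0.05, 0.05]), classify(trained, [1.0, 1.0]))
```

Because every prototype carries a class label, every input is assigned to exactly one class, which is why LVQ does not leave phoneme pairs unresolved in the way an unlabeled SOM can; whether that assignment is accurate depends on how well the feature vectors separate the classes.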
Keywords
Speech Recognition; Phonetic Typewriter; Self-Organizing Map (SOM); Learning Vector Quantization (LVQ)