Speech Recognition by Neural Net Pattern Recognition Equations with Self-organization

  • Kim, Sung-Ill (Division of Electrical and Electronic Engineering, Kyungnam University)
  • Chung, Hyun-Yeol (School of Electrical Engineering and Computer Science, Yeungnam University)
  • Published : 2003.06.01

Abstract

This study applies modified neural net pattern recognition equations to speech recognition. The proposed method uses a dynamic process of self-organization that has previously proved successful in recognizing depth in stereoscopic vision, and this study shows that the same process is also useful for recognizing human speech. Input speech signals are first compared with standard models to measure similarities, which are then passed to the self-organization process of the neural net equations. Competitive and cooperative processes operate among neighboring input similarities, so that only one winner neuron is finally detected. In a comparative study, the proposed neural networks outperformed a conventional HMM speech recognizer under the same conditions.
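
The pipeline outlined in the abstract (similarities in, a single winner neuron out) can be illustrated with a small sketch. The abstract does not give the paper's equations, so the code below does not reproduce them. It substitutes a generic MAXNET-style lateral-inhibition iteration for the competitive step and simple averaging over neighboring frames for the cooperative step. The function names `winner_take_all` and `recognize`, the `inhibition` parameter, and the similarity values are all hypothetical and chosen only for illustration.

```python
def winner_take_all(similarities, inhibition=None, max_steps=1000):
    """MAXNET-style competition (an assumed stand-in, not the paper's equations).

    Each unit is suppressed in proportion to the summed activity of its rivals
    until at most one unit remains active.
    """
    n = len(similarities)
    if inhibition is None:
        # Assumption: the usual MAXNET choice keeps the inhibition below 1/n.
        inhibition = 1.0 / (2 * n)
    activity = [max(0.0, s) for s in similarities]
    for _ in range(max_steps):
        if sum(1 for a in activity if a > 0.0) <= 1:
            break  # a single winner (or nothing) is left
        total = sum(activity)
        # Subtract the rivals' summed activity, clipping at zero.
        activity = [max(0.0, a - inhibition * (total - a)) for a in activity]
    return activity


def recognize(frame_similarities, labels):
    """Pick one label from per-frame similarities to each standard model.

    frame_similarities[t][i] is the similarity of frame t to model i.
    """
    n_frames = len(frame_similarities)
    # Cooperation (assumed form): neighboring frames support the same model,
    # modeled here as a plain average over frames.
    pooled = [
        sum(frame[i] for frame in frame_similarities) / n_frames
        for i in range(len(labels))
    ]
    activity = winner_take_all(pooled)
    winner = max(range(len(labels)), key=lambda i: activity[i])
    return labels[winner], activity


# Hypothetical similarities of three frames to three word models.
labels = ["one", "two", "three"]
frames = [[0.2, 0.7, 0.4], [0.3, 0.8, 0.5], [0.1, 0.9, 0.3]]
label, activity = recognize(frames, labels)
print(label)  # two
```

In this sketch exact ties are not resolved: tied units decay together until `max_steps` is reached, so a practical system would need an explicit tie-breaking rule.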
