Speech Recognition of Multi-Syllable Words Using Soft Computing Techniques

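  • Jongsoo Lee (School of Mechanical Engineering, Yonsei University)
  • Jiwon Yoon (Department of Mechanical Engineering, Graduate School, Yonsei University)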
  • Received : 2010.03.12
  • Accepted : 2010.03.23
  • Published : 2010.03.25

Abstract

Speech recognition performance depends largely on uncertain factors such as the speaker's condition and environmental effects. The present study addresses the recognition of a number of multi-syllable isolated Korean words using soft computing techniques: a back-propagation neural network, a fuzzy inference system, and a fuzzy neural network. The feature patterns for recognition consist of thirty frames of 12th-order coefficients, obtained and normalized through linear predictive coding and cepstral analysis. Experiments are conducted with four recognizer models for both single-speaker and multiple-speaker cases. The results show that the recognizer combining fuzzy logic with a back-propagation neural network and the fuzzy neural network recognizer achieve the better recognition performance.
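
The feature pattern described above (thirty frames, each with 12th-order coefficients from linear predictive coding and the cepstrum) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the equal-length non-overlapping framing, and the Hamming window are assumptions made here for illustration. The autocorrelation method, the Levinson-Durbin recursion, and the LPC-to-cepstrum recursion are the standard textbook forms.

```python
import math

def split_into_frames(signal, n_frames=30):
    """Split a signal into n_frames equal, non-overlapping frames (assumed scheme)."""
    frame_len = len(signal) // n_frames
    return [signal[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]

def autocorrelation(frame, max_lag):
    """Autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Predictor coefficients a[1..order] with s[n] ~ sum_k a[k] * s[n-k]."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err

def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] from predictor coefficients a[1..p]."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)
    for m in range(1, n_ceps + 1):
        acc = a[m - 1] if m <= p else 0.0
        for k in range(1, m):
            if m - k <= p:
                acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]

def word_features(signal, n_frames=30, order=12):
    """Return an n_frames x order feature pattern for one isolated word."""
    features = []
    for frame in split_into_frames(signal, n_frames):
        n = len(frame)
        # Hamming window (an assumption; the abstract does not name a window)
        windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                    for i, x in enumerate(frame)]
        r = autocorrelation(windowed, order)
        a, _ = levinson_durbin(r, order)
        features.append(lpc_to_cepstrum(a, order))
    return features

# Synthetic stand-in for a recorded word: 3000 samples give 30 frames of 100 samples.
signal = [math.sin(0.3 * n) + 0.5 * math.sin(1.1 * n) for n in range(3000)]
pattern = word_features(signal)
```

Under these assumptions, `pattern` holds 30 rows of 12 values, that is, 360 numbers per word, which could then be flattened into the input vector of a recognizer.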
