Speaker-dependent Speech Recognition Algorithm for Male and Female Classification

  • Received : 2012.08.28
  • Accepted : 2012.10.08
  • Published : 2013.04.30

Abstract

This paper proposes a speaker-dependent speech recognition algorithm that uses a neural network to classify speaker gender (male or female) in white-noise and car-noise environments. The network is trained on LPC (Linear Predictive Coding) cepstrum coefficients to recognize male and female speakers. The experiments report recognition results for a total of six neural networks under white noise and car noise. The maximum recognition rate is 96% or higher for white noise and 88% or higher for car noise. Finally, a comparison with a conventional speech recognition algorithm in background-noise environments confirms experimentally that the proposed algorithm is effective.
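The abstract names LPC cepstrum coefficients as the network input but gives no extraction details. As an illustration only, the following minimal sketch computes such coefficients for a single frame with the standard autocorrelation method (Levinson-Durbin recursion) followed by the usual LPC-to-cepstrum recursion. The function names, the order of 12, the 8 kHz sampling rate, the 240-sample Hamming-windowed frame, and the synthetic test signal are all assumptions made for this sketch, not values taken from the paper, and the neural-network classifier itself is not shown.

```python
import numpy as np


def lpc_autocorr(frame, order):
    """LPC by the autocorrelation method (Levinson-Durbin recursion).

    Returns a[0..order] with a[0] = 1 for A(z) = 1 + sum_k a[k] z^-k.
    Assumes the frame is not all zeros (otherwise r[0] = 0).
    """
    n = len(frame)
    # Autocorrelation lags 0..order
    r = np.array([np.dot(frame[:n - k], frame[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]  # prediction error energy
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err  # reflection coefficient
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= 1.0 - k * k
    return a


def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c[1..n_ceps] of the all-pole model 1/A(z); gain term c[0] omitted."""
    p = len(a) - 1
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = sum(k * c[k] * a[n - k] for k in range(max(1, n - p), n))
        c[n] = -(a[n] if n <= p else 0.0) - acc / n
    return c[1:]


# Hypothetical usage on a synthetic frame (all parameters are assumptions).
fs = 8000                          # assumed sampling rate in Hz
t = np.arange(240) / fs            # assumed 30 ms frame
rng = np.random.default_rng(0)
signal = (np.sin(2 * np.pi * 120 * t)
          + 0.5 * np.sin(2 * np.pi * 240 * t)
          + 0.01 * rng.standard_normal(t.size))
frame = signal * np.hamming(t.size)
features = lpc_to_cepstrum(lpc_autocorr(frame, 12), 12)
```

In a setup like the one the abstract describes, a vector such as `features` (one per frame) would be the kind of input a neural network could be trained on to separate male from female speakers.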
