Speaker Recognition Using LPC Cepstrum Coefficients and Neural Network

  • Received : 2011.07.14
  • Reviewed : 2011.08.26
  • Published : 2011.12.31

Abstract

This paper proposes a speaker recognition algorithm that uses a perceptron neural network and linear predictive coding (LPC) cepstrum coefficients. The proposed algorithm first detects the voiced sections of the input speech signal, frame by frame. For each detected voiced section, linear predictive analysis yields the LPC cepstrum coefficients, which carry the speaker's characteristics. To classify these coefficients, they are used as the input to the perceptron neural network, which is trained on them. The experiments confirm the effectiveness of the proposed algorithm through the speaker recognition rates obtained with the LPC cepstrum coefficients and the neural network.
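The feature-extraction step in the abstract can be illustrated with a minimal sketch. It assumes the standard autocorrelation method with the Levinson-Durbin recursion and the usual LPC-to-cepstrum recursion; the abstract does not say which variant the authors used. The function names, the Hamming window, and the order of 12 are illustrative assumptions, and voiced-section detection and perceptron training are omitted.

```python
import math

def hamming(n):
    """Hamming window of length n (window choice is an assumption)."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1)) for t in range(n)]

def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Return predictor coefficients a_1..a_order for
    x[n] ~ sum_k a_k * x[n-k], given autocorrelation r[0..order]."""
    a = []
    err = r[0]
    for i in range(1, order + 1):
        if err <= 1e-12:
            # Silent or degenerate frame: pad with zeros and stop.
            a = a + [0.0] * (order - len(a))
            break
        acc = r[i] - sum(a[j] * r[i - 1 - j] for j in range(i - 1))
        k = acc / err  # reflection coefficient
        a = [a[j] - k * a[i - 2 - j] for j in range(i - 1)] + [k]
        err *= (1.0 - k * k)
    return a

def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c_1..c_n_ceps of the all-pole model 1 / (1 - sum_k a_k z^-k)."""
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c

def frame_to_features(frame, order=12, n_ceps=12):
    """Window one voiced frame and return its LPC cepstrum vector.
    order=12 and n_ceps=12 are assumed values, not taken from the paper."""
    w = hamming(len(frame))
    windowed = [x * wt for x, wt in zip(frame, w)]
    r = autocorrelation(windowed, order)
    return lpc_to_cepstrum(levinson_durbin(r, order), n_ceps)
```

Each voiced frame then yields one fixed-length vector, which is the kind of input the abstract describes feeding to the perceptron network. A quick sanity check on the recursion: a single-pole model with coefficient a has cepstrum c_n = a^n / n.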
