The Speaker Identification Using Incremental Learning

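  • 심귀보 (School of Electrical and Electronics Engineering, Chung-Ang University);
  • 허광승 (School of Electrical and Electronics Engineering, Chung-Ang University);
  • 박창현 (School of Electrical and Electronics Engineering, Chung-Ang University);
  • 이동욱 (Institute of Information and Communication, Chung-Ang University)
  • Published: 2003.10.01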

Abstract

Speech carries characteristics of the speaker. This paper proposes a neural-network-based speaker identification system that uses incremental learning, so that the number of speakers is not limited. The speech signal, recorded on a computer through a microphone, passes through end detection, which separates voiced from unvoiced segments, and 12th-order cepstral coefficients are then extracted using linear predictive coding (LPC). These coefficients serve as the training input for speaker identification. Incremental learning retains the weights that have already been learned and trains only the new weights created when a new speaker is added, using only the new data. Because the neural network architecture grows with the number of speakers, the system can keep learning without a limit on the number of speakers.
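The incremental scheme described in the abstract can be illustrated with a minimal sketch. It is not the authors' network: it assumes a single layer with one sigmoid output unit per speaker over a 12-dimensional feature vector, and it uses frames from other speakers as negative examples, which may differ from the paper's training scheme. The class name `IncrementalSpeakerNet`, its methods, and the synthetic data are all hypothetical.

```python
import numpy as np


class IncrementalSpeakerNet:
    """Hypothetical sketch: one sigmoid output unit per enrolled speaker."""

    def __init__(self, n_features=12):
        self.n_features = n_features
        # One weight row (features + bias) per speaker; starts with no speakers.
        self.W = np.zeros((0, n_features + 1))

    @staticmethod
    def _with_bias(X):
        # Append a constant 1 column so the bias is learned as a weight.
        return np.hstack([X, np.ones((X.shape[0], 1))])

    def add_speaker(self, X_new, X_others, epochs=200, lr=0.1, seed=0):
        """Append one output unit and train only its weights.

        X_new: frames of the new speaker (target 1).
        X_others: frames of other speakers (target 0); an assumption of
        this sketch, not something stated in the abstract.
        Existing rows of self.W are never modified.
        """
        rng = np.random.default_rng(seed)
        w = rng.normal(scale=0.01, size=self.n_features + 1)
        X = self._with_bias(np.vstack([X_new, X_others]))
        y = np.concatenate([np.ones(len(X_new)), np.zeros(len(X_others))])
        for _ in range(epochs):
            p = 1.0 / (1.0 + np.exp(-(X @ w)))
            w -= lr * (X.T @ (p - y)) / len(y)  # logistic-loss gradient step
        self.W = np.vstack([self.W, w])  # the network grows by one unit
        return self.W.shape[0] - 1       # index of the new speaker

    def identify(self, X):
        # Average each unit's score over all frames, pick the largest.
        scores = self._with_bias(X) @ self.W.T
        return int(np.argmax(scores.mean(axis=0)))


# Synthetic stand-ins for 12th-order cepstral frames of two speakers.
rng = np.random.default_rng(1)
a = rng.normal(loc=+1.0, size=(50, 12))
b = rng.normal(loc=-1.0, size=(50, 12))

net = IncrementalSpeakerNet()
net.add_speaker(a, b)      # enroll speaker 0
before = net.W.copy()      # snapshot of the learned weights
net.add_speaker(b, a)      # enroll speaker 1; row 0 stays frozen
```

Comparing `net.W[:1]` with `before` after the second enrollment shows that the earlier weights are untouched, while `net.W` has gained one row for the new speaker.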
