A Method on the Learning Speed Improvement of the Online Error Backpropagation Algorithm in Speech Processing


  • 이태승 (Department of Avionics Engineering, Korea Aerospace University);
  • 이백영 (Department of Avionics Engineering, Korea Aerospace University);
  • 황병원 (Department of Avionics Engineering, Korea Aerospace University)
  • Published: 2002.07.01

Abstract

The multilayer perceptron (MLP) offers several advantages over other pattern recognition techniques and has therefore been widely used in speech recognition and speaker recognition. However, the error backpropagation (EBP) algorithm commonly used to train an MLP suffers from long learning times, and this severely restricts applications such as speaker recognition and speaker adaptation that require real-time processing. Because the training data used in pattern recognition contain a high degree of redundancy, online learning methods, which update the MLP's weights pattern by pattern, are very effective in increasing learning speed. A typical online EBP algorithm applies a fixed learning rate to every weight update. Although a considerable speedup can be obtained by choosing an appropriate fixed rate, fixing the rate prevents the algorithm from responding effectively as learning progresses through different phases and the set of patterns that actually contribute to learning shrinks. To solve this problem, this paper proposes the Changing rate and Omitting patterns in Instant Learning (COIL) method, which applies a learning rate that varies with each pattern's contribution and includes in training only those patterns that still contribute to learning. Speaker verification and speech recognition experiments are conducted, and the results are presented to verify the performance of COIL.
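To make the idea concrete, the following Python sketch shows a one-hidden-layer MLP trained with online (pattern-by-pattern) EBP, together with two hypothetical COIL-style modifications: scaling the learning rate by each pattern's squared output error and skipping patterns whose error falls below a threshold. The function name, the error-based scaling rule, and the threshold value are illustrative assumptions for this sketch, not the exact procedure described in the paper.

```python
# Minimal sketch (not the authors' implementation) of online error
# backpropagation for a one-hidden-layer MLP. Two COIL-style ideas are
# illustrated: the per-pattern learning rate is scaled by that pattern's
# squared output error, and patterns whose error falls below a threshold
# are omitted from further updates. Scaling rule and threshold are
# illustrative assumptions.
import numpy as np

def train_online(X, T, n_hidden=16, base_rate=0.1, skip_threshold=1e-3, epochs=10):
    rng = np.random.default_rng(0)
    n_in, n_out = X.shape[1], T.shape[1]
    W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
    W2 = rng.normal(scale=0.1, size=(n_hidden, n_out))

    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

    for _ in range(epochs):
        for x, t in zip(X, T):              # pattern-by-pattern (online) updates
            h = sigmoid(x @ W1)             # hidden activations
            y = sigmoid(h @ W2)             # network output
            e = t - y                       # output error for this pattern

            sq_err = float(e @ e)
            if sq_err < skip_threshold:     # omit patterns that no longer contribute
                continue

            rate = base_rate * sq_err       # hypothetical error-dependent rate
            delta_out = e * y * (1.0 - y)                   # output-layer deltas
            delta_hid = (delta_out @ W2.T) * h * (1.0 - h)  # backpropagated hidden deltas
            W2 += rate * np.outer(h, delta_out)
            W1 += rate * np.outer(x, delta_hid)
    return W1, W2
```

With one-hot target rows in T, patterns that are already well learned are skipped in later epochs, so only the still-informative patterns drive the weight updates and the learning rate shrinks as the remaining errors shrink.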
