Utilizing Experiences of Supervisor in Sequential Learning for Multilayer Perceptron

(Korean title, translated) A Sequential Learning Method for Multilayer Perceptrons Utilizing Supervisory Experience

  • Received : 2010.07.05
  • Accepted : 2010.08.27
  • Published : 2010.10.15

Abstract

Evaluating a learner's level of achievement and providing knowledge appropriate to that level strongly influence human learning. This suggests that the order of training matters and that it should also be considered in machine learning. To assess the influence of training order, we propose a method that controls the order of training samples for a multilayer perceptron (MLP) by utilizing the experience of a supervisor. The supervisor determines the current state of the MLP from its teaching experience and its evaluation of the student, and then selects the sample that is most instructive for the MLP in that state. We use conditional random fields (CRFs) to represent and utilize the supervisor's experience. Although the proposed method resembles active learning in that it selects samples, it differs fundamentally: the selection is intended not to reduce the number of samples used but to assist the learning progress. Experimental results on classification show that the method usually requires less training time than random selection.

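(Korean abstract, translated) Assessing a learner's level and providing knowledge suited to that level greatly affect human learning. This indicates that the order of learning is important, and that training order should also be considered in machine learning. To examine how training order affects learning, this study proposes a method that controls the training order of an MLP using the experience of a supervisor. The supervisor's experience and evaluations are used to identify the state of the MLP, and the training sample expected to yield the best learning efficiency in the current state is selected for training. CRFs (Conditional Random Fields) are used to represent and exploit the supervisor's experience. The proposed method resembles active learning in that it selects training samples, but differs from it in that samples are selected in order to prescribe a training order. Experiments on classification problems show that, compared with training without order control, the method generally achieves better learning performance in terms of the number of training iterations.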
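
The training loop described in the abstract (evaluate the student, then choose the next sample) can be illustrated with a minimal sketch. It is not the authors' method: the CRF-based supervisor is replaced here by a simple stand-in heuristic that picks the sample with the highest current loss, and every name (`make_blobs`, `select_sample`, `train`), the network size, and the hyperparameters are assumptions made for illustration only.

```python
import numpy as np

def make_blobs(n_per_class=100, seed=0):
    """Toy two-class data set (hypothetical; not the data used in the paper)."""
    rng = np.random.default_rng(seed)
    X0 = rng.normal(-1.0, 1.0, (n_per_class, 2))
    X1 = rng.normal(+1.0, 1.0, (n_per_class, 2))
    X = np.vstack([X0, X1])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y

def init_mlp(n_in, n_hidden, n_out, rng):
    """One-hidden-layer MLP parameters."""
    return {
        "W1": rng.normal(0, 0.5, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
        "W2": rng.normal(0, 0.5, (n_hidden, n_out)), "b2": np.zeros(n_out),
    }

def forward(p, x):
    """Return hidden activations and softmax class probabilities."""
    h = np.tanh(x @ p["W1"] + p["b1"])
    z = h @ p["W2"] + p["b2"]
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return h, e / e.sum(axis=-1, keepdims=True)

def sample_losses(p, X, y):
    """Cross-entropy loss of every sample under the current MLP."""
    _, prob = forward(p, X)
    return -np.log(prob[np.arange(len(X)), y] + 1e-12)

def sgd_step(p, x, y, lr):
    """One online gradient update on a single sample (y is a class index)."""
    h, prob = forward(p, x)
    dz = prob.copy()
    dz[y] -= 1.0                            # d(loss)/d(logits)
    dh = (p["W2"] @ dz) * (1.0 - h ** 2)    # backprop through tanh
    p["W2"] -= lr * np.outer(h, dz)
    p["b2"] -= lr * dz
    p["W1"] -= lr * np.outer(x, dh)
    p["b1"] -= lr * dh

def select_sample(p, X, y, rng, mode):
    """Choose the index of the next training sample.

    'random'  : baseline without order control.
    'hardest' : ASSUMED stand-in for the supervisor (highest current loss);
                the paper uses a CRF built from teaching experience instead.
    """
    if mode == "random":
        return int(rng.integers(len(X)))
    return int(np.argmax(sample_losses(p, X, y)))

def train(X, y, mode, seed=0, lr=0.05, target_acc=0.9, max_steps=2000):
    """Train online; return the number of updates used (capped at max_steps)."""
    rng = np.random.default_rng(seed)
    p = init_mlp(X.shape[1], 8, int(y.max()) + 1, rng)
    for step in range(1, max_steps + 1):
        i = select_sample(p, X, y, rng, mode)
        sgd_step(p, X[i], y[i], lr)
        _, prob = forward(p, X)
        if (prob.argmax(axis=1) == y).mean() >= target_acc:
            return step
    return max_steps

if __name__ == "__main__":
    X, y = make_blobs()
    for mode in ("random", "hardest"):
        print(mode, train(X, y, mode))
```

The two modes differ only in `select_sample`, so the number of updates each one needs can be compared on the same data. Any difference observed on this toy problem says nothing about the results reported in the paper.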
