A Technique for Combining Reinforcement Learning and Supervisory Knowledge

  • Published: 2007.03.31

References

  1. R. Sutton and A. Barto, Reinforcement Learning. MIT Press, 2000.
  2. D. P. Bertsekas and J. N. Tsitsiklis, Neuro-Dynamic Programming. Athena Scientific, 1996.
  3. T. Mitchell, Machine Learning. McGraw-Hill, 1997.
  4. S. Russell and A. L. Zimdars, "Q-decomposition for reinforcement learning agents," in Proc. of the 20th Int. Conf. on Machine Learning, 2003, pp. 278-287.
  5. J. N. Tsitsiklis, "Asynchronous stochastic approximation and Q-learning," Machine Learning, Vol. 16, pp. 185-202, 1994.
  6. S. Singh, T. Jaakkola, M. Littman, and C. Szepesvari, "Convergence results for single-step on-policy reinforcement learning algorithms," Machine Learning, Vol. 38, pp. 287-308, 2000. https://doi.org/10.1023/A:1007678930559
  7. A. Y. Ng, D. Harada, and S. Russell, "Policy invariance under reward transformations: theory and application to reward shaping," in Proc. of the 16th Int. Conf. on Machine Learning, 1999, pp. 278-287.
  8. H. S. Chang, "Reinforcement Learning with Supervision by Combining Multiple Learnings and Expert Advices," in Proc. of the 2006 American Control Conference, June 2006, pp. 4159-4164.
  9. F. Fernandez and M. Veloso, "Probabilistic Policy Reuse in a Reinforcement Learning Agent," in The Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, May 2006.
  10. A. G. Barto, "Reinforcement Learning," in Handbook of Learning and Approximate Dynamic Programming, J. Si, A. G. Barto, W. B. Powell, and D. Wunsch (eds.), pp. 804-809, Wiley-IEEE Press, Piscataway, NJ, 2004.
  11. M. N. Ahmadabadi and M. Asadpour, "Expertness based cooperative Q-learning," IEEE Trans. on Systems, Man, and Cybernetics, Part B, Vol. 32, No. 1, pp. 66-76, 2002. https://doi.org/10.1109/3477.979961
  12. A. G. Barto and M. T. Rosenstein, "Supervised Actor-Critic Reinforcement Learning," in Handbook of Learning and Approximate Dynamic Programming, J. Si, A. G. Barto, W. B. Powell, and D. Wunsch (eds.), pp. 359-380, Wiley-IEEE Press, Piscataway, NJ, 2004.
  13. M. Rosenstein and A. G. Barto, "Reinforcement learning with supervision by a stable controller," in Proc. of the American Control Conf., 2004, pp. 4517-4522.
  14. K. Driessens and S. Dzeroski, "Integrating experimentation and guidance in relational reinforcement learning," in Proc. of the 19th Int. Conf. on Machine Learning, 2002, pp. 115-122.