1 |
R. Sutton and A. Barto, Reinforcement Learning. MIT Press, 2000.
|
2 |
S. Russell and A. L. Zimdars, "Q-decomposition for reinforcement learning agents," in Proc. of the 20th Int. Conf. on Machine Learning, 2003, pp. 278-287.
|
3 |
M. N. Ahmadabadi and M. Asadpour, "Expertness based cooperative Q-learning," IEEE Trans. on Systems, Man, and Cybernetics, Part B, Vol. 32, No. 1, pp. 66-76, 2002.
|
4 |
M. Rosenstein and A. G. Barto, "Reinforcement learning with supervision by a stable controller," in Proc. of the American Control Conf., 2004, pp. 4517-4522.
|
5 |
H. S. Chang, "Reinforcement Learning with Supervision by Combining Multiple Learnings and Expert Advices," in Proc. of the 2006 American Control Conference, June, 2006, pp. 4159-4164
|
6 |
D. P. Bertsekas and J. N. Tsitsiklis, Neuro-Dynamic Programming. Athena Scientific, 1996.
|
7 |
F. Fernandez and M. Veloso, "Probabilistic Policy Reuse in a Reinforcement Learning Agent," in Proc. of the Fifth International Joint Conference on Autonomous Agents and Multiagent Systems, May, 2006.
|
8 |
K. Driessens and S. Dzeroski, "Integrating experimentation and guidance in relational reinforcement learning," in Proc. of the 19th Int. Conf. on Machine Learning, 2002, pp. 115-122.
|
9 |
T. Mitchell, Machine Learning, McGraw Hill, 1997
|
10 |
S. Singh, T. Jaakkola, M. Littman, and C. Szepesvari, "Convergence results for single-step on-policy reinforcement learning algorithms," Machine Learning, Vol. 38, pp. 287-308, 2000.
|
11 |
A. G. Barto and M. T. Rosenstein, "Supervised Actor-Critic Reinforcement Learning," in Handbook of Learning and Approximate Dynamic Programming, J. Si, A. G. Barto, W. B. Powell, and D. Wunsch (eds.), pp. 359-380, Wiley-IEEE Press, Piscataway, NJ, 2004.
|
12 |
A. Y. Ng, D. Harada, and S. Russell, "Policy invariance under reward transformations: theory and application to reward shaping," in Proc. of the 16th Int. Conf. on Machine Learning, 1999, pp. 278-287.
|
13 |
J. N. Tsitsiklis, "Asynchronous stochastic approximation and Q-learning," Machine Learning, Vol. 16, pp. 185-202, 1994
|
14 |
A. G. Barto, "Reinforcement Learning," in Handbook of Learning and Approximate Dynamic Programming, J. Si, A. G. Barto, W. B. Powell, and D. Wunsch (eds.), pp. 804-809, Wiley-IEEE Press, Piscataway, NJ, 2004.
|