1 |
I. Gilboa and D. Schmeidler, "Case-based decision theory," Quart. J. Economics, vol.110, no.4, pp.605-639, 1995
|
2 |
S. Melax, “Reinforcement learning Tetris example,” 1998. URL http://www.melax.com/tetris/
|
3 |
M. L. Puterman, Markov Decision Processes: Discrete Stochastic Dynamic Programming, Wiley, New York, 1994
|
4 |
L. P. Kaelbling, M. L. Littman, and A. W. Moore, “Reinforcement learning: A survey,” Journal of Artificial Intelligence Research, vol.4, pp.237-285, 1996
|
5 |
S. Singh, T. Jaakkola, M. Littman, and C. Szepesvári, “Convergence results for single-step on-policy reinforcement learning algorithms,” Machine Learning, vol.38, pp.287-308, 2000
|
6 |
H. S. Chang, “Reinforcement Learning with Supervision by Combining Multiple Learnings and Expert Advices,” in Proc. of the 2006 American Control Conference, pp.4159-4164, June, 2006
|
7 |
R. Sutton and A. Barto, Reinforcement Learning, MIT Press, 2000
|
8 |
A. Y. Ng, D. Harada, and S. Russell, “Policy invariance under reward transformations: Theory and application to reward shaping,” in Proc. of the 16th Int. Conf. on Machine Learning, pp.278-287, 1999
|
9 |
E. Hüllermeier, “Experience-based decision making: a satisficing decision tree approach,” IEEE Transactions on Systems, Man, and Cybernetics, vol.35, no.5, pp.641-653, 2005
|