[1] J. P. O'Doherty, S. W. Lee, and D. McNamee, "The structure of reinforcement-learning mechanisms in the human brain," Current Opinion in Behavioral Sciences, Vol.1, pp.94-100, 2014.
[2] W. Dabney, G. Ostrovski, D. Silver, and R. Munos, "Implicit quantile networks for distributional reinforcement learning," In: International Conference on Machine Learning, PMLR, pp.1096-1105, 2018.
[3] D. Hassabis, D. Kumaran, C. Summerfield, and M. Botvinick, "Neuroscience-inspired artificial intelligence," Neuron, Vol.95, No.2, pp.245-258, 2017.
[4] K. L. Stachenfeld, M. M. Botvinick, and S. J. Gershman, "The hippocampus as a predictive map," Nature Neuroscience, Vol.20, No.11, pp.1643-1653, 2017.
[5] R. S. Sutton and A. G. Barto, "Reinforcement learning: An introduction," MIT Press, 2018.
[6] D. Silver, et al., "Mastering the game of Go without human knowledge," Nature, Vol.550, No.7676, pp.354-359, 2017.
[7] J. Schrittwieser, et al., "Mastering Atari, Go, chess and shogi by planning with a learned model," Nature, Vol.588, No.7839, pp.604-609, 2020.
[8] S. W. Lee, S. Shimojo, and J. P. O'Doherty, "Neural computations underlying arbitration between model-based and model-free learning," Neuron, Vol.81, No.3, pp.687-699, 2014.
[9] R. S. Sutton, "Learning to predict by the methods of temporal differences," Machine Learning, Vol.3, No.1, pp.9-44, 1988.
[10] E. M. Russek, I. Momennejad, M. M. Botvinick, S. J. Gershman, and N. D. Daw, "Predictive representations can link model-based reinforcement learning to model-free mechanisms," PLoS Computational Biology, Vol.13, No.9, pp.e1005768, 2017.
[11] R. S. Sutton, "Dyna, an integrated architecture for learning, planning, and reacting," ACM SIGART Bulletin, Vol.2, No.4, pp.160-163, 1991.
[12] J. X. Wang, et al., "Learning to reinforcement learn," arXiv preprint arXiv:1611.05763, 2016.
[13] S.-H. Kim and J. H. Lee, "Evaluating a successor representation-based reinforcement learning algorithm in the 2-stage Markov decision task," In: Proceedings of the Korea Information Processing Society Conference, Korea Information Processing Society, pp.910-913, 2021.
[14] D. Silver, et al., "Mastering the game of Go with deep neural networks and tree search," Nature, Vol.529, No.7587, pp.484-489, 2016.
[15] J. H. Lee, B. Seymour, J. Z. Leibo, S. J. Lee, and S. W. Lee, "Toward high-performance, memory-efficient, and fast reinforcement learning - Lessons from decision neuroscience," Science Robotics, Vol.4, No.26, pp.eaav2975, 2019.
[16] J. X. Wang, et al., "Prefrontal cortex as a meta-reinforcement learning system," Nature Neuroscience, Vol.21, No.6, pp.860-868, 2018.
[17] S. J. Gershman, "The successor representation: Its computational logic and neural substrates," Journal of Neuroscience, Vol.38, No.33, pp.7193-7200, 2018.
[18] G. Farquhar, et al., "Self-consistent models and values," Advances in Neural Information Processing Systems, Vol.34, pp.1111-1125, 2021.
[19] I. Momennejad, E. M. Russek, J. H. Cheong, M. M. Botvinick, N. D. Daw, and S. J. Gershman, "The successor representation in human reinforcement learning," Nature Human Behaviour, Vol.1, No.9, pp.680-692, 2017.
[20] E. C. Tolman, "Cognitive maps in rats and men," Psychological Review, Vol.55, No.4, p.189, 1948.