1 |
M. Samvelyan, T. Rashid, C. S. Witt, G. Farquhar, N. Nardelli, T. G. J. Rudner, C. M. Hung, P. H. S. Torr, J. N. Foerster, and S. Whiteson, "The StarCraft Multi-Agent Challenge," CoRR, abs/1902.04043, 2019.
|
2 |
J. N. Foerster, G. Farquhar, T. Afouras, N. Nardelli, and S. Whiteson, "Counterfactual multi-agent policy gradients," in Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018.
|
3 |
P. Sunehag, G. Lever, A. Gruslys, W. M. Czarnecki, V. Zambaldi, M. Jaderberg, M. Lanctot, N. Sonnerat, J. Z. Leibo, K. Tuyls, and T. Graepel, "Value-decomposition networks for cooperative multi-agent learning based on team reward," in Proceedings of the 17th International Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2018.
|
4 |
M. Tan, "Multi-agent reinforcement learning: Independent vs. cooperative agents," in Proceedings of the Tenth International Conference on Machine Learning (ICML), pp.330-337, 1993.
|
5 |
C. Watkins, "Learning from delayed rewards," Ph.D. Thesis, University of Cambridge, England, 1989.
|
6 |
V. Mnih, et al., "Human-level control through deep reinforcement learning," Nature, pp.529-533, 2015.
|
7 |
A. Tampuu, et al., "Multiagent cooperation and competition with deep reinforcement learning," PLoS ONE, Vol.12, No.4, 2017.
|
8 |
J. N. Foerster, et al., "Stabilising experience replay for deep multi-agent reinforcement learning," in Proceedings of the 34th International Conference on Machine Learning (ICML), pp.1146-1155, 2017.
|
9 |
C. Guestrin, D. Koller, and R. Parr, "Multiagent planning with factored MDPs," in Advances in Neural Information Processing Systems (NIPS), MIT Press, pp.1523-1530, 2002.
|
10 |
S. Sukhbaatar, A. Szlam, and R. Fergus, "Learning multiagent communication with backpropagation," in Advances in Neural Information Processing Systems (NIPS), pp.2244-2252, 2016.
|
11 |
P. Peng, et al., "Multiagent bidirectionally-coordinated nets: Emergence of human-level coordination in learning to play StarCraft combat games," in Advances in Neural Information Processing Systems (NIPS), 2017.
|
12 |
R. Lowe, Y. Wu, A. Tamar, J. Harb, P. Abbeel, and I. Mordatch, "Multi-agent actor-critic for mixed cooperative-competitive environments," in Advances in Neural Information Processing Systems (NIPS), pp.6382-6393, 2017.
|
13 |
S. Iqbal, C. S. Witt, B. Peng, W. Böhmer, S. Whiteson, and F. Sha, "AI-QMIX: Attention and imagination for dynamic multi-agent reinforcement learning," arXiv:2006.04222, 2020.
|
14 |
J. R. Kok and N. Vlassis, "Collaborative multiagent reinforcement learning by payoff propagation," Journal of Machine Learning Research, pp.1789-1828, 2006.
|
15 |
T. Rashid, M. Samvelyan, C. S. Witt, G. Farquhar, J. N. Foerster, and S. Whiteson, "QMIX: Monotonic value function factorisation for deep multi-agent reinforcement learning," in Proceedings of the International Conference on Machine Learning (ICML), pp.4292-4301, 2018.
|
16 |
J. K. Gupta, M. Egorov, and M. Kochenderfer, "Cooperative multi-agent control using deep reinforcement learning," in Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS), Springer, pp.66-83, 2017.
|