Acknowledgement
This research was supported by the Research Operation Support Program of the Electronics and Telecommunications Research Institute (ETRI) [20ZS1100, Core Technology Research for Self-Improving Integrated Artificial Intelligence System; 19YE1400, Preliminary Research on Human-Agent Collaboration Technology in Multi-Agent Environments and Construction of a Development Environment].
References
- V. Mnih et al., "Playing Atari with deep reinforcement learning," arXiv preprint, CoRR, 2013, arXiv: 1312.5602.
- J. Schulman et al., "Trust region policy optimization," in Proc. Int. Conf. Mach. Learn. (Lille, France), 2015, pp. 1889-1897.
- J. Schulman et al., "Proximal policy optimization algorithms," arXiv preprint, CoRR, 2017, arXiv: 1707.06347.
- T. P. Lillicrap et al., "Continuous control with deep reinforcement learning," in Proc. Int. Conf. Learn. Representations, 2016.
- K. Zhang, Z. Yang, and T. Basar, "Multi-agent reinforcement learning: A selective overview of theories and algorithms," arXiv preprint, CoRR, 2019, arXiv: 1911.10635v1.
- O. Jadid and D. Hajinezhad, "A review of cooperative multi-agent deep reinforcement learning," arXiv preprint, CoRR, 2019, arXiv: 1908.03963v3.
- R. Lowe et al., "Multi-agent actor-critic for mixed cooperative-competitive environments," in Adv. Neural Inform. Process. Syst. 2017, pp. 6379-6390.
- Y. Yang et al., "Mean field multi-agent reinforcement learning," in Proc. Int. Conf. Mach. Learn. (Stockholm, Sweden), 2018.
- S. Iqbal and F. Sha, "Actor-attention-critic for multi-agent reinforcement learning," in Proc. Int. Conf. Mach. Learn. (Long Beach, CA, USA), 2019, pp. 2961-2970.
- T. Haarnoja et al., "Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor," in Proc. Int. Conf. Mach. Learn. (Stockholm, Sweden), 2018, pp. 1861-1870.
- H. Ryu, H. Shin, and J. Park, "Multi-agent actor-critic with hierarchical graph attention network," in Proc. AAAI Conf. Artif. Intell. (New York, USA), 2020, pp. 7236-7243.
- J. Foerster et al., "Counterfactual multi-agent policy gradients," in Proc. AAAI Conf. Artif. Intell. 2018.
- P. Sunehag et al., "Value-decomposition networks for cooperative multi-agent learning based on team reward," in Proc. Int. Conf. Auton. Agents Multiagent Syst. 2018, pp. 2085-2087.
- T. Rashid et al., "QMIX: Monotonic value function factorisation for deep multi-agent reinforcement learning," in Proc. Int. Conf. Mach. Learn. 2018.
- K. Son et al., "QTRAN: Learning to factorize with transformation for cooperative multi-agent reinforcement learning," in Proc. Int. Conf. Mach. Learn. 2019.
- Y. Du et al., "LIIR: Learning individual intrinsic reward in multi-agent reinforcement learning," in Adv. Neural Inform. Process. Syst. 2019, pp. 4403-4414.
- C. V. Goldman and S. Zilberstein, "Decentralized control of cooperative systems: Categorization and complexity analysis," J. Artif. Intell. Res. vol. 22, 2004, pp. 143-174. https://doi.org/10.1613/jair.1427
- E. Pesce and G. Montana, "Improving coordination in small-scale multi-agent deep reinforcement learning through memory-driven communication," Mach. Learn. vol. 109, 2020, doi: 10.1007/s10994-019-05864-5.
- S. Q. Zhang, Q. Zhang, and J. Lin, "Efficient communication in multi-agent reinforcement learning via variance based control," in Adv. Neural Inform. Process. Syst. 2019, pp. 3235-3244.
- H. Mao et al., "Learning agent communication under limited bandwidth by message pruning," arXiv preprint, CoRR, 2019, arXiv: 1912.05304.
- D. Kim et al., "Learning to schedule communication in multi-agent reinforcement learning," arXiv preprint, CoRR, 2019, arXiv: 1902.01554.
- J. Foerster et al., "Learning to communicate with deep multi-agent reinforcement learning," in Adv. Neural Inform. Process. Syst. 2016, pp. 2137-2145.
- N. Jaques et al., "Social influence as intrinsic motivation for multi-agent deep reinforcement learning," in Proc. Int. Conf. Mach. Learn. 2019, pp. 3040-3049.
- K. Cao et al., "Emergent communication through negotiation," arXiv preprint, CoRR, 2018, arXiv: 1804.03980.
- T. Eccles et al., "Biases for emergent communication in multi-agent reinforcement learning," in Adv. Neural Inform. Process. Syst. 2019, pp. 13111-13121.
- S. Gupta, R. Hazra, and A. Dukkipati, "Networked multi-agent reinforcement learning with emergent communication," in Proc. Int. Conf. Auton. Agents Multiagent Syst. (Auckland, New Zealand), May 2020.
- T. Wang et al., "Influence-based multi-agent exploration," in Proc. Int. Conf. Learn. Representations, 2020.
- G. Chen, "A new framework for multi-agent reinforcement learning: Centralized training and exploration with decentralized execution via policy distillation," in Proc. Int. Conf. Auton. Agents Multiagent Syst. 2019.
- A. Mahajan et al., "Maven: Multi-agent variational exploration," in Adv. Neural Inform. Process. Syst. 2019, pp. 7613-7624.
- G. Brockman et al., "OpenAI Gym," arXiv preprint, CoRR, 2016, arXiv: 1606.01540.
- M. Samvelyan et al., "The StarCraft multi-agent challenge," arXiv preprint, CoRR, 2019, arXiv: 1902.04043.