Acknowledgement
This work was supported by an Electronics and Telecommunications Research Institute (ETRI) grant funded by the Korean government [21ZR1100, A Study of Hyper-Connected Thinking Internet Technology by autonomous connecting, controlling, and evolving ways].
References
- S. Lee, H. Baek, and S. Oh, The role of openness in open collaboration: A focus on open-source software development projects, ETRI J. 42 (2020), no. 2, 196-204. https://doi.org/10.4218/etrij.2018-0536
- S. J. Karau and K. D. Williams, Social loafing: A meta-analytic review and theoretical integration, J. Personal. Soc. Psychol. 65 (1993), no. 4, 681-706. https://doi.org/10.1037/0022-3514.65.4.681
- D. Neumann, On the paradox of collaboration, collaborative systems and collaborative networks, in Collaborative Networks in the Internet of Services, Springer, Berlin, Heidelberg, Germany, 2012, pp. 363-373.
- N. Jaques et al., Social influence as intrinsic motivation for multi-agent deep reinforcement learning, in Proc. Int. Conf. Mach. Learn. (Long Beach, CA, USA), June 2019, pp. 3040-3049.
- O. Vinyals et al., Grandmaster level in StarCraft II using multi-agent reinforcement learning, Nature 575 (2019), no. 7782, 350-354. https://doi.org/10.1038/s41586-019-1724-z
- C. Berner et al., Dota 2 with large scale deep reinforcement learning, arXiv preprint, CoRR, 2019, arXiv: 1912.06680.
- L. Panait and S. Luke, Cooperative multi-agent learning: The state of the art, Auton. Agent. Multi-Agent Syst. 11 (2005), no. 3, 387-434. https://doi.org/10.1007/s10458-005-2631-2
- A. Mahajan et al., MAVEN: Multi-agent variational exploration, in Proc. Conf. Neural Inf. Process. Syst. (Vancouver, Canada), Dec. 2019, pp. 7611-7622.
- R. Lowe et al., Multi-agent actor-critic for mixed cooperative-competitive environments, in Proc. Conf. Neural Inf. Process. Syst. (Long Beach, CA, USA), Dec. 2017, pp. 6379-6390.
- S. Omidshafiei et al., Deep decentralized multi-task multi-agent reinforcement learning under partial observability, in Proc. Int. Conf. Mach. Learn. (Sydney, Australia), Aug. 2017, pp. 2681-2690.
- E. Hughes et al., Inequity aversion improves cooperation in intertemporal social dilemmas, in Proc. Conf. Neural Inf. Process. Syst. (Montreal, Canada), Dec. 2018, pp. 3326-3336.
- D. T. Nguyen, A. Kumar, and H. C. Lau, Credit assignment for collective multiagent RL with global rewards, in Proc. Conf. Neural Inf. Process. Syst. (Montreal, Canada), Dec. 2018, pp. 8102-8113.
- T. Rashid et al., QMIX: Monotonic value function factorisation for deep multi-agent reinforcement learning, arXiv preprint, CoRR, 2018, arXiv: 1803.11485.
- P. Sunehag et al., Value-decomposition networks for cooperative multi-agent learning based on team reward, in Proc. Int. Conf. Auton. Agents Multiagent Syst. (Stockholm, Sweden), July 2018, pp. 2085-2087.
- T. Wang et al., Learning nearly decomposable value functions via communication minimization, arXiv preprint, CoRR, 2019, arXiv: 1910.05366.
- F. Präntare and F. Heintz, An anytime algorithm for optimal simultaneous coalition structure generation and assignment, Auton. Agent. Multi-Agent Syst. 34 (2020), no. 1, 1-31. https://doi.org/10.1007/s10458-019-09423-z
- R. C. Craft and C. Leake, The pareto principle in organizational decision making, Manag. Decis. 40 (2002), no. 8, 729-733. https://doi.org/10.1108/00251740210437699
- J. Schulman et al., Trust region policy optimization, in Proc. Int. Conf. Mach. Learn. (Lille, France), July 2015, pp. 1889-1897.
- R. Zhao, X. Sun, and V. Tresp, Maximum entropy-regularized multi-goal reinforcement learning, in Proc. Int. Conf. Mach. Learn. (Long Beach, CA, USA), June 2019, pp. 7553-7562.
- V. Mnih et al., Playing Atari with deep reinforcement learning, arXiv preprint, CoRR, 2013, arXiv: 1312.5602.
- D. Silver et al., A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play, Science 362 (2018), no. 6419, 1140-1144. https://doi.org/10.1126/science.aar6404
- D. Floreano and R. J. Wood, Science, technology and the future of small autonomous drones, Nature 521 (2015), no. 7553, 460. https://doi.org/10.1038/nature14542
- M. Bojarski et al., End to end learning for self-driving cars, arXiv preprint, CoRR, 2016, arXiv: 1604.07316.
- D. Angelov, Y. Hristov, and S. Ramamoorthy, From demonstrations to task-space specifications: Using causal analysis to extract rule parameterization from demonstrations, Auton. Agent. Multi-Agent Syst. 34 (2020), no. 2, 1-19.
- D. Kim et al., Special issue on smart interactions in cyber-physical systems: Humans, agents, robots, machines, and sensors, ETRI J. 40 (2018), no. 4, 417-420. https://doi.org/10.4218/etrij.18.3018.0000
- S. A. DeLoach, M. F. Wood, and C. H. Sparkman, Multiagent systems engineering, Int. J. Softw. Eng. Knowl. Eng. 11 (2001), no. 3, 231-258. https://doi.org/10.1142/S0218194001000542
- M. G. Lagoudakis and R. Parr, Value function approximation in zero-sum Markov games, in Proc. Conf. Uncertain. Artif. Intell. (Alberta, Canada), Aug. 2002, pp. 283-292.
- M. L. Littman, Markov games as a framework for multi-agent reinforcement learning, in Machine Learning Proceedings 1994, Morgan Kaufmann, Burlington, MA, USA, 1994, pp. 157-163.
- G. J. Laurent, L. Matignon, and N. Le Fort-Piat, The world of independent learners is not Markovian, Int. J. Knowl.-based Intell. Eng. Syst. 15 (2011), no. 1, 55-64.
- M. Kleiman-Weiner et al., Coordinate to cooperate or compete: Abstract goals and joint intentions in social interaction, in Proc. Annu. Meet. Cogn. Sci. Soc. (CogSci), 2016, pp. 1-6.
- J. B. Harvey, The abilene paradox: The management of agreement, Organ. Dyn. 3 (1974), no. 1, 63-80. https://doi.org/10.1016/0090-2616(74)90005-9
- S. J. Grossman and O. D. Hart, An analysis of the principal-agent problem, in Foundations of Insurance Economics, vol. 4, Springer, Dordrecht, Netherlands, 1992, pp. 302-340.
- G. Brockman et al., OpenAI Gym, arXiv preprint, CoRR, 2016, arXiv: 1606.01540.
- I. Mordatch and P. Abbeel, Emergence of grounded compositional language in multi-agent populations, in Proc. AAAI Conf. Artif. Intell. (New Orleans, LA, USA), Feb. 2018, pp. 1495-1502.
- L. Zheng et al., MAgent: A many-agent reinforcement learning platform for artificial collective intelligence, in Proc. AAAI Conf. Artif. Intell. (New Orleans, LA, USA), Feb. 2018, pp. 8222-8223.
- J. Hu and M. P. Wellman, Multiagent reinforcement learning: Theoretical framework and an algorithm, in Proc. Int. Conf. Mach. Learn. (San Francisco, CA, USA), July 1998, pp. 242-250.
- J. Schulman et al., Proximal policy optimization algorithms, arXiv preprint, CoRR, 2017, arXiv: 1707.06347.
- M. Liu and G. Cheng, Early stopping for nonparametric testing, in Proc. Conf. Neural Inf. Process. Syst. (Montreal, Canada), Dec. 2018, pp. 3985-3994.
- Z. Wang et al., Sample efficient actor-critic with experience replay, arXiv preprint, CoRR, 2016, arXiv: 1611.01224.
- P. Dhariwal et al., OpenAI Baselines, GitHub, 2017. https://github.com/openai/baselines