
Policy Modeling for Efficient Reinforcement Learning in Adversarial Multi-Agent Environments  

Kwon, Ki-Duk (Department of Computer Science, Kyonggi University)
Kim, In-Cheol (Department of Computer Science, Kyonggi University)
Abstract
An important issue in multiagent reinforcement learning is how an agent should learn its optimal policy through trial-and-error interactions in a dynamic environment containing other agents that can influence its performance. Most previous work on multiagent reinforcement learning either applies single-agent reinforcement learning techniques without any extension or, where it does build and use explicit models of other agents, rests on unrealistic assumptions. In this paper, the basic concepts that form the common foundation of multiagent reinforcement learning techniques are first formulated, and previous works are then compared on the basis of these concepts in terms of their characteristics and limitations. After that, a policy model of the opponent agent and a new multiagent reinforcement learning method using this model are introduced. Unlike previous works, the proposed method utilizes a policy model of the opponent agent instead of a model of its Q function. Moreover, the method can improve learning efficiency by using a policy model that is simpler than richer but more time-consuming models such as finite state machines (FSMs) and Markov chains. The Cat and Mouse game is introduced as an adversarial multiagent environment, and the effectiveness of the proposed method is analyzed through experiments using this game as a testbed.
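As a rough illustration of the idea of learning against a simple opponent policy model, the following Python sketch keeps a table of Q values over joint actions and weights them by an estimate of the opponent's action probabilities. The abstract does not specify the form of the policy model or the update rule, so everything here is an assumption made for illustration: the frequency-count model, the class names `OpponentPolicyModel` and `PolicyModelQLearner`, and the parameters `alpha`, `gamma`, and `epsilon` are hypothetical and are not taken from the paper.

```python
import random
from collections import defaultdict


class OpponentPolicyModel:
    """Assumed policy model: per-state frequency counts of opponent actions."""

    def __init__(self, n_opp_actions):
        self.n_opp_actions = n_opp_actions
        # Start every count at 1 so unseen states give a uniform estimate.
        self.counts = defaultdict(lambda: [1] * n_opp_actions)

    def update(self, state, opp_action):
        # Record one observation of the opponent's action in this state.
        self.counts[state][opp_action] += 1

    def probs(self, state):
        # Estimated probability of each opponent action in this state.
        counts = self.counts[state]
        total = sum(counts)
        return [c / total for c in counts]


class PolicyModelQLearner:
    """Q-learning over joint actions, weighted by the opponent policy model."""

    def __init__(self, n_actions, n_opp_actions,
                 alpha=0.1, gamma=0.9, epsilon=0.1):
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration rate
        self.q = defaultdict(float)  # key: (state, action, opp_action)
        self.model = OpponentPolicyModel(n_opp_actions)

    def expected_value(self, state, action):
        # Expected Q value of our action under the estimated opponent policy.
        probs = self.model.probs(state)
        return sum(p * self.q[(state, action, o)]
                   for o, p in enumerate(probs))

    def best_action(self, state):
        return max(range(self.n_actions),
                   key=lambda a: self.expected_value(state, a))

    def choose_action(self, state, rng=random):
        # Epsilon-greedy exploration.
        if rng.random() < self.epsilon:
            return rng.randrange(self.n_actions)
        return self.best_action(state)

    def learn(self, state, action, opp_action, reward, next_state):
        # First refine the opponent model, then update the joint-action Q value.
        self.model.update(state, opp_action)
        next_value = max(self.expected_value(next_state, a)
                         for a in range(self.n_actions))
        key = (state, action, opp_action)
        target = reward + self.gamma * next_value
        self.q[key] += self.alpha * (target - self.q[key])
```

In this sketch the only thing learned about the opponent is how often it chose each action in each state, which is cheaper to maintain than an FSM or Markov chain model. Whether this matches the simpler policy model used in the paper cannot be determined from the abstract alone.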
Keywords
Multiagent System; Reinforcement Learning; Opponent Policy Model