[KSCI] Korea Science Citation Index Service

http://dx.doi.org/10.5139/JKSAS.2012.40.3.215

Improvements of pursuit performance using episodic parameter optimization in probabilistic games

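Kwak, Dong-Jun (School of Mechanical and Aerospace Engineering, Seoul National University)
Kim, H.-Jin (School of Mechanical and Aerospace Engineering, Seoul National University)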

Publication Information

Journal of the Korean Society for Aeronautical & Space Sciences / v.40, no.3, 2012, pp. 215-221

Abstract

This paper introduces an optimization method that improves the performance of pursuers in a pursuit-evasion game (PEG). The pursuers build a probability map and follow a hybrid pursuit policy, which combines the merits of the local-max and global-max pursuit policies, to find and capture evaders as quickly as possible in a two-dimensional space. We propose an episodic parameter optimization (EPO) algorithm that learns good values for the weighting parameters of the hybrid pursuit policy. EPO runs many episodes of the PEG, accumulates the reward of each episode using reinforcement learning, and uses the golden section search method to select the candidate weighting parameter that maximizes the total averaged reward. We find the best pursuit policy for scenarios with different numbers of evaders and different space sizes, and analyze the results.
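The parameter-selection loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `run_episode` is a hypothetical stand-in for a full PEG simulation (a toy reward that peaks at an arbitrary weight of 0.3), a single weighting parameter restricted to [0, 1] is assumed, and the reward is simply averaged over episodes rather than accumulated by a reinforcement-learning update. Every candidate weight is evaluated with the same random seed, so the averaged reward is a deterministic, unimodal function of the weight, which is what golden section search requires.

```python
import math
import random


def run_episode(w, rng):
    """Hypothetical stand-in for one PEG episode (not from the paper).

    Returns a noisy reward that is highest near w = 0.3; a real episode
    would simulate pursuers and evaders on the probability map.
    """
    noise = rng.gauss(0.0, 0.05)
    return -(w - 0.3) ** 2 + noise


def average_reward(w, n_episodes=200, seed=0):
    """Average the episode reward for weight w over n_episodes runs."""
    rng = random.Random(seed)  # same seed for every w (common random numbers)
    total = sum(run_episode(w, rng) for _ in range(n_episodes))
    return total / n_episodes


def golden_section_max(f, lo, hi, tol=1e-4):
    """Locate the maximizer of a unimodal function f on [lo, hi]."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0  # about 0.618
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while (b - a) > tol:
        if fc > fd:
            # The maximum lies in [a, d]; the old c becomes the new d.
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:
            # The maximum lies in [c, b]; the old d becomes the new c.
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2.0


best_w = golden_section_max(average_reward, 0.0, 1.0)
print(f"selected weight: {best_w:.4f}")
```

Each iteration reuses one of the two interior evaluations, so only one new batch of episodes is needed per step, which matters when every evaluation costs many simulated games.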

Keywords

Multi-Agent System; Pursuit-Evasion Game; Parameter Optimization; Reinforcement Learning

Citations & Related Records

Reference

1	Schenato, L., Oh, S., and Sastry, S., "Swarm coordination for pursuit evasion games using sensor networks," In Proceedings of the 2005 IEEE International Conference on Robotics and Automation, 2005, pp. 2493-2498.
2	Kwak, D. and Kim, J., "Probabilistic Pursuit-Evasion Game," In Proceedings of KACC 2009.
3	Kwak, D. and Kim, J., "Probabilistic Pursuit-Evasion Game," In Proceedings of KSAS Fall 2009 Conference, 2009, pp. 709-712.
4	Kwak, D. and Kim, J., "Probabilistic Pursuit-Evasion Game using Reinforcement Learning," In Proceedings of KSAS Fall 2011 Conference, 2011.
5	Couzin, I. D., Krause, J., Franks, N. R., and Levin, S. A., "Effective leadership and decision-making in animal groups on the move," Nature, 2005, Vol. 433, No. 7025, pp. 513-516.
6	Khosla, P. and Volpe, R., "Superquadric artificial potentials for obstacle avoidance and approach," In Proceedings of the 1988 IEEE International Conference on Robotics and Automation, 1988, pp. 1778-1784.
7	Sutton, R. S. and Barto, A. G., Reinforcement learning: an introduction, MIT Press, Cambridge, Mass., 1998.
8	Press, W. H., Teukolsky, S. A., Vetterling, W. T., and Flannery, B. P., Numerical recipes in C: The art of scientific computing (2nd ed.), Cambridge University Press, Cambridge, 1992.
9	Isaacs, R., Differential games: a mathematical theory with applications to warfare and pursuit, control and optimization, Wiley, New York, 1965.
10	Vidal, R., Shakernia, O., Kim, J., Shim, D., and Sastry, S., "Probabilistic pursuit-evasion games: theory, implementation, and experimental evaluation," IEEE Trans. on Robotics and Automation, 2002, Vol. 18, pp. 662-669.

Korean title: 에피소드 매개변수 최적화를 이용한 확률게임에서의 추적정책 성능 향상
