
Localization and a Distributed Local Optimal Solution Algorithm for a Class of Multi-Agent Markov Decision Processes  

Chang, Hyeong-Soo (Department of Computer Science and Engineering, Sogang University)
Publication Information
International Journal of Control, Automation, and Systems, v.1, no.3, 2003, pp. 358-367
Abstract
We consider discrete-time factorial Markov decision processes (MDPs) in a multiple decision-maker environment under the infinite-horizon average-reward criterion, with a general joint reward structure but a factorial joint state-transition structure. We introduce the concept of "localization," in which the global MDP is localized for each agent so that each agent needs to consider only a local MDP defined over its own state and action spaces. Based on this, we present a gradient-ascent-like iterative distributed algorithm that converges to a locally optimal solution of the global MDP. The solution is an autonomous joint policy in that each agent's decision is based only on its local state.
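The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes two agents, strictly positive local transition matrices (so each local chain has a unique stationary distribution), and deterministic autonomous policies. Under those assumptions the joint stationary distribution factors into a product of local ones, and each agent in turn improves its own local policy while the other is held fixed (a coordinate-ascent stand-in for the "gradient-ascent-like" iteration). All names (`stationary`, `best_response`, `distributed_ascent`) and the random problem instance are hypothetical.

```python
import itertools
import numpy as np

def stationary(P):
    """Stationary distribution of a row-stochastic matrix P (assumed ergodic)."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones(n)])  # mu P = mu, sum(mu) = 1
    b = np.zeros(n + 1)
    b[-1] = 1.0
    mu, *_ = np.linalg.lstsq(A, b, rcond=None)
    return mu

def local_chain(P_i, pi_i):
    """Transition matrix of agent i's local chain under its local policy pi_i."""
    return np.array([P_i[pi_i[s]][s] for s in range(len(pi_i))])

def average_reward(P, R, pi):
    """Average reward of an autonomous joint policy pi = (pi_1, pi_2).

    P[i][a] is agent i's local transition matrix under local action a;
    R[s1, s2, a1, a2] is the (general, non-factored) joint reward.
    """
    mu1 = stationary(local_chain(P[0], pi[0]))
    mu2 = stationary(local_chain(P[1], pi[1]))
    return sum(mu1[s1] * mu2[s2] * R[s1, s2, pi[0][s1], pi[1][s2]]
               for s1 in range(len(mu1)) for s2 in range(len(mu2)))

def best_response(P, R, pi, i, n_actions):
    """Best local policy for agent i with the other agent's policy fixed.

    Enumerates agent i's deterministic local policies for brevity; this is
    an assumption of the sketch, not necessarily how a local MDP is solved.
    """
    best, best_gain = None, -np.inf
    for cand in itertools.product(range(n_actions), repeat=len(pi[i])):
        trial = list(pi)
        trial[i] = cand
        g = average_reward(P, R, trial)
        if g > best_gain:
            best, best_gain = cand, g
    return best, best_gain

def distributed_ascent(P, R, pi, n_actions, max_rounds=100, tol=1e-12):
    """Alternate local improvements until no agent can improve alone."""
    pi = list(pi)
    gain = average_reward(P, R, pi)
    for _ in range(max_rounds):
        improved = False
        for i in range(2):
            cand, g = best_response(P, R, pi, i, n_actions)
            if g > gain + tol:
                pi[i], gain, improved = cand, g, True
        if not improved:
            break
    return tuple(pi), gain

# Hypothetical instance: 2 local states and 2 local actions per agent.
rng = np.random.default_rng(0)
P = [rng.random((2, 2, 2)) + 0.1 for _ in range(2)]
P = [p / p.sum(axis=2, keepdims=True) for p in P]  # normalize rows
R = rng.random((2, 2, 2, 2))
init = ((0, 0), (0, 0))
policy, gain = distributed_ascent(P, R, init, n_actions=2)
print(policy, gain)
```

The returned joint policy is locally optimal only in the sense that neither agent can raise the average reward by changing its own local policy alone; a different joint policy may still achieve a higher average reward.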
Keywords
Distributed algorithm; local optimal solution; Markov decision process; multi-agent system; stochastic control