
A Localized Adaptive QoS Routing Scheme Using POMDP and Exploration Bonus Techniques  

Han Jeong-Soo (Department of Internet Information, Shingu College)
Abstract
In this paper, we propose a localized adaptive QoS routing scheme using POMDP and exploration bonus techniques. We show that the CEA technique, which uses expectation values, can simplify the POMDP problem, because performing dynamic programming to solve a POMDP is computationally very expensive. We use an exploration bonus to search for detour paths that are better than the current path, and for this purpose we propose an algorithm (SEMA) that searches multiple paths. In particular, we evaluate the service success rate and average hop count with respect to the performance parameters $\phi$ and k, which are defined as the exploration count and interval. The results show that the larger $\phi$ is, the better the detour path search performs, and that increasing n increases the amount of exploration.
Keywords
Localized QoS Routing; Dynamic Programming; Reinforcement Learning; Markov Decision Process (MDP); Partially Observable Markov Decision Process (POMDP); Exploration Bonus
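
The abstract's idea of using an exploration bonus to find detour paths can be illustrated with a minimal sketch. This is not the paper's SEMA algorithm. It assumes a generic bonus that grows with the square root of the time since a path was last tried. The names `select_path`, `estimates`, `last_tried`, and `bonus_weight` are hypothetical, and `bonus_weight` is not meant to correspond to $\phi$ or k.

```python
import math

def select_path(estimates, last_tried, now, bonus_weight):
    """Return the index of the path with the highest estimate plus bonus.

    estimates    -- estimated success rate per candidate path (assumed input)
    last_tried   -- time step at which each path was last used
    now          -- current time step
    bonus_weight -- hypothetical scaling factor for the exploration bonus
    """
    def score(i):
        idle = now - last_tried[i]  # how long path i has gone untried
        return estimates[i] + bonus_weight * math.sqrt(idle)
    return max(range(len(estimates)), key=score)

# Path 0 is the current path; path 1 is a detour not tried for 100 steps.
estimates = [0.90, 0.80]
last_tried = [99, 0]
print(select_path(estimates, last_tried, now=100, bonus_weight=0.0))   # 0
print(select_path(estimates, last_tried, now=100, bonus_weight=0.05))  # 1
```

With no bonus the current path is always chosen. With a positive bonus the long-untried detour eventually outscores it, which is the mechanism that lets a localized router rediscover better paths.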