A Dynamic Channel Assignment Method in Cellular Networks Using a Reinforcement Learning Method that Combines Supervised Knowledge

감독 지식을 융합하는 강화 학습 기법을 사용하는 셀룰러 네트워크에서 동적 채널 할당 기법

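  • 김성완 (Department of Computer Science and Engineering, Sogang University)
  • 장형수 (Department of Computer Science and Engineering, Sogang University)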
  • Published : 2008.07.15

Abstract

The recently proposed "potential-based" reinforcement learning (RL) method makes it possible to combine multiple learnings and expert advice as supervised knowledge within an RL framework. The effectiveness of the approach is supported by a theoretical guarantee of convergence to an optimal policy. In this paper, the potential-based RL method is applied to the dynamic channel assignment (DCA) problem in cellular networks. It is empirically shown that potential-based RL assigns channels more efficiently than fixed channel assignment, Maxavail, and Q-learning-based DCA, and that it converges to an optimal policy more rapidly than the other RL algorithms considered, SARSA(0) and PRQ-learning.

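The recently proposed "potential-based" reinforcement learning (RL) method made it possible to fuse multiple learnings and expert advice into an RL algorithm as supervised knowledge, and its effectiveness has been demonstrated by a theoretical guarantee of convergence to an optimal policy. This paper applies the potential-based RL method to the channel assignment problem in cellular networks. The potential-based RL dynamic channel assignment method assigns channels more efficiently than the existing fixed channel assignment, Maxavail, and Q-learning-based dynamic channel assignment methods. It is also shown experimentally that the potential-based RL method converges to an optimal policy faster than the existing RL algorithms Q-learning and SARSA(0).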
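
The abstract does not spell out the algorithm, so the following is only a rough illustration of the general idea behind potential-based methods, not the authors' DCA scheme. It uses the well-known reward-shaping form r + γΦ(s') − Φ(s), in which a potential function Φ encodes prior (supervised) knowledge, inside tabular Q-learning. The toy chain environment, the potential `hint`, and all function names are assumptions made for this sketch.

```python
import random


def shaped_reward(reward, phi_s, phi_next, gamma):
    """Potential-based shaping term added to the reward: r + gamma*phi(s') - phi(s)."""
    return reward + gamma * phi_next - phi_s


def q_learning_chain(n_states=6, episodes=200, gamma=0.9, alpha=0.5,
                     epsilon=0.1, potential=None, seed=0):
    """Tabular Q-learning on a toy chain (hypothetical environment).

    States are 0..n_states-1, the last state is the goal.
    Actions: 0 = move left, 1 = move right. Reward 1 on reaching the goal.
    """
    rng = random.Random(seed)
    phi = potential if potential is not None else (lambda s: 0.0)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy action selection; ties are broken toward "right".
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if q[s][0] > q[s][1] else 1
            s_next = max(0, s - 1) if a == 0 else s + 1
            done = s_next == goal
            r = 1.0 if done else 0.0
            # The terminal potential is taken as 0 so the optimal policy is unchanged.
            phi_next = 0.0 if done else phi(s_next)
            r = shaped_reward(r, phi(s), phi_next, gamma)
            target = r + (0.0 if done else gamma * max(q[s_next]))
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
    return q


def greedy_policy(q):
    """Greedy action for every non-goal state."""
    return [0 if row[0] > row[1] else 1 for row in q[:-1]]


# Assumed "expert hint": states closer to the goal get a higher potential.
hint = lambda s: s / 5.0

plain = q_learning_chain()
shaped = q_learning_chain(potential=hint)
print(greedy_policy(plain), greedy_policy(shaped))
```

In this sketch both runs end with the same greedy policy, because shaping of this form changes the learned values but not which action is best in each state. Any speed-up depends on how informative the potential is.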
