• Title/Summary/Keyword: 2단계 마르코프 의사결정 과제

Search Result 1, Processing Time 0.015 seconds

Evaluating SR-Based Reinforcement Learning Algorithm Under the Highly Uncertain Decision Task (불확실성이 높은 의사결정 환경에서 SR 기반 강화학습 알고리즘의 성능 분석)

  • Kim, So Hyeon;Lee, Jee Hang
    • KIPS Transactions on Software and Data Engineering
    • /
    • v.11 no.8
    • /
    • pp.331-338
    • /
    • 2022
  • Successor representation (SR) is a model of human reinforcement learning (RL) mimicking the underlying mechanism of hippocampal cells constructing cognitive maps. SR utilizes these learned features to adaptively respond to the frequent reward changes. In this paper, we evaluated the performance of SR under the context where changes in latent variables of environments trigger the reward structure changes. For a benchmark test, we adopted SR-Dyna, an integration of SR into goal-driven Dyna RL algorithm in the 2-stage Markov Decision Task (MDT) in which we can intentionally manipulate the latent variables - state transition uncertainty and goal-condition. To precisely investigate the characteristics of SR, we conducted the experiments while controlling each latent variable that affects the changes in reward structure. Evaluation results showed that SR-Dyna could learn to respond to the reward changes in relation to the changes in latent variables, but could not learn rapidly in that situation. This brings about the necessity to build more robust RL models that can rapidly learn to respond to the frequent changes in the environment in which latent variables and reward structure change at the same time.