• Title/Summary/Keyword: 심층결정론적정책경사법

Search Result 1, Processing Time 0.019 seconds

Determination of Ship Collision Avoidance Path using Deep Deterministic Policy Gradient Algorithm (심층 결정론적 정책 경사법을 이용한 선박 충돌 회피 경로 결정)

  • Kim, Dong-Ham;Lee, Sung-Uk;Nam, Jong-Ho;Furukawa, Yoshitaka
    • Journal of the Society of Naval Architects of Korea
    • /
    • v.56 no.1
    • /
    • pp.58-65
    • /
    • 2019
  • The stability, reliability and efficiency of a smart ship are important issues as the interest in an autonomous ship has recently been high. An automatic collision avoidance system is an essential function of an autonomous ship. This system detects the possibility of collision and automatically takes avoidance actions in consideration of economy and safety. In order to construct an automatic collision avoidance system using reinforcement learning, in this work, the sequential decision problem of ship collision is mathematically formulated through a Markov Decision Process (MDP). A reinforcement learning environment is constructed based on the ship maneuvering equations, and then the three key components (state, action, and reward) of MDP are defined. The state uses parameters of the relationship between own-ship and target-ship, the action is the vertical distance away from the target course, and the reward is defined as a function considering safety and economics. In order to solve the sequential decision problem, the Deep Deterministic Policy Gradient (DDPG) algorithm which can express continuous action space and search an optimal action policy is utilized. The collision avoidance system is then tested assuming the $90^{\circ}$intersection encounter situation and yields a satisfactory result.