• Title/Summary/Keyword: Actor-Critic Method


Blockchain Based Financial Portfolio Management Using A3C (A3C를 활용한 블록체인 기반 금융 자산 포트폴리오 관리)

  • Kim, Ju-Bong;Heo, Joo-Seong;Lim, Hyun-Kyo;Kwon, Do-Hyung;Han, Youn-Hee
    • KIPS Transactions on Computer and Communication Systems / v.8 no.1 / pp.17-28 / 2019
  • In financial investment management, the strategy of diversifying investment by selecting and combining various financial assets is called portfolio management. In recent years, blockchain-based financial assets such as cryptocurrencies have been traded on several well-known exchanges, and investors need an efficient portfolio management approach to steadily raise their return on investment in cryptocurrencies. Meanwhile, deep learning has shown remarkable results in various fields, and research on applying deep reinforcement learning algorithms to portfolio management has begun. In this paper, we propose an efficient financial portfolio management method based on Asynchronous Advantage Actor-Critic (A3C), a representative asynchronous reinforcement learning algorithm. In addition, since the conventional cross-entropy function cannot be applied to portfolio management, we propose a modified cross-entropy that fits the portfolio investment setting. Finally, we compare the proposed A3C model with an existing reinforcement learning-based cryptocurrency portfolio investment algorithm and show that the proposed A3C model performs better.
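
For orientation, the advantage term at the core of A3C can be sketched as below. This is a minimal illustration under stated assumptions, not the paper's method: the function name `n_step_advantages`, the discount `gamma`, and the toy numbers are hypothetical, and the modified cross-entropy loss mentioned in the abstract is not reproduced because the abstract does not define it.

```python
def n_step_advantages(rewards, values, bootstrap_value, gamma=0.99):
    """Compute discounted n-step returns and advantages for one rollout segment.

    rewards         -- rewards collected by one worker, in time order
    values          -- critic estimates V(s_t) for the same steps
    bootstrap_value -- critic estimate for the state after the last step
                       (use 0.0 if the episode ended)
    """
    returns = []
    running = bootstrap_value
    # Walk backwards so each return reuses the one that follows it.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    # Advantage = how much better the observed return was than the critic expected.
    advantages = [g - v for g, v in zip(returns, values)]
    return returns, advantages


if __name__ == "__main__":
    rets, advs = n_step_advantages([1.0, 0.0, 2.0], [0.5, 0.5, 0.5], 0.0, gamma=0.5)
    print(rets, advs)
```

In A3C, each asynchronous worker would use such advantages to weight its policy-gradient update, while the returns serve as regression targets for the critic.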

Time-varying Proportional Navigation Guidance using Deep Reinforcement Learning (심층 강화학습을 이용한 시변 비례 항법 유도 기법)

  • Chae, Hyeok-Joo;Lee, Daniel;Park, Su-Jeong;Choi, Han-Lim;Park, Han-Sol;An, Kyeong-Soo
    • Journal of the Korea Institute of Military Science and Technology / v.23 no.4 / pp.399-406 / 2020
  • In this paper, we propose a time-varying proportional navigation guidance law that determines the proportional navigation gain in real time according to the operating situation. When intercepting a target, an unidentified evasion strategy causes a loss of optimality. To compensate for this problem, a proper proportional navigation gain is derived at every time step by solving an optimal control problem with the inferred strategy of the evader. Recently, deep reinforcement learning algorithms have been introduced to deal with complex optimal control problems efficiently. We adapt the actor-critic method to build a proportional navigation gain network, and the network is trained by the Proximal Policy Optimization (PPO) algorithm to learn an evasion strategy of the target. Numerical experiments show the effectiveness and optimality of the proposed method.
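
The idea of a gain chosen at every time step can be illustrated with the classical proportional navigation law, a = N · Vc · (LOS rate). The sketch below is only an assumption-laden illustration: `gain_policy` is a hypothetical stand-in for the trained gain network, and the clipping range `gain_min`/`gain_max` is not taken from the paper.

```python
def pn_command(nav_gain, closing_speed, los_rate):
    """Classical proportional navigation: lateral acceleration = N * Vc * LOS rate."""
    return nav_gain * closing_speed * los_rate


def time_varying_pn(gain_policy, observation, closing_speed, los_rate,
                    gain_min=2.0, gain_max=5.0):
    """Query a gain policy at the current step and apply the PN law.

    gain_policy -- hypothetical callable mapping an observation to a raw gain;
                   it stands in for the actor network described in the abstract.
    """
    raw_gain = gain_policy(observation)
    # Keep the gain inside an assumed admissible range.
    nav_gain = min(max(raw_gain, gain_min), gain_max)
    return nav_gain, pn_command(nav_gain, closing_speed, los_rate)


if __name__ == "__main__":
    # A constant policy that asks for too large a gain is clipped to gain_max.
    gain, accel = time_varying_pn(lambda obs: 7.0, None, 300.0, 0.01)
    print(gain, accel)
```

With a fixed gain the function reduces to ordinary proportional navigation; the time-varying variant only changes where the gain comes from.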

Backstepping Sliding Mode-based Model-free Control of Electro-hydraulic Systems

  • Truong, Hoai-Vu-Anh;Trinh, Hoai-An;Ahn, Kyoung-Kwan
    • Journal of Drive and Control / v.19 no.1 / pp.51-61 / 2022
  • This paper presents a model-free control scheme for electro-hydraulic systems (EHSs), built on backstepping sliding mode control (BSMC) with a radial basis function neural network (RBFNN) and an adaptive mechanism. First, an EHS mathematical model was derived to understand the system behavior. Based on the system structure, BSMC was employed to satisfy the output performance. Because of the highly nonlinear characteristics and the presence of parametric uncertainties, a model-free approximator based on an RBFNN was developed to compensate for the EHS dynamics, thus removing the need for detailed system information. Adaptive laws based on the actor-critic neural network (ACNN) were implemented to suppress the remaining approximation error and satisfy system qualification. The stability of the closed-loop system was theoretically proven using a Lyapunov function. To evaluate the effectiveness of the proposed algorithm, a comparative simulation used the following benchmark control strategies: proportional-integral-derivative (PID) control and an improved PID with ACNN (ACPID), both considered fully model-free methods, and adaptive backstepping sliding mode control, considered an ideal model-based method with the same adaptive laws. The simulation results validated the proposed algorithm, which achieved nearly the same performance as the ideal adaptive BSMC.

A Study of Reinforcement Learning-based Cyber Attack Prediction using Network Attack Simulator (NASim) (네트워크 공격 시뮬레이터를 이용한 강화학습 기반 사이버 공격 예측 연구)

  • Bum-Sok Kim;Jung-Hyun Kim;Min-Suk Kim
    • Journal of the Semiconductor & Display Technology / v.22 no.3 / pp.112-118 / 2023
  • As technology advances, preparedness against cyber-attacks becomes increasingly critical. It is therefore imperative to consider various circumstances and to prepare strategic technology for responding to cyber-attacks. This paper proposes a method to solve network security problems by applying reinforcement learning to cyber-security. In general, traditional static cyber-security methods have difficulty responding effectively to modern dynamic attack patterns. To address this, we implement cyber-attack scenarios such as 'Tiny Alpha' and 'Small Alpha' and evaluate the performance of various reinforcement learning methods using Network Attack Simulator (NASim), a cyber-attack simulation environment based on the Gymnasium (formerly OpenAI Gym) interface. We experimented with different RL algorithms: value-based methods (Q-Learning, Deep Q-Network, and Double Deep Q-Network) and a policy-based method (Actor-Critic). We observed that value-based methods with discrete action spaces consistently outperformed policy-based methods with continuous action spaces, with a performance difference ranging from a minimum of 20.9% to a maximum of 53.2%. These results suggest opportunities for enhancing cyber-security strategies and indicate potential applications in cyber-security education and system validation across domains such as the military, government, and corporate sectors.
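
The difference between two of the value-based methods named above, Deep Q-Network and Double Deep Q-Network, lies in how the bootstrap target is formed. The sketch below uses plain lists of Q-values in place of neural networks; the function names, `gamma`, and the toy numbers are assumptions for illustration and are not taken from the paper or from NASim.

```python
def dqn_target(reward, next_q_target, gamma=0.99, done=False):
    """DQN target: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online network selects, the target network evaluates."""
    if done:
        return reward
    # Index of the action the online network currently prefers.
    best_action = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    online = [0.2, 0.9, 0.1]   # hypothetical online-network Q-values for the next state
    target = [1.0, 0.5, 2.0]   # hypothetical target-network Q-values for the next state
    print(dqn_target(1.0, target, gamma=0.5))
    print(double_dqn_target(1.0, online, target, gamma=0.5))
```

Decoupling action selection from evaluation is what lets Double DQN reduce the overestimation that the plain `max` in the DQN target tends to introduce.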
