• Title/Summary/Keyword: Value-based reinforcement


Reinforcement Method to Enhance Adaptive Route Search for Efficient Real-Time Application Specific QoS Routing (Real-Time Application의 효과적인 QoS 라우팅을 위한 적응적 Route 선택 강화 방법)

  • Oh, Jae-Seuk;Bae, Sung-Il;Ahn, Jin-Ho;Sungh Kang
    • Journal of the Institute of Electronics Engineers of Korea TC / v.40 no.12 / pp.71-82 / 2003
  • In this paper, we present a new method for calculating the reinforcement value in an Ant-algorithm-based QoS routing scheme targeted at real-time applications, so that ant-like mobile agents are efficiently and effectively reinforced to find the best route to a destination with respect to the required QoS metrics. Simulation results show that the proposed method realizes QoS routing more efficiently and more adaptively than the existing method, thereby providing better route selection for real-time applications that place high priority on delay jitter and bandwidth.
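
The ant-style reinforcement loop described above can be sketched in a few lines. This is a minimal illustration only: the QoS weighting, the evaporation rate, and the tiny three-node topology are assumptions, not the paper's actual formula.

```python
import random

# Hypothetical per-link pheromone table: pheromone[node][next_hop]
pheromone = {"A": {"B": 1.0, "C": 1.0}, "B": {"D": 1.0}, "C": {"D": 1.0}}

def reinforcement(delay_ms, jitter_ms, bandwidth_mbps,
                  w_delay=0.3, w_jitter=0.4, w_bw=0.3):
    """Illustrative reinforcement value: better QoS -> larger value.
    The weights and the inverse-delay form are assumptions."""
    return (w_delay / (1.0 + delay_ms)
            + w_jitter / (1.0 + jitter_ms)
            + w_bw * bandwidth_mbps / 100.0)

def reinforce_route(route, r, evaporation=0.1):
    """Evaporate all pheromone, then deposit r on the traversed links."""
    for node in pheromone:
        for nxt in pheromone[node]:
            pheromone[node][nxt] *= (1.0 - evaporation)
    for node, nxt in zip(route, route[1:]):
        pheromone[node][nxt] += r

def choose_next(node):
    """Probabilistic next-hop selection, proportional to pheromone."""
    hops = list(pheromone[node].items())
    total = sum(p for _, p in hops)
    x, acc = random.uniform(0, total), 0.0
    for nxt, p in hops:
        acc += p
        if x <= acc:
            return nxt
    return hops[-1][0]

# A low-jitter, high-bandwidth route earns a larger deposit than a
# congested one, biasing future ants toward it.
r_good = reinforcement(delay_ms=10, jitter_ms=1, bandwidth_mbps=80)
r_bad = reinforcement(delay_ms=50, jitter_ms=20, bandwidth_mbps=10)
reinforce_route(["A", "B", "D"], r_good)
```

After the deposit, `choose_next("A")` favors hop "B", which is the adaptive route-selection effect the abstract describes.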

R-Trader: An Automatic Stock Trading System based on Reinforcement learning (R-Trader: 강화 학습에 기반한 자동 주식 거래 시스템)

  • 이재원;김성동;이종우;채진석
    • Journal of KIISE:Software and Applications / v.29 no.11 / pp.785-794 / 2002
  • Automatic stock trading systems should be able to solve various kinds of optimization problems, such as market trend prediction, stock selection, and trading strategies, in a unified framework. But most previous trading systems based on supervised learning are limited in ultimate performance because they do not focus on integrating those subproblems. This paper proposes a stock trading system, called R-Trader, based on reinforcement learning, regarding the process of stock price changes as a Markov decision process (MDP). Reinforcement learning is suitable for the joint optimization of predictions and trading strategies. R-Trader adopts two popular reinforcement learning algorithms, temporal-difference (TD) learning and Q-learning, for selecting stocks and optimizing other trading parameters, respectively. Technical analysis is also adopted to devise the input features of the system, and the value functions are approximated by feedforward neural networks. Experimental results on the Korea stock market show that the proposed system outperforms both the market average and a simple trading system trained by supervised learning, in profit as well as risk management.
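
The two updates this kind of system builds on can be sketched generically. This is textbook TD(0)/Q-learning, not R-Trader's actual code; the trading action names and the learning-rate and discount values are placeholders.

```python
def td_update(v, reward, v_next, alpha=0.1, gamma=0.99):
    """One TD(0) backup: move V(s) toward the target r + gamma * V(s')."""
    return v + alpha * (reward + gamma * v_next - v)

def q_update(q, state, action, reward, next_state,
             actions=("buy", "sell", "hold"), alpha=0.1, gamma=0.99):
    """One Q-learning backup on a dict-based Q-table. The action set is
    a hypothetical stand-in for the system's trading parameters."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    key = (state, action)
    old = q.get(key, 0.0)
    q[key] = old + alpha * (reward + gamma * best_next - old)
```

In practice the tabular `q` dict would be replaced by the feedforward networks the abstract mentions, with the same targets.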

The Effective Goal-Setting and The Practice based on Value-Added Results(VAR) (가치-부가적 성과 관점에 따른 효과적인 목표설정과 실사례)

  • Shin Tack-Hyun
    • Proceedings of the KSR Conference / 2004.10a / pp.1731-1736 / 2004
  • The purpose of this article is to introduce a useful methodology for effective goal-setting at the team level. Strategic Performance Evaluation Systems commonly suffer from symptoms such as lack of knowledge about goal-setting, disconnection of process, difficulty in judging the degree of difficulty of objectives, limits of staff-department evaluation, fairness and authority of evaluators, weakness in coaching technique, and quantity- or figure-oriented evaluation, to name a few. To overcome these and to seek a more plausible goal-setting methodology, the author suggests a persuasive goal-setting concept: VAR (Value-Added Results). VAR, as the end result, is the team contribution that adds value to the organization, and it results from the team's activities. In addition to this goal-setting technique based on the concept of value-added results, several aspects should be improved for a Strategic Performance Evaluation System to be implemented more effectively: 1) a shift from MBO to MP&D (Managing Performance & Development), 2) impartial exercise of evaluation authority as an organizational public asset, 3) reinforcement of maternal leadership and servantship instead of paternal leadership, and 4) utilization of an IT-based evaluation system.


A Study on Application of Reinforcement Learning Algorithm Using Pixel Data (픽셀 데이터를 이용한 강화 학습 알고리즘 적용에 관한 연구)

  • Moon, Saemaro;Choi, Yonglak
    • Journal of Information Technology Services / v.15 no.4 / pp.85-95 / 2016
  • Recently, deep learning and machine learning have attracted considerable attention, and many supporting frameworks have appeared. In the artificial intelligence field, a large body of research is underway to apply the relevant knowledge to complex problem-solving, necessitating the application of various learning algorithms and training methods to artificial intelligence systems. In addition, there is a dearth of performance evaluation of decision-making agents. The decision-making agent designed through this research can find optimal solutions by using reinforcement learning methods: it collects raw pixel data observed from dynamic environments and makes decisions by itself based on those data. The agent uses convolutional neural networks to classify the situations it confronts, and the data observed from the environment undergo preprocessing before being used. This research describes how the convolutional neural networks and the decision-making agent are configured, analyzes learning performance through a value-based algorithm and a policy-based algorithm (Deep Q-Networks and Policy Gradient), sets forth their differences, and demonstrates how the convolutional neural networks affect overall learning performance when using pixel data. This research is expected to contribute to the improvement of artificial intelligence systems that can efficiently find optimal solutions by using features extracted from raw pixel data.
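
Two ingredients mentioned above, pixel preprocessing and the value-based (DQN-style) target, can be sketched roughly as follows. The luminance weights, the downsampling step, and the plain-list frame format are illustrative assumptions, not the paper's pipeline.

```python
def preprocess(frame, step=2):
    """Grayscale + downsample an RGB frame given as nested (r, g, b)
    tuples: the usual first step before feeding pixels to a CNN."""
    return [[0.299 * r + 0.587 * g + 0.114 * b
             for (r, g, b) in row[::step]]
            for row in frame[::step]]

def dqn_target(reward, next_q_values, gamma=0.99, done=False):
    """Value-based bootstrap target: r + gamma * max_a' Q(s', a').
    A policy-based method (e.g. Policy Gradient) would instead weight
    action log-probabilities by observed returns."""
    return reward if done else reward + gamma * max(next_q_values)
```

The contrast in `dqn_target`'s docstring is exactly the value-based vs. policy-based distinction the study analyzes.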

The role of wall configuration and reinforcement type in selecting the pseudo-static coefficients for reinforced soil walls

  • Majid Yazdandoust;Amirhossein Rasouli Jamnani;Mohsen Sabermahani
    • Geomechanics and Engineering / v.35 no.5 / pp.555-570 / 2023
  • In the current study, a series of experimental and analytical evaluations were performed to introduce the horizontal pseudo-static coefficient (kh) as a function of the wall configuration and the reinforcement type for analyzing reinforced soil walls. For this purpose, eight shaking table tests were performed on reduced-scale models of integrated and two-tiered walls reinforced by metal strips and geogrid to determine the distribution of dynamic lateral pressure in the walls. The physical models were then analyzed using the Mononobe-Okabe method to estimate the value of kh required to establish dynamic lateral pressures similar to those observed in the shaking table tests. Based on the results, the horizontal pseudo-static coefficient and the position of the resultant lateral force (R) were introduced as functions of the horizontal peak ground acceleration (HPGA), the wall configuration, and the reinforcement type, as well as the maximum wall displacement.
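
The Mononobe-Okabe analysis used in the study can be sketched for the simplest geometry. The function below assumes a vertical wall and horizontal, cohesionless backfill (no wall batter or backfill slope), which simplifies the general formula; phi is the soil friction angle and delta the wall-soil interface friction angle.

```python
import math

def mononobe_okabe_kae(phi, delta, kh, kv=0.0):
    """Mononobe-Okabe dynamic active earth pressure coefficient K_AE
    for a vertical wall with horizontal backfill. Angles in radians."""
    theta = math.atan(kh / (1.0 - kv))  # seismic inertia angle
    num = math.cos(phi - theta) ** 2
    root = math.sqrt(math.sin(phi + delta) * math.sin(phi - theta)
                     / math.cos(delta + theta))
    den = math.cos(theta) * math.cos(delta + theta) * (1.0 + root) ** 2
    return num / den

def dynamic_thrust(unit_weight, height, kae, kv=0.0):
    """Total active thrust P_AE = 0.5 * gamma * H^2 * (1 - kv) * K_AE."""
    return 0.5 * unit_weight * height ** 2 * (1.0 - kv) * kae
```

With kh = 0 and delta = 0 this reduces to the static Coulomb/Rankine coefficient (1 − sin phi)/(1 + sin phi), which is a quick sanity check; increasing kh raises K_AE, i.e. seismic shaking increases the active pressure on the wall.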

Kernel-based actor-critic approach with applications

  • Chu, Baek-Suk;Jung, Keun-Woo;Park, Joo-Young
    • International Journal of Fuzzy Logic and Intelligent Systems / v.11 no.4 / pp.267-274 / 2011
  • Recently, actor-critic methods have drawn significant interest in the area of reinforcement learning, and several algorithms have been studied along the lines of the actor-critic strategy. In this paper, we consider a new type of actor-critic algorithm employing kernel methods, which have recently been shown to be very effective tools in various fields of machine learning, and we investigate combining the actor-critic strategy with kernel methods. More specifically, this paper studies actor-critic algorithms utilizing kernel-based least-squares estimation and policy gradient; in the critic's part, the study uses a sliding-window-based kernel least-squares method, which leads to fast and efficient value-function estimation in a nonparametric setting. The applicability of the considered algorithms is illustrated via a robot locomotion problem and a tunnel ventilation control problem.
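
The sliding-window kernel least-squares idea in the critic can be sketched as follows. The Gaussian kernel, the ridge regularizer, and the window size are assumptions for illustration, not the paper's exact estimator.

```python
import numpy as np

def gaussian_kernel(X, Y, sigma=1.0):
    """Pairwise Gaussian (RBF) kernel matrix between rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

class SlidingWindowKLS:
    """Nonparametric value estimator fit by regularized least squares
    on only the most recent `window` (state, target) samples."""
    def __init__(self, window=50, lam=1e-4, sigma=1.0):
        self.window, self.lam, self.sigma = window, lam, sigma
        self.X, self.y = [], []

    def add(self, state, target):
        self.X.append(np.asarray(state, dtype=float))
        self.y.append(float(target))
        if len(self.X) > self.window:   # slide: drop the oldest sample
            self.X.pop(0)
            self.y.pop(0)

    def predict(self, state):
        X = np.stack(self.X)
        y = np.array(self.y)
        K = gaussian_kernel(X, X, self.sigma)
        alpha = np.linalg.solve(K + self.lam * np.eye(len(X)), y)
        k = gaussian_kernel(np.asarray(state, dtype=float)[None, :], X,
                            self.sigma)
        return float((k @ alpha)[0])
```

Keeping only a window of recent samples is what makes the estimator fast and lets it track a drifting value function, at the cost of forgetting old data.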

A Self-Designing Method of Behaviors in Behavior-Based Robotics (행위 기반 로봇에서의 행위의 자동 설계 기법)

  • Yun, Do-Yeong;O, Sang-Rok;Park, Gwi-Tae
    • Journal of Institute of Control, Robotics and Systems / v.8 no.7 / pp.607-612 / 2002
  • An automatic design method for behaviors in behavior-based robotics is proposed. With this method, a robot can design its behaviors by itself without the aid of a human designer. Automating the behavior design procedure frees the human designer from the somewhat tedious endeavor of predicting all possible situations in which the robot will work and designing a suitable behavior for each of them. A simple reinforcement learning strategy is the main frame of this method, and the key parameter of the learning process is a significant change in the reward value. A successful application to mobile robot navigation is also reported.
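
A toy sketch of the "significant reward change" trigger might look like this. The Q-table, the threshold value, and the generated behavior names are all illustrative assumptions, not the authors' design.

```python
class BehaviorDesigner:
    """Selects among a growing set of behaviors with a simple Q-table,
    and treats a large jump in reward as the signal that the current
    repertoire is insufficient, so a new behavior is added."""
    def __init__(self, threshold=0.5, alpha=0.2):
        self.q = {}                 # (situation, behavior) -> value
        self.behaviors = ["wander"]
        self.threshold = threshold
        self.alpha = alpha
        self.last_reward = 0.0

    def select(self, situation):
        """Greedy behavior selection for the given situation."""
        return max(self.behaviors,
                   key=lambda b: self.q.get((situation, b), 0.0))

    def learn(self, situation, behavior, reward):
        """Update the value estimate; on a significant reward change,
        self-design (add) a new behavior slot."""
        key = (situation, behavior)
        old = self.q.get(key, 0.0)
        self.q[key] = old + self.alpha * (reward - old)
        if abs(reward - self.last_reward) > self.threshold:
            self.behaviors.append(f"behavior_{len(self.behaviors)}")
        self.last_reward = reward
```

The point of the sketch is the trigger condition, not the learning rule: behavior creation is driven by reward dynamics rather than by a human enumerating situations in advance.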

Design of Ballistic Calculation Model for Improving Accuracy of Naval Gun Firing based on Deep Learning

  • Oh, Moon-Tak
    • Journal of the Korea Society of Computer and Information / v.26 no.12 / pp.11-18 / 2021
  • This paper shows the applicability of deep learning algorithms to predicting target position and obtaining the correction value of the impact point, in order to improve the accuracy of naval gun firing. For predicting target position, the proposed model using an LSTM model and RN structure is expected to be more accurate than the existing method using a Kalman filter. For obtaining the correction value of the impact point, another proposed model manages the factors related to ballistic calculation as a data set and learns from that data set; this model is expected to reduce the error of naval gun firing. Combining the two models, a ballistic calculation model for improving the accuracy of naval gun firing based on a deep learning algorithm was designed.
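
The correction-learning part can be caricatured as a simple stochastic-approximation update. The paper's actual model is a learned reinforcement model over ballistic factors, so the scalar miss distance, constant true offset, and step size below are purely illustrative.

```python
def update_correction(correction, observed_miss, alpha=0.3):
    """Nudge the stored impact-point correction by a fraction of the
    miss observed after each shot; as the miss shrinks, the correction
    converges toward the true ballistic offset."""
    return correction + alpha * observed_miss

# Hypothetical constant offset (meters) between aim point and impact.
true_offset = 40.0
correction = 0.0
for _ in range(30):
    miss = true_offset - correction   # miss with current correction
    correction = update_correction(correction, miss)
```

Each iteration stands in for one observed firing; the learned model in the paper plays the role of this update but conditioned on the full set of ballistic factors.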

Hybrid Corrosion Inhibitor-Based Zwitterions and Phosphate in Reinforced Concrete: Determining Chloride Threshold and Service Life (철근 콘크리트의 Zwitterion 및 인산염 기반 하이브리드 부식 억제제: 염화물 임계값 및 사용 수명 결정)

  • Tran, Duc Thanh;Jeong, Min-Goo;Lee, Han-Seung;Yang, Hyun-Min;Singh, Jitendra Kumar
    • Proceedings of the Korean Institute of Building Construction Conference / 2023.05a / pp.33-34 / 2023
  • Corrosion of reinforcement steel is a major cause of deterioration in reinforced concrete (RC) structures. In order to protect these structures from corrosion, corrosion inhibitors are added to the concrete mix. In recent years, zwitterionic compounds have shown promising results as corrosion inhibitors in concrete due to their ability to form a protective layer on the surface of the reinforcement steel. This study aims to determine the chloride threshold value and service life of zwitterion-based hybrid corrosion inhibitors in reinforced concrete. The experimental study involves preparing concrete samples with different concentrations of the hybrid corrosion inhibitor under a high concentration of chloride ions. The samples are subjected to accelerated corrosion tests in a chloride environment to determine the threshold value and service life of the corrosion inhibitor. The effect of the hybrid inhibitor on mechanical properties remains within the allowable range. The chloride threshold concentration and service life of the samples containing the hybrid inhibitor are greater than those of plain RC.


Improved Deep Q-Network Algorithm Using Self-Imitation Learning (Self-Imitation Learning을 이용한 개선된 Deep Q-Network 알고리즘)

  • Sunwoo, Yung-Min;Lee, Won-Chang
    • Journal of IKEEE / v.25 no.4 / pp.644-649 / 2021
  • Self-Imitation Learning is a simple off-policy actor-critic algorithm that makes an agent find an optimal policy by exploiting past good experiences. When Self-Imitation Learning is combined with reinforcement learning algorithms that have an actor-critic architecture, it shows performance improvement in various game environments. However, its applications have been limited to reinforcement learning algorithms with an actor-critic architecture. In this paper, we propose a method of applying Self-Imitation Learning to Deep Q-Network, a value-based deep reinforcement learning algorithm, and train it in various game environments. We also show that Self-Imitation Learning can be applied to Deep Q-Network to improve its performance, by comparing the training results of the proposed algorithm and the ordinary Deep Q-Network.
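
The core of Self-Imitation Learning is a regression term that only imitates experiences whose observed return beat the current estimate. A plain-Python sketch over Q-values (list-based, no neural network, which is an obvious simplification of the proposed algorithm):

```python
def sil_loss(q_values, actions, returns):
    """Self-Imitation term: L = mean(max(R - Q(s, a), 0)^2).
    Only transitions whose observed return R exceeded the current
    estimate Q(s, a) contribute a learning signal; worse-than-expected
    experiences are simply ignored rather than imitated."""
    total = 0.0
    for q_row, a, ret in zip(q_values, actions, returns):
        advantage = max(ret - q_row[a], 0.0)
        total += advantage * advantage
    return total / len(actions)
```

In a DQN-style learner this term would be minimized alongside the usual TD loss, pulling Q(s, a) up toward returns from past good episodes.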