• Title/Summary/Keyword: reinforcement method

Design of Reinforcement Learning Controller with Self-Organizing Map (자기 조직화 맵을 이용한 강화학습 제어기 설계)

  • 이재강;김일환
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.53 no.5
    • /
    • pp.353-360
    • /
    • 2004
  • This paper considers reinforcement learning control with a self-organizing map. Reinforcement learning uses the observable states of the objective system, together with the signals produced by the interaction between the system and its environment, as input data. For fast learning in neural network training, the amount of learning data must be reduced. In this paper, we use a self-organizing map to partition the observable states; this partitioning reduces the number of samples used for training the neural networks. The neural dynamic programming design method is used for the controller. To evaluate the designed reinforcement learning controller, an inverted pendulum on a cart is simulated. The designed controller consists of a self-organizing map connected in series with two multilayer feed-forward neural networks.
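
The state-partitioning idea in this abstract can be sketched as follows; this is an illustrative reconstruction, not the authors' code, and the SOM size, learning-rate schedule, and 4-D cart-pole observation layout are assumptions:

```python
# Sketch: a self-organizing map quantizes continuous observations into a
# small set of prototype states, shrinking the data a downstream neural
# controller must learn from.
import numpy as np

rng = np.random.default_rng(0)

def train_som(samples, n_nodes=16, epochs=20, lr0=0.5):
    """Train a 1-D SOM over observed states (rows of `samples`)."""
    nodes = samples[rng.choice(len(samples), n_nodes, replace=False)].astype(float)
    for epoch in range(epochs):
        lr = lr0 * (1.0 - epoch / epochs)                 # decaying learning rate
        radius = max(1, int(n_nodes // 2 * (1.0 - epoch / epochs)))
        for x in samples:
            winner = int(np.argmin(np.linalg.norm(nodes - x, axis=1)))
            for j in range(n_nodes):
                if abs(j - winner) <= radius:             # neighborhood update
                    h = np.exp(-((j - winner) ** 2) / (2.0 * radius ** 2))
                    nodes[j] += lr * h * (x - nodes[j])
    return nodes

def partition(x, nodes):
    """Map a raw observation to its SOM cell index (the reduced state)."""
    return int(np.argmin(np.linalg.norm(nodes - x, axis=1)))

# e.g. 4-D cart-pole observations: position, velocity, angle, angular velocity
obs = rng.normal(size=(500, 4))
som = train_som(obs)
state_id = partition(obs[0], som)   # index fed to the neural controllers
```

The SOM index, rather than the raw 4-D observation, becomes the input to the two feed-forward networks, which is where the reduction in learning data comes from.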

Analysis of Stress Intensity Factor for the Cracked Plate Reinforced with a Sheet by Seam Welding (심용접에 의한 판재로 보강된 균열판의 응력세기계수 해석)

  • 김옥환;박성두;이영호
    • Journal of Welding and Joining
    • /
    • v.16 no.1
    • /
    • pp.63-69
    • /
    • 1998
  • The stress intensity factor has been calculated theoretically for a cracked plate subjected to remote normal stress and reinforced with a sheet by symmetric seam welding. A singular integral equation was derived from the displacement compatibility condition between the cracked plate and the reinforcement sheet, and solved by means of Erdogan and Gupta's method. The stress intensity factors obtained from the derived equation were compared with FEM solutions and seem reasonable. The reinforcement effect improves as the welding line moves closer to the crack and as the stiffness ratio between the cracked plate and the reinforcement sheet increases.
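
For orientation, the baseline against which the reinforcement effect is measured is the classical center-crack result of linear elastic fracture mechanics (a textbook formula, not the paper's derived equation):

```latex
K_I = \sigma \sqrt{\pi a}
```

where \(\sigma\) is the remote normal stress and \(a\) the half crack length; the paper's singular integral equation then yields a reduced effective stress intensity factor once the welded sheet carries part of the load.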

Avoiding collaborative paradox in multi-agent reinforcement learning

  • Kim, Hyunseok;Kim, Hyunseok;Lee, Donghun;Jang, Ingook
    • ETRI Journal
    • /
    • v.43 no.6
    • /
    • pp.1004-1012
    • /
    • 2021
  • Productive collaboration among multiple agents has become an emerging issue in real-world applications. In reinforcement learning, multi-agent environments present challenges beyond those that are tractable in single-agent settings. Such collaborative environments have highly complex attributes: sparse rewards for task completion, limited communication between agents, and only partial observations. In particular, adjustments to one agent's action policy make the environment nonstationary from the other agents' perspective, which causes high variance in the learned policies and prevents the direct use of single-agent reinforcement learning approaches. Unexpected social loafing caused by this high dispersion makes it difficult for all agents to succeed in collaborative tasks. We therefore address a paradox, caused by social loafing, in which total returns drop significantly after a certain timestep of multi-agent reinforcement learning. We further demonstrate that this collaborative paradox can be avoided by our proposed early-stop method, which leverages a metric for social loafing.
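
The early-stop idea can be sketched as below; the dispersion-of-returns metric, threshold, and patience are illustrative assumptions, not the authors' exact measure of social loafing:

```python
# Sketch: stop multi-agent training early when per-agent returns become so
# unevenly distributed that "social loafing" is likely under way.
import numpy as np

class LoafingEarlyStop:
    def __init__(self, threshold=0.5, patience=3):
        self.threshold = threshold   # max tolerated dispersion
        self.patience = patience     # consecutive violations before stopping
        self.violations = 0

    def loafing_metric(self, agent_returns):
        """Coefficient of variation of per-agent returns in one episode."""
        r = np.asarray(agent_returns, dtype=float)
        mean = r.mean()
        return 0.0 if mean == 0 else float(r.std() / abs(mean))

    def should_stop(self, agent_returns):
        """Track consecutive high-dispersion episodes; stop after `patience`."""
        if self.loafing_metric(agent_returns) > self.threshold:
            self.violations += 1
        else:
            self.violations = 0
        return self.violations >= self.patience

stopper = LoafingEarlyStop(threshold=0.5, patience=2)
balanced = stopper.should_stop([10.0, 9.5, 10.2])   # low dispersion, keep training
```

Checkpointing the policies at the stop point would then avoid the post-peak collapse in total return that the paper describes.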

Q-learning for intersection traffic flow Control based on agents

  • Zhou, Xuan;Chong, Kil-To
    • Proceedings of the IEEK Conference
    • /
    • 2009.05a
    • /
    • pp.94-96
    • /
    • 2009
  • In this paper, we present a Q-learning method for adaptive traffic signal control on the basis of multi-agent technology. The structure is composed of six phase agents and one intersection agent. A wireless communication network enables the agents to cooperate. Q-learning, a kind of reinforcement learning, is adopted as the algorithm of the control mechanism because it can acquire optimal control strategies from delayed rewards; furthermore, we adopt a dynamic learning method instead of a static one, which is more practical. Simulation results indicate that the approach is more effective than a traditional signal system.
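
A tabular version of the phase-agent Q-learning update might look like the following; the state discretization, reward, and toy dynamics are stand-ins for the authors' intersection simulation:

```python
# Sketch: states are discretized queue lengths, actions are the six signal
# phases, and a delayed reward drives the standard Q-learning backup.
import random

N_STATES, N_ACTIONS = 10, 6            # queue-length bins x signal phases
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]

def choose_action(state):
    """Epsilon-greedy selection over the phase actions."""
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: Q[state][a])

def update(state, action, reward, next_state):
    """One Q-learning backup: Q <- Q + alpha * (r + gamma * max Q' - Q)."""
    best_next = max(Q[next_state])
    Q[state][action] += ALPHA * (reward + GAMMA * best_next - Q[state][action])

random.seed(0)
state = 5
for _ in range(200):                   # toy rollout: reward favors phase 2
    action = choose_action(state)
    reward = 1.0 if action == 2 else -0.1
    next_state = random.randrange(N_STATES)
    update(state, action, reward, next_state)
    state = next_state
```

The "dynamic" aspect in the abstract amounts to continuing these updates online as traffic conditions change, rather than freezing a pre-trained table.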

Self-Organized Reinforcement Learning Using Fuzzy Inference for Stochastic Gradient Ascent Method

  • Wong, K.K.;Katuki, Akio
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 2001.10a
    • /
    • pp.96.3-96
    • /
    • 2001
  • In this paper, a stochastic gradient ascent method using self-organization and fuzzy inference is proposed. Fuzzy rules and fuzzy sets are added autonomously, as the occasion demands, according to the observed information, and two rules (or two fuzzy sets) that become similar to each other as learning progresses are unified. This unification reduces the number of parameters and the learning time. By using fuzzy inference and forming rules with an appropriate state division, the proposed method makes it possible to construct a robust reinforcement learning system.
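
The rule-unification step described here can be sketched as merging Gaussian fuzzy sets that have grown similar during learning; the similarity measure and threshold below are illustrative assumptions:

```python
# Sketch: fuzzy sets are (center, width) pairs of Gaussian membership
# functions; near-duplicate sets are merged, shrinking the parameter count.
import math

def similarity(set_a, set_b):
    """Closeness of two Gaussian membership functions (center, width)."""
    (c1, w1), (c2, w2) = set_a, set_b
    return math.exp(-((c1 - c2) ** 2) / (w1 * w2 + 1e-9))

def unify(fuzzy_sets, threshold=0.9):
    """Merge any set into an existing one when similarity exceeds the threshold."""
    merged = []
    for s in fuzzy_sets:
        for i, m in enumerate(merged):
            if similarity(s, m) > threshold:
                # replace by the average set: one rule where two used to be
                merged[i] = ((s[0] + m[0]) / 2, (s[1] + m[1]) / 2)
                break
        else:
            merged.append(s)
    return merged

sets = [(0.0, 1.0), (0.05, 1.0), (2.0, 1.0)]
print(len(unify(sets)))   # the two near-identical sets collapse -> 2
```

Running unification periodically keeps the rule base compact, which is what shortens learning time in the abstract's argument.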

Application Assessment of FRP Grouting Method (FRP보강 그라우팅공법의 적용성 평가)

  • 박종호;오명렬;이재덕;박용원
    • Proceedings of the Korean Geotechnical Society Conference
    • /
    • 2001.10b
    • /
    • pp.60-76
    • /
    • 2001
  • For the ground reinforcement technique that has been widely applied in civil engineering and construction, no guidelines have been established for choosing an efficient method of evaluating the grouting effect, and in practice little effort has been made to determine the reinforcement effect quantitatively. This paper presents results from field tests of the FRP pressure grouting method carried out at a collapsed slope to verify its improvement effect.

Evaluation on the Maximum Yield Strength of Steel Stirrups in Reinforced Concrete Beams (철근콘크리트 보에 사용된 전단보강철근의 항복강도 제한에 대한 평가)

  • Lee, Jin-Eun;Lee, Jung-Yoon
    • Journal of the Korea Concrete Institute
    • /
    • v.24 no.6
    • /
    • pp.685-693
    • /
    • 2012
  • The yield strength of shear reinforcement is restricted in present design codes. In this study, the possibility of increasing the yield strength of shear reinforcement is evaluated according to ACI318-08, EC2-02, and CSA-04 by comparing experimental and calculated results. Three cases were used to analyze the shear strength of the beams: one with no limitation on the yield strength of the shear reinforcement, another with a restriction on the yield strength, and a third with restrictions on both the yield strength and the shear reinforcement ratio. The results showed that the case with unlimited shear reinforcement yield strength predicted the test results better than the other two cases. Even when a rebar yield strength higher than that required by the present code was applied to the existing shear design equation, the result was reasonable. Therefore, the design equation appears appropriate even if high-strength shear reinforcement is used in practice based on the existing shear design method.

Experimental Reinforcement Agent for Damaged Walls of Payathonzu Temple Murals in Bagan, Myanmar

  • Lee, Na Ra;Lee, Hwa Soo;Han, Kyeong Soon
    • Journal of Conservation Science
    • /
    • v.36 no.4
    • /
    • pp.284-295
    • /
    • 2020
  • This study focuses on reinforcement agents for wall damage, such as cracks, breakage, or delamination, in the mural paintings of the Payathonzu temple. Experiments were conducted with filling and grouting agents according to the reinforcing method. In the filling reinforcement experiment, different mixing ratios of lime to sand and different additives (jaggery, seaweed glue, and Primal SF-016) were used. In the grouting reinforcement experiment, the mixing ratio of lime to pozzolan was fixed, and the additive types were identical to those in the filling experiment. The filling experiment showed fewer physical changes, such as contraction, at a greater mixing ratio of lime to sand; however, the compressive strength decreased as the mixing ratio increased. With additives, the change in the volume of the agent decreased and the compressive strength increased, which was especially prominent for jaggery and Primal SF-016. The grouting experiment showed remarkable contraction with an increased amount of moisture, which originates from the flowability that grouting agents require. With additives, the water content of the agent decreased, whereas the compressive strength and adhesion increased. Among the additives, Primal SF-016 exhibited the highest compressive strength, and seaweed glue exhibited the greatest viscosity and adhesion. The results showed that the characteristics of reinforcement agents vary with the mixing ratio and additives of the filling and grouting agents. Therefore, for conservation treatments, the mixing ratio and additives of each reinforcement agent should be selected with the wall damage in mind.

Suspension Control using Reinforcement Learning (강화학습에 의한 현가장치의 제어)

  • Jeong, Gyu-Baek;Mun, Yeong-Jun;Park, Ju-Yeong
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2007.11a
    • /
    • pp.163-166
    • /
    • 2007
  • Recently, research on reinforcement learning has been actively pursued in the artificial intelligence field both in Korea and abroad. In this paper, we apply a reinforcement learning technique based on an RLS-driven NAC (natural actor-critic) to the control of an active suspension (active-suspension) and verify its performance through simulation.
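
A plain one-step actor-critic on a toy two-state task gives the flavor of the approach; the paper's RLS-based natural actor-critic is considerably more sophisticated, and everything below (task, learning rates, logistic policy) is an illustrative assumption:

```python
# Sketch: the critic learns state values from TD errors, and the actor
# nudges its policy parameter in the direction the TD error recommends.
import math
import random

random.seed(1)
theta = 0.0        # actor parameter: preference for action 1
v = [0.0, 0.0]     # critic: value estimates for two states
ALPHA_A, ALPHA_C, GAMMA = 0.05, 0.1, 0.9

def policy_prob(theta):
    """Probability of taking action 1 under a logistic policy."""
    return 1.0 / (1.0 + math.exp(-theta))

for _ in range(500):
    state = random.randrange(2)
    p = policy_prob(theta)
    action = 1 if random.random() < p else 0
    reward = 1.0 if action == 1 else 0.0      # toy task: action 1 is better
    next_state = random.randrange(2)
    td_error = reward + GAMMA * v[next_state] - v[state]
    v[state] += ALPHA_C * td_error            # critic update
    grad_logpi = action - p                   # d log pi(a) / d theta
    theta += ALPHA_A * td_error * grad_logpi  # actor update
# after training, the policy should prefer the rewarded action
```

An NAC replaces `grad_logpi * td_error` with a natural-gradient estimate (here, one the paper computes via recursive least squares), which is what speeds convergence on the suspension model.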

A reinforcement learning-based method for the cooperative control of mobile robots (강화 학습에 의한 소형 자율 이동 로봇의 협동 알고리즘 구현)

  • 김재희;조재승;권인소
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 1997.10a
    • /
    • pp.648-651
    • /
    • 1997
  • This paper proposes methods for the cooperative control of multiple mobile robots and constructs a robotic soccer system in which the cooperation is implemented as a pass play between two robots. To play a soccer game, elementary actions such as shooting and moving were designed, and Q-learning, one of the popular reinforcement learning methods, is used to determine which actions to take. In simulation, learning succeeds when the ball and robots are deliberately arranged initially, and the cooperative task can thereby be accomplished.