• Title/Abstract/Keywords: Self reinforcement

Search results: 282 items (processing time: 0.029 s)

자기 조직화 맵을 이용한 강화학습 제어기 설계 (Design of Reinforcement Learning Controller with Self-Organizing Map)

  • 이재강;김일환 / 대한전기학회논문지:시스템및제어부문D / Vol. 53, No. 5 / pp. 353-360 / 2004
  • This paper considers reinforcement learning control with a self-organizing map. Reinforcement learning uses the observable states of the objective system and signals from the interaction of the system and its environment as input data. For fast learning in neural network training, it is necessary to reduce the amount of learning data. In this paper, we use the self-organizing map to partition the observable states. Partitioning the states reduces the number of learning data used for training the neural networks. The neural dynamic programming design method is used for the controller. To evaluate the designed reinforcement learning controller, an inverted pendulum on a cart system is simulated. The designed controller is composed of a serial connection of the self-organizing map and two multi-layer feed-forward neural networks.
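The state-partitioning idea above can be sketched in a few lines: a small self-organizing map quantizes continuous cart-pole-like observations into a handful of prototype cells, so a learner can train on cell indices instead of raw states. This is a minimal illustrative sketch; the node count, gain, and ring neighborhood are assumptions, not the paper's settings.

```python
import random

def train_som(states, n_nodes=8, lr=0.3, epochs=20, seed=0):
    """Fit 1-D SOM prototype vectors that partition the observed states."""
    rng = random.Random(seed)
    dim = len(states[0])
    nodes = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(n_nodes)]
    for _ in range(epochs):
        for s in states:
            # best-matching unit: prototype closest to the sample
            bmu = min(range(n_nodes),
                      key=lambda i: sum((nodes[i][d] - s[d]) ** 2 for d in range(dim)))
            # move the winner and its ring neighbours toward the sample
            for j in (bmu - 1, bmu, bmu + 1):
                if 0 <= j < n_nodes:
                    w = lr if j == bmu else lr / 2
                    nodes[j] = [nodes[j][d] + w * (s[d] - nodes[j][d]) for d in range(dim)]
    return nodes

def quantize(nodes, s):
    """Map a continuous state to its SOM cell index."""
    return min(range(len(nodes)),
               key=lambda i: sum((nodes[i][d] - s[d]) ** 2 for d in range(len(s))))

rng = random.Random(1)
states = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(200)]
nodes = train_som(states)
cells = {quantize(nodes, s) for s in states}
print(len(states), "states ->", len(cells), "SOM cells")
```

The controller then only has to learn one action per cell, which is the data reduction the abstract refers to.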

병원간호사의 셀프리더십 강화 프로그램의 효과 (The Effects of Self-leadership Reinforcement Program for Hospital Nurses)

  • 박은하;채영란 / Journal of Korean Biological Nursing Science / Vol. 20, No. 2 / pp. 132-140 / 2018
  • Purpose: This study was carried out to develop and verify the effects of a self-leadership reinforcement program for hospital nurses. Methods: The research design was a non-equivalent control group pre-posttest design. Participants were 64 nurses (32 in each group) working at a university hospital, with less than five years of job experience. The experimental group received two hours of the self-leadership reinforcement program once per week for four weeks. The pre- and post-test questionnaires covered general characteristics, transfer motivation for learning, self-leadership, communication ability, clinical nursing competency, organizational commitment, and turnover intention. Results: There was a significant difference in self-leadership scores between the experimental group and the control group (F = 15.10, p < .001). There were also significant differences between the two groups in transfer motivation for learning (t = -5.44, p < .001), communication ability (F = 15.29, p < .001), clinical nursing competency (F = 15.23, p < .001), and organizational commitment scores (F = 7.21, p = .009). Conclusion: The self-leadership reinforcement program developed in this study was effective in improving self-leadership, communication ability, clinical nursing competency, and organizational commitment. Implementing the program at the clinical level can thus provide a basis for nursing personnel resource administration.

Self-Imitation Learning을 이용한 개선된 Deep Q-Network 알고리즘 (Improved Deep Q-Network Algorithm Using Self-Imitation Learning)

  • 선우영민;이원창 / 전기전자학회논문지 / Vol. 25, No. 4 / pp. 644-649 / 2021
  • Self-Imitation Learning is a simple off-policy actor-critic algorithm that lets an agent exploit its good past experiences to find an optimal policy. Combined with reinforcement learning algorithms that have an actor-critic structure, it has shown substantial improvements in a variety of environments. However, even though Self-Imitation Learning greatly benefits reinforcement learning, its application has been limited to reinforcement learning algorithms with an actor-critic architecture. In this paper, we propose a method for applying Self-Imitation Learning to DQN, a value-based reinforcement learning algorithm, and train the resulting DQN with Self-Imitation Learning in various environments. By comparing the results with the existing ones, we show that Self-Imitation Learning can also be applied to DQN and can improve its performance.
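The core of Self-Imitation Learning is an update that imitates only past experience whose return exceeded the current value estimate. A minimal sketch of that loss for a value-based, DQN-style learner, using a toy tabular Q dictionary for clarity (all names here are illustrative, not the paper's implementation):

```python
def sil_loss(q, episode, gamma=0.99):
    """Self-imitation loss over one past episode.
    q: dict mapping (state, action) -> current Q estimate.
    episode: list of (state, action, reward) transitions.
    Only transitions whose discounted return beats Q contribute."""
    returns, ret = [], 0.0
    for _, _, r in reversed(episode):      # discounted return-to-go
        ret = r + gamma * ret
        returns.append(ret)
    returns.reverse()
    loss = 0.0
    for (s, a, _), R in zip(episode, returns):
        adv = R - q.get((s, a), 0.0)
        if adv > 0:                        # imitate only good experience
            loss += adv ** 2
    return loss

# Toy check: empty Q, undiscounted two-step episode with returns 2 and 1.
episode = [("s0", 1, 1.0), ("s1", 0, 1.0)]
print(sil_loss({}, episode, gamma=1.0))    # prints 5.0
```

The clipped advantage `max(R - Q, 0)` is what keeps the update one-sided: good past trajectories pull Q up, while bad ones are ignored rather than imitated.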

Experimental and numerical investigation on self-compacting concrete column with CFRP-PVC spiral reinforcement

  • Chen, Zongping;Xu, Ruitian / Earthquakes and Structures / Vol. 22, No. 1 / pp. 39-51 / 2022
  • The axial compression behavior of nine self-compacting concrete columns confined with CFRP-PVC spirals was studied. Three parameters were examined: spiral reinforcement spacing, spiral reinforcement diameter, and height-to-diameter ratio. The test results show that the CFRP strip and PVC tube fail first, and the spiral and longitudinal reinforcement then yield. As the spiral reinforcement spacing increases, the peak bearing capacity decreases but the ductility increases; as the spiral reinforcement diameter increases, the peak bearing capacity increases with little effect on ductility; and the specimen with a height-to-diameter ratio of 7.5 has the best mechanical properties. Using appropriate constitutive relations for the materials, a finite element model of the axial compression test was established. Based on the verified finite element model, the stress mechanism is revealed. Finally, a composite confinement model and a bearing capacity calculation method are proposed.

신경회로망을 이용한 도립진자의 학습제어 (Learning Control of Inverted Pendulum Using Neural Networks)

  • 이재강;김일환 / 산업기술연구 / Vol. 24, No. A / pp. 99-107 / 2004
  • This paper considers reinforcement learning control with the self-organizing map. Reinforcement learning uses the observable states of the objective system and signals from the interaction of the system and its environment as input data. For fast learning in neural network training, it is necessary to reduce the amount of learning data. In this paper, we use the self-organizing map to partition the observable states. Partitioning the states reduces the number of learning data used for training the neural networks. The neural dynamic programming design method is used for the controller. To evaluate the designed reinforcement learning controller, an inverted pendulum on a cart system is simulated. The designed controller is composed of a serial connection of the self-organizing map and two multi-layer feed-forward neural networks.

Reinforcement Learning Control using Self-Organizing Map and Multi-layer Feed-Forward Neural Network

  • Lee, Jae-Kang;Kim, Il-Hwan / 제어로봇시스템학회:학술대회논문집 / ICCAS 2003 / pp. 142-145 / 2003
  • Many control applications using neural networks need a priori information about the objective system, but it is impossible to get exact information about the objective system in the real world. To solve this problem, several control methods have been proposed; reinforcement learning control using neural networks is one of them. Basically, reinforcement learning control does not need a priori information about the objective system. This method uses the reinforcement signal from the interaction of the objective system and its environment, together with the observable states of the objective system, as input data. However, many methods take too much time to learn, which prevents their application to the real world, so we focus on faster learning. Two data types are used for reinforcement learning. One is the reinforcement signal, which takes only two fixed scalar values, one assigned to success and one to failure. The other is the observable state data. A real-world system has infinitely many states, so the number of observable state data is also infinite, which requires too much learning time for real-world application. We therefore reduce the number of observable states by classifying them with a self-organizing map. We also use neural dynamic programming for the controller design. An inverted pendulum on a cart system is simulated. A failure signal is used as the reinforcement signal; it occurs when the pendulum angle or cart position deviates from the defined control range. The control objective is to keep the pole balanced and the cart centered. Four states, namely the position and velocity of the cart and the angle and angular velocity of the pole, are used as the state signal. The learning controller is composed of a serial connection of a self-organizing map and two multi-layer feed-forward neural networks.
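The two-valued reinforcement signal described above is simple to express in code. In this sketch the bounds (12 degrees, 2.4 m) are common cart-pole defaults used here as assumptions, not necessarily the paper's defined control range:

```python
import math

ANGLE_LIMIT = math.radians(12)   # pole-angle bound (assumed)
POS_LIMIT = 2.4                  # cart-position bound in metres (assumed)

def reinforcement_signal(state):
    """state = (x, x_dot, theta, theta_dot).
    Returns one of the two fixed scalar values: -1.0 on failure, 0.0 otherwise."""
    x, _, theta, _ = state
    failed = abs(x) > POS_LIMIT or abs(theta) > ANGLE_LIMIT
    return -1.0 if failed else 0.0

print(reinforcement_signal((0.0, 0.0, 0.0, 0.0)))   # balanced: prints 0.0
print(reinforcement_signal((3.0, 0.0, 0.0, 0.0)))   # cart out of range: prints -1.0
```

Because the signal is binary and sparse, the learner sees almost no gradient until a failure occurs, which is exactly why the abstract emphasizes reducing the state space to speed up learning.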

시불변 학습계수와 이진 강화 함수를 가진 자기 조직화 형상지도 신경회로망의 동적특성 (The dynamics of self-organizing feature map with constant learning rate and binary reinforcement function)

  • 석진욱;조성원 / 제어로봇시스템학회논문지 / Vol. 2, No. 2 / pp. 108-114 / 1996
  • We present proofs of the stability and convergence of the self-organizing feature map (SOFM) neural network with a time-invariant learning rate and a binary reinforcement function. One of the major problems in the self-organizing feature map concerns the learning rate, analogous to the "Kalman filter" gain in the stochastic control field, which is a monotone decreasing function that converges to 0 in order to satisfy the minimum variance property. In this paper, we show the stability and convergence of the self-organizing feature map neural network with a time-invariant learning rate. The analysis of the proposed algorithm shows that stability and convergence are guaranteed, with exponential stability and weak convergence properties as well.

일정 학습계수와 이진 강화함수를 가진 자기 조직화 형상지도 신경회로망 (Self-Organizing Feature Map with Constant Learning Rate and Binary Reinforcement)

  • 조성원;석진욱 / 전자공학회논문지B / Vol. 32B, No. 1 / pp. 180-188 / 1995
  • A modified Kohonen self-organizing feature map (SOFM) algorithm with a binary reinforcement function and a constant learning rate is proposed. In contrast to the time-varying adaptation gain of the original Kohonen SOFM algorithm, the proposed algorithm uses a constant adaptation gain and adds a binary reinforcement function to compensate for the lowered learning ability of the SOFM due to the constant learning rate. Since the proposed algorithm does not involve complicated multiplications, its digital hardware implementation is much easier than that of the original SOFM.
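One plausible reading of a constant-gain update with a binary reinforcement term is the following one-dimensional sketch, where the winning node receives +1 and all other nodes -1. The paper's exact reinforcement function and neighborhood handling may differ; this is an assumption for illustration only.

```python
def sofm_step(weights, x, gain=0.1):
    """One SOFM update with a constant gain: the best-matching node is
    attracted to the input, the other nodes are repelled (binary +/-1 term)."""
    bmu = min(range(len(weights)), key=lambda i: abs(weights[i] - x))
    return [w + gain * (1.0 if i == bmu else -1.0) * (x - w)
            for i, w in enumerate(weights)]

w = [0.0, 1.0]
w = sofm_step(w, 0.1)
print(w)   # winner 0 moves toward the input; node 1 is pushed away
```

Note that the update uses only a fixed gain and a sign flip per node, which is consistent with the abstract's point that avoiding time-varying multipliers simplifies digital hardware implementation.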

Controller Learning Method of Self-driving Bicycle Using State-of-the-art Deep Reinforcement Learning Algorithms

  • Choi, Seung-Yoon;Le, Tuyen Pham;Chung, Tae-Choong / 한국컴퓨터정보학회논문지 / Vol. 23, No. 10 / pp. 23-31 / 2018
  • Recently, there have been many studies on machine learning, and among them, studies on reinforcement learning are being actively conducted. In this study, we propose a controller that controls a bicycle using the DDPG (Deep Deterministic Policy Gradient) algorithm, a state-of-the-art deep reinforcement learning method. We redefine the reward function for the bicycle dynamics and the neural networks used to train the agent. When the proposed method is used for learning and control, the bicycle can be kept from falling over and can reach a given distant destination, which the existing method cannot do. For the performance evaluation, we experimented in various environments, such as fixed speed, random speed, a given target point, and no fixed target. The results confirm that the proposed algorithm shows better performance than the conventional neural network algorithms NAF and PPO.
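The deterministic policy-gradient idea at the heart of DDPG can be seen in a stripped-down form: a linear actor a = k·s, a fixed known critic Q(s, a) = -(a - s)², and gradient ascent on the actor parameter through the critic. This toy omits the replay buffer, target networks, and learned critic of real DDPG; every name and value here is illustrative.

```python
def dpg_train(k=0.0, lr=0.05, steps=200):
    """Gradient ascent on actor parameter k through a fixed critic.
    With Q(s, a) = -(a - s)^2, the optimum is a = s, i.e. k = 1."""
    states = [0.5, 1.0, -0.8, 1.5]
    for _ in range(steps):
        for s in states:
            a = k * s
            dq_da = -2.0 * (a - s)   # critic gradient w.r.t. the action
            k += lr * dq_da * s      # chain rule: da/dk = s
    return k

print(round(dpg_train(), 3))   # converges to the optimal gain 1.0
```

The key design choice DDPG adds over this toy is that the critic itself is learned from off-policy transitions, so the actor update ascends an estimated Q rather than a known one.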

고유동 자기충전 콘크리트와 이형철근의 부착특성 (The Bond Characteristics of Deformed Bars in High Flowing Self-Compacting Concrete)

  • 최연왕;정재권;김경환;안태호 / 대한토목학회논문집 / Vol. 29, No. 5A / pp. 511-518 / 2009
  • In this study, specimens were fabricated in which the position of the reinforcing bar, one of the factors affecting the bond strength between concrete and reinforcement, was varied among horizontal reinforcement at the bottom position (HB), horizontal reinforcement at the top position (HT), and vertical reinforcement (V). The bond characteristics of deformed bars in high flowing self-compacting concrete (HSCC) and conventional concrete (CC) were then compared and analyzed for three levels of concrete strength (30, 50, and 70 MPa). To evaluate the top-bar factor of HSCC and CC, the HB/HT bond strength ratio was measured; for 50 and 70 MPa the HB/HT ratio was below 1.3, while for 30 MPa it was 1.2 for HSCC and 2.1 for CC. Therefore, for HSCC at 30, 50, and 70 MPa, it is considered desirable to apply a development-length modification factor for horizontal top reinforcement smaller than the value of 1.3 used for CC in the concrete structure design code (2007).