Search | Korea Science

Designing an Efficient Reward Function for Robot Reinforcement Learning of The Water Bottle Flipping Task (보틀플리핑의 로봇 강화학습을 위한 효과적인 보상 함수의 설계)

Yang, Young-Ha;Lee, Sang-Hyeok;Lee, Cheol-Soo
- The Journal of Korea Robotics Society, v.14 no.2, pp.81-86, 2019
Robots are used at various industrial sites, but traditional methods of operating a robot are limited for some kinds of tasks. For a robot to accomplish a task, an accurate formula relating the robot and its environment must be found and solved, which is complicated work. Accordingly, reinforcement learning for robots is being actively studied to overcome these difficulties. This study describes the process and results of learning to solve a task by applying reinforcement learning. The task the robot learns is bottle flipping, an activity that involves throwing a plastic bottle in an attempt to land it upright on its bottom. The complex movement of the liquid in the bottle while it is in the air makes this task difficult to solve in traditional ways; reinforcement learning makes it easier. After a 3-DOF robotic arm is instructed how to throw the bottle, the robot finds better motions that succeed at the task. Two reward functions are designed, and their learning results are compared. The finite difference method is used to obtain the policy gradient. This paper focuses on the process of designing an efficient reward function to improve the bottle flipping motion.
https://doi.org/10.7746/jkros.2019.14.2.081
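
As a rough illustration of the finite difference policy gradient mentioned in the abstract above, the following minimal sketch estimates the gradient of a reward with respect to motion parameters by central differences and then climbs it. The reward function, the three-parameter motion vector, the target values, and the step sizes are all hypothetical stand-ins, not the reward functions or robot model from the paper.

```python
# Hypothetical "ideal" motion parameters for a 3-DOF throw (illustration only).
TARGET = [0.5, -0.2, 1.0]


def toy_reward(theta):
    """Stand-in reward: highest when the parameters match TARGET."""
    return -sum((t - g) ** 2 for t, g in zip(theta, TARGET))


def finite_difference_gradient(reward_fn, theta, epsilon=1e-2):
    """Estimate d(reward)/d(theta_k) by perturbing one parameter at a time."""
    grad = []
    for k in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[k] += epsilon
        minus[k] -= epsilon
        # Central difference: two reward evaluations (two "throws") per parameter.
        grad.append((reward_fn(plus) - reward_fn(minus)) / (2 * epsilon))
    return grad


def improve(theta, steps=200, learning_rate=0.1):
    """Gradient ascent on the reward using the finite difference estimate."""
    for _ in range(steps):
        grad = finite_difference_gradient(toy_reward, theta)
        theta = [t + learning_rate * g for t, g in zip(theta, grad)]
    return theta


if __name__ == "__main__":
    print(improve([0.0, 0.0, 0.0]))
```

On a real robot each reward evaluation is a physical trial, so the number of perturbations per update is usually the dominant cost; that is one reason the shape of the reward function matters so much for this kind of method.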

Digital Twin and Visual Object Tracking using Deep Reinforcement Learning (심층 강화학습을 이용한 디지털트윈 및 시각적 객체 추적)

Park, Jin Hyeok;Farkhodov, Khurshedjon;Choi, Piljoo;Lee, Suk-Hwan;Kwon, Ki-Ryong
- Journal of Korea Multimedia Society, v.25 no.2, pp.145-156, 2022
Object tracking models for hardware applications are increasingly required to cope with diverse, unpredictable tracking environments using multifunctional algorithms. In this paper, we propose a virtual city environment built with AirSim (Aerial Informatics and Robotics Simulation; CityEnvironment) and apply the DQN (deep Q-network) deep reinforcement learning model in that virtual environment. The proposed object tracking DQN observes the environment through a deep reinforcement learning model that receives continuous images captured by the virtual environment simulation system as input and controls the operation of a virtual drone. The deep reinforcement learning model is pre-trained on various existing continuous image sets. Because these existing image sets capture real environments and objects, the system is implemented in 3D to track the virtual environment and the moving objects in it.
https://doi.org/10.9717/kmms.2022.25.2.145

An Artificial Intelligence Game Agent Using CNN Based Records Learning and Reinforcement Learning (CNN 기반 기보학습 및 강화학습을 이용한 인공지능 게임 에이전트)

Jeon, Youngjin;Cho, Youngwan
- Journal of IKEEE, v.23 no.4, pp.1187-1194, 2019
This paper proposes a CNN architecture as the value function network of an artificial intelligence Othello game agent, together with a learning scheme based on a reinforcement learning algorithm. We propose an approach that constructs the value function network by using a CNN to learn the records of professional players' real games, and an approach that improves the network parameters by learning from self-play using a reinforcement learning algorithm. The performance of the CNN value function network was compared with that of an existing ANN by having two agents, one using each network, play against each other. The winning rate of the CNN agent was 69.7% and 72.1% as black and white, respectively. In addition, applying reinforcement learning improved the agent's performance, yielding winning rates of 100% and 78%, respectively, against the network-based agent without reinforcement learning.
https://doi.org/10.7471/ikeee.2019.23.4.1187

Reinforcement Learning for Node-disjoint Path Problem in Wireless Ad-hoc Networks (무선 애드혹 네트워크에서 노드분리 경로문제를 위한 강화학습)

Jang, Kil-woong
- Journal of the Korea Institute of Information and Communication Engineering, v.23 no.8, pp.1011-1017, 2019
This paper proposes a reinforcement learning method to solve the node-disjoint path problem, which establishes multiple paths for reliable data transmission in wireless ad-hoc networks. The node-disjoint path problem is that of determining multiple paths between the source and the destination such that their intermediate nodes do not overlap. We propose an optimization method that considers transmission distance in a large-scale wireless ad-hoc network, using Q-learning, a reinforcement learning technique within machine learning. Solving the node-disjoint path problem in a large-scale wireless ad-hoc network requires a large amount of computation, but the proposed method efficiently obtains appropriate results by learning the paths. Its performance is evaluated in terms of the transmission distance needed to establish two node-disjoint paths. The evaluation results show better transmission-distance performance than conventional simulated annealing.
https://doi.org/10.6109/jkiice.2019.23.8.1011
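
To make the idea in the abstract above concrete, here is a minimal sketch of tabular Q-learning used to find two node-disjoint paths on a tiny network. It is an assumption-laden toy, not the paper's algorithm: the six-link network `LINKS`, the reward (negative link distance), the hyperparameters, and the sequential strategy (learn one path, ban its intermediate nodes, learn again) are all chosen here for illustration.

```python
import random

# Hypothetical toy network: undirected links with transmission distances.
LINKS = {("S", "A"): 1, ("A", "D"): 1, ("S", "B"): 2,
         ("B", "D"): 2, ("S", "C"): 5, ("C", "D"): 5}


def build_graph(links, banned=()):
    """Adjacency map {node: {neighbor: distance}}, skipping banned nodes."""
    graph = {}
    for (u, v), dist in links.items():
        if u in banned or v in banned:
            continue
        graph.setdefault(u, {})[v] = dist
        graph.setdefault(v, {})[u] = dist
    return graph


def q_learning_path(graph, src, dst, episodes=2000, alpha=0.5,
                    gamma=1.0, eps=0.2, seed=0):
    """Learn Q(node, next_hop) with reward = -distance, then follow it greedily."""
    rng = random.Random(seed)
    q = {u: {v: 0.0 for v in nbrs} for u, nbrs in graph.items()}
    for _ in range(episodes):
        node = src
        for _ in range(50):  # cap on episode length
            nbrs = list(q[node])
            if rng.random() < eps:
                nxt = rng.choice(nbrs)  # explore
            else:
                nxt = max(nbrs, key=lambda v: q[node][v])  # exploit
            future = 0.0 if nxt == dst else max(q[nxt].values())
            target = -graph[node][nxt] + gamma * future
            q[node][nxt] += alpha * (target - q[node][nxt])
            node = nxt
            if node == dst:
                break
    # Greedy rollout of the learned values, with a guard against cycles.
    path, node = [src], src
    while node != dst and len(path) <= len(graph):
        node = max(q[node], key=q[node].get)
        path.append(node)
    return path if node == dst else None


def two_node_disjoint_paths(links, src, dst):
    """Learn one path, ban its intermediate nodes, then learn a second path."""
    first = q_learning_path(build_graph(links), src, dst)
    used = set(first[1:-1])
    second = q_learning_path(build_graph(links, banned=used), src, dst)
    return first, second


if __name__ == "__main__":
    print(two_node_disjoint_paths(LINKS, "S", "D"))
```

Banning the first path's nodes before learning the second is the simplest way to enforce disjointness, but it does not in general minimize the combined distance of the pair; a method that optimizes both paths jointly may choose a different first path.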

Effective Utilization of Domain Knowledge for Relational Reinforcement Learning (관계형 강화 학습을 위한 도메인 지식의 효과적인 활용)

Kang, MinKyo;Kim, InCheol
- KIPS Transactions on Software and Data Engineering, v.11 no.3, pp.141-148, 2022
Recently, reinforcement learning combined with deep neural network technology has achieved remarkable success in fields such as board games (Go and chess), computer games (Atari and StarCraft), and robot object manipulation tasks. However, such deep reinforcement learning describes states, actions, and policies in vector representations. As a result, existing deep reinforcement learning is limited in the generality and interpretability of the learned policy, and it is difficult to incorporate domain knowledge effectively into policy learning. By contrast, dNL-RRL, a new relational reinforcement learning framework proposed to solve these problems, uses a kind of vector representation for sensor input data and lower-level motion control, as in existing deep reinforcement learning, but uses a relational representation with logic predicates and rules for states, actions, and learned policies. In this paper, we present dNL-RRL-based policy learning for transportation mobile robots in a manufacturing environment. In particular, this study proposes an effective method of using the prior domain knowledge of human experts to improve the efficiency of relational reinforcement learning. Through various experiments, we demonstrate the performance improvement that relational reinforcement learning gains from using domain knowledge as proposed in this paper.
https://doi.org/10.3745/KTSDE.2022.11.3.141

Self-Organizing Feature Map with Constant Learning Rate and Binary Reinforcement (일정 학습계수와 이진 강화함수를 가진 자기 조직화 형상지도 신경회로망)

조성원;석진욱
- Journal of the Korean Institute of Telematics and Electronics B, v.32B no.1, pp.180-188, 1995
A modified version of Kohonen's self-organizing feature map (SOFM) algorithm with a binary reinforcement function and a constant learning rate is proposed. In contrast to the time-varying adaptation gain of Kohonen's original SOFM algorithm, the proposed algorithm uses a constant adaptation gain and adds a binary reinforcement function to compensate for the reduced learning ability of the SOFM caused by the constant learning rate. Because the proposed algorithm avoids complicated multiplication, its digital hardware implementation is much easier than that of the original SOFM.

Hand Reaching Movement Acquired through Reinforcement Learning

Shibata, Katsunari;Sugisaka, Masanori;Ito, Koji
- Institute of Control, Robotics and Systems: Conference Proceedings, 2000.10a, p.474, 2000
This paper shows that a system with a two-link arm can acquire hand reaching movement toward a target object projected on a visual sensor through reinforcement learning with a layered neural network. The reinforcement signal, which is the only signal from the environment, is given to the system only when the hand reaches the target object. The neural network computes two joint torques from visual sensory signals, joint angles, and joint angular velocities, taking the arm dynamics into account. It is known that the trajectory of voluntary human hand reaching movement is almost straight and that the hand velocity follows a bell-shaped profile. Although there are some exceptions, the properties of the trajectories obtained by reinforcement learning are somewhat similar to the experimental results for human hand reaching movement.

Performance Improvement of Genetic Algorithms by Reinforcement Learning (강화학습을 통한 유전자 알고리즘의 성능개선)

이상환;전효병;심귀보
- Proceedings of the Korean Institute of Intelligent Systems Conference, 1998.03a, pp.81-84, 1998
Genetic Algorithms (GAs) are stochastic algorithms whose search methods model natural phenomena. The procedure of a GA can be divided into two sub-procedures: operation and selection. Chromosomes produce new offspring by means of operation, and fitter chromosomes produce more offspring than less fit ones by means of selection. However, operation is executed randomly and is limited in its execution, so it cannot guarantee that fitter chromosomes are produced. We therefore propose a method that gives directional information to the genetic operator through reinforcement learning. This is achieved by using neural networks to apply reinforcement learning to the genetic operator. We use the amount of fitness change, which can be regarded as a reinforcement signal, to calculate the error terms for the output units. The weights are then updated using the backpropagation algorithm. The performance improvement of GAs using reinforcement learning can be measured by applying the proposed method to GA-hard problems.

Comparison of learning performance of character controller based on deep reinforcement learning according to state representation (상태 표현 방식에 따른 심층 강화 학습 기반 캐릭터 제어기의 학습 성능 비교)

Sohn, Chaejun;Kwon, Taesoo;Lee, Yoonsang
- Journal of the Korea Computer Graphics Society, v.27 no.5, pp.55-61, 2021
Research on physics-simulation-based character motion control using reinforcement learning continues to be carried out. To solve a problem with reinforcement learning, the network structure, hyperparameters, state, action, and reward must be set appropriately for the problem. Many studies have defined various combinations of states, actions, and rewards and applied them successfully. Because state, action, and reward can be defined in many combinations, many studies analyze the effect of each element to find the combination that best improves learning performance. In this work, we analyze the effect of the state representation on reinforcement learning performance, which has not been analyzed so far. First, we define three coordinate systems (root attached frame, root aligned frame, and projected aligned frame) and analyze the effect on reinforcement learning of state representations expressed in each of them. Second, we analyze how learning performance is affected when various combinations of joint positions and angles are used for the state.
https://doi.org/10.15701/kcgs.2021.27.5.55

UAV Path Planning based on Deep Reinforcement Learning using Cell Decomposition Algorithm (셀 분해 알고리즘을 활용한 심층 강화학습 기반 무인 항공기 경로 계획)

Kyoung-Hun Kim;Byungsun Hwang;Joonho Seon;Soo-Hyun Kim;Jin-Young Kim
- The Journal of the Institute of Internet, Broadcasting and Communication, v.24 no.3, pp.15-20, 2024
Path planning for unmanned aerial vehicles (UAVs) is crucial for avoiding collisions in complex environments that include both static and dynamic obstacles. Path planning algorithms such as RRT and A* handle static obstacle avoidance effectively, but their computational complexity grows in high-dimensional environments. Reinforcement learning-based algorithms can accommodate complex environments, but, like traditional path planning algorithms, they struggle with training complexity and convergence in higher-dimensional environments. In this paper, we propose a reinforcement learning model that uses a cell decomposition algorithm. The proposed model reduces the complexity of the environment by decomposing the learning environment in detail, and improves obstacle avoidance performance by establishing the valid actions of the agent. This addresses the exploration problem of reinforcement learning and improves the convergence of learning. Simulation results show that the proposed model improves learning speed and path planning efficiency compared with reinforcement learning models in general environments.
https://doi.org/10.7236/JIIBC.2024.24.3.15
