• Title/Summary/Keyword: Q learning

Search results: 424

Multiple Reward Reinforcement learning control of a mobile robot in home network environment

  • Kang, Dong-Oh;Lee, Jeun-Woo
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 2003.10a
    • /
    • pp.1300-1304
    • /
    • 2003
  • This paper deals with the control of a mobile robot in a home network environment. The home network lets the mobile robot communicate with sensors to obtain measurements and to adapt to changes in the environment. To keep control performance high despite changes in the home network, we use a fuzzy inference system with multiple-reward reinforcement learning, which lets the mobile robot weigh multiple control objectives while adapting to network changes. A multiple-reward fuzzy Q-learning method is proposed: multiple Q-values are maintained, and max-min optimization is applied to obtain improved fuzzy rules. To show the effectiveness of the proposed method, simulation results are given for home network environments such as LAN and wireless LAN. (A rough sketch of the max-min selection appears below.)

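A rough reconstruction of the max-min idea described above, not the authors' code: the fuzzy inference layer is omitted, the state/action sizes and the two objectives are assumptions, and each objective simply keeps its own tabular Q-function. Actions are chosen to maximize the worst-case (minimum) Q-value across the objectives.

```python
import numpy as np

# Hypothetical sizes; the paper's fuzzy-rule state encoding is omitted.
N_STATES, N_ACTIONS, N_OBJECTIVES = 20, 4, 2   # e.g. reach goal vs. avoid obstacles
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

# One Q-table per control objective (the "multiple Q-values").
Q = np.zeros((N_OBJECTIVES, N_STATES, N_ACTIONS))

def select_action(state, rng):
    """Max-min selection: maximize the minimum Q-value across objectives."""
    if rng.random() < EPSILON:                  # usage: rng = np.random.default_rng(0)
        return int(rng.integers(N_ACTIONS))
    worst_case = Q[:, state, :].min(axis=0)     # min over objectives
    return int(worst_case.argmax())             # max over actions

def update(state, action, rewards, next_state):
    """A standard Q-learning update, applied independently per objective."""
    for k in range(N_OBJECTIVES):
        target = rewards[k] + GAMMA * Q[k, next_state].max()
        Q[k, state, action] += ALPHA * (target - Q[k, state, action])
```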

Topic directed Web Spidering using Reinforcement Learning (강화학습을 이용한 주제별 웹 탐색)

  • Lim, Soo-Yeon
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.15 no.4
    • /
    • pp.395-399
    • /
    • 2005
  • In this paper, we present the HIGH-Q learning algorithm, which uses reinforcement learning for faster and more exact topic-directed web spidering. The purpose of reinforcement learning is to maximize the rewards received from the environment; a reinforcement learning agent learns by interacting with its external environment through trial and error. We performed experiments comparing the proposed reinforcement learning method with breadth-first search for finding web pages. As a result, the reinforcement learning method, using future discounted rewards, needed to visit only a small number of pages to find the result pages. (A generic sketch of such a reward-driven spider appears below.)
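
Focused spiders of this kind are often realized as a best-first crawl over a priority queue of unvisited links, scored by an estimate of discounted future reward (on-topic pages reachable through the link). The sketch below is a generic reconstruction under that reading, not the HIGH-Q algorithm itself; `score_link`, `fetch`, `extract_links`, and `is_on_topic` are caller-supplied assumptions.

```python
import heapq

GAMMA = 0.9   # discount applied to reward propagated from the parent page

def crawl(seed_urls, score_link, fetch, extract_links, is_on_topic, budget=100):
    """Best-first spidering: always expand the link with the highest
    estimated (discounted) future reward."""
    frontier = [(-score_link(u), u) for u in seed_urls]
    heapq.heapify(frontier)
    visited, found = set(), []
    while frontier and len(visited) < budget:
        neg_q, url = heapq.heappop(frontier)
        if url in visited:
            continue
        visited.add(url)
        page = fetch(url)
        if is_on_topic(page):
            found.append(url)
        for link in extract_links(page):
            if link not in visited:
                # A link inherits discounted credit from its parent's score,
                # so paths through rewarding pages are explored first.
                q = score_link(link) + GAMMA * max(-neg_q, 0.0)
                heapq.heappush(frontier, (-q, link))
    return found
```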

Development of Semi-Active Control Algorithm Using Deep Q-Network (Deep Q-Network를 이용한 준능동 제어알고리즘 개발)

  • Kim, Hyun-Su;Kang, Joo-Won
    • Journal of Korean Association for Spatial Structures
    • /
    • v.21 no.1
    • /
    • pp.79-86
    • /
    • 2021
  • The control performance of a smart tuned mass damper (TMD) depends mainly on its control algorithm, and many control strategies have been proposed for semi-active control devices. Recently, machine learning has begun to be applied to the development of vibration control algorithms. In this study, reinforcement learning was employed to develop a semi-active control algorithm for a smart TMD composed of a magnetorheological (MR) damper. An 11-story building structure with a smart TMD was selected to construct the reinforcement learning environment, and time history analyses of the example structure subjected to earthquake excitation were conducted during the learning procedure. Among the various reinforcement learning algorithms, Deep Q-network (DQN) was used as the learning agent: the command voltage sent to the MR damper is determined by the action produced by the DQN. Parametric studies on the DQN hyper-parameters were performed by numerical simulation. After appropriate training of the DQN model with proper hyper-parameters, the resulting model effectively controls the smart TMD to reduce the seismic responses of the example structure. (A minimal DQN sketch appears below.)
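
A minimal DQN agent of the kind described might look like the following PyTorch sketch. The state dimension (structural responses), the five discrete command-voltage levels, and the network sizes are assumptions; the paper's actual hyper-parameters came from its parametric study.

```python
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM, N_VOLTAGES = 4, 5   # assumed: responses in, MR voltage levels out
GAMMA = 0.99

class QNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, N_VOLTAGES),   # one Q-value per command voltage
        )

    def forward(self, x):
        return self.net(x)

q_net, target_net = QNet(), QNet()
target_net.load_state_dict(q_net.state_dict())   # re-sync periodically
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)   # transitions (s, a, r, s2) as plain lists/ints

def act(state, epsilon=0.1):
    """Epsilon-greedy choice of the MR damper command-voltage index."""
    if random.random() < epsilon:
        return random.randrange(N_VOLTAGES)
    with torch.no_grad():
        return int(q_net(torch.as_tensor(state, dtype=torch.float32)).argmax())

def train_step(batch_size=64):
    """One DQN update: TD target computed from the frozen target network."""
    if len(replay) < batch_size:
        return
    s, a, r, s2 = map(torch.as_tensor, zip(*random.sample(replay, batch_size)))
    s, s2, r = s.float(), s2.float(), r.float()
    q = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = r + GAMMA * target_net(s2).max(1).values
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```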

Optimizing Energy Efficiency in Mobile Ad Hoc Networks: An Intelligent Multi-Objective Routing Approach

  • Sun, Beibei
    • IEMEK Journal of Embedded Systems and Applications
    • /
    • v.19 no.2
    • /
    • pp.107-114
    • /
    • 2024
  • Mobile ad hoc networks are self-configuring networks of mobile devices that communicate without relying on fixed infrastructure. Traditional routing protocols in such networks struggle to select efficient and reliable routes because of the dynamic topology caused by unpredictable node mobility, and they often fail to meet the low-delay and low-energy-consumption requirements crucial for such networks. To overcome these challenges, this paper introduces a novel multi-objective, adaptive routing scheme based on the Q-learning reinforcement learning algorithm. The proposed scheme adjusts itself dynamically to measured network states such as traffic congestion and mobility, and it uses Q-learning to select routes in a decentralized manner, considering factors such as energy consumption, load balancing, and the selection of stable links. We formulate the multi-objective optimization problem and discuss adaptive adjustment of the Q-learning parameters to handle the dynamic nature of the network. To speed up the learning process, the scheme incorporates informative shaped rewards, giving the learning agents additional guidance toward better solutions. Implemented on top of the widely used AODV routing protocol, the proposed approach demonstrates better energy efficiency and improved message delivery delay than traditional AODV, even in highly dynamic network environments. These findings show the potential of reinforcement learning for efficient routing in ad hoc networks, paving the way for future advances in mobile ad hoc networking. (A sketch of per-node Q-learning with shaped rewards appears below.)
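
A decentralized sketch of the idea, under stated assumptions: each node keeps a Q-table over next hops toward a destination, the shaped reward combines energy, load, and link stability with hypothetical weights, and the `Neighbor` fields are illustrative stand-ins for quantities the paper measures. The embedding into AODV route discovery is not reproduced here.

```python
import random
from dataclasses import dataclass

ALPHA, GAMMA = 0.3, 0.8
# Assumed weights for the multi-objective shaped reward.
W_ENERGY, W_LOAD, W_STABILITY = 0.4, 0.3, 0.3

@dataclass(frozen=True)
class Neighbor:
    node_id: int
    residual_energy: float    # normalized to [0, 1]
    queue_occupancy: float    # buffer load in [0, 1]
    link_stability: float     # e.g. estimated link lifetime, normalized

def shaped_reward(hop: Neighbor) -> float:
    """Informative shaping: reward good local properties of the chosen hop."""
    return (W_ENERGY * hop.residual_energy
            + W_LOAD * (1.0 - hop.queue_occupancy)
            + W_STABILITY * hop.link_stability)

class QRouter:
    """Per-node table of route-quality estimates, keyed by (destination, next hop)."""
    def __init__(self):
        self.q = {}

    def choose_next_hop(self, destination, neighbors, epsilon=0.1):
        if random.random() < epsilon:
            return random.choice(neighbors)
        return max(neighbors,
                   key=lambda n: self.q.get((destination, n.node_id), 0.0))

    def update(self, destination, hop, neighbor_best_q):
        """Distributed Q-learning: bootstrap from the chosen neighbor's own
        best estimate toward the destination (reported back, as in Q-routing)."""
        key = (destination, hop.node_id)
        old = self.q.get(key, 0.0)
        target = shaped_reward(hop) + GAMMA * neighbor_best_q
        self.q[key] = old + ALPHA * (target - old)
```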

A Learning based Algorithm for Traveling Salesman Problem (강화학습기법을 이용한 TSP의 해법)

  • Lim, JoonMook;Bae, SungMin;Suh, JaeJoon
    • Journal of Korean Institute of Industrial Engineers
    • /
    • v.32 no.1
    • /
    • pp.61-73
    • /
    • 2006
  • This paper deals with the traveling salesman problem (TSP) with stochastic travel times. In practice, the travel time between demand points changes with the day and time of day because of traffic interference and congestion. Since almost all previous studies focus on the TSP with deterministic travel times, their results are difficult to apply directly to logistics problems, even though many logistics problems involve stochastic conditions such as stochastic travel times; an efficient solution method for the TSP with stochastic travel time is therefore needed. Previous research shows that Q-learning can cope with stochastic environments and that a neural network can be used to compute the Q-values of the Q-learning algorithm. In this paper, we suggest an algorithm for the TSP with stochastic travel times that integrates Q-learning and a neural network, and we evaluate its validity through computational experiments. The simulation results indicate that routes obtained by the suggested algorithm give comparatively more reliable travel times in logistics situations with stochastic travel times. (A tabular sketch of the Q-learning core appears below.)
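
The Q-learning core can be sketched in tabular form as below: the state is (current city, set of visited cities), the actions are the unvisited cities, and travel times are sampled afresh on every step, so the learned Q-values approach expected costs. The instance size, the toy `travel_time` model, and all constants are assumptions; the paper replaces the table with a neural network so the Q-values generalize across states.

```python
import random
from collections import defaultdict

ALPHA, GAMMA, EPSILON = 0.1, 1.0, 0.2
N, EPISODES = 6, 5000   # assumed small instance, depot = city 0

def travel_time(i, j, rng):
    # Toy stochastic travel time: base distance plus random congestion.
    return (abs(i - j) + 1) * rng.uniform(0.8, 1.6)

Q = defaultdict(float)   # (city, visited_frozenset, next_city) -> expected cost
rng = random.Random(0)

for _ in range(EPISODES):
    city, visited = 0, frozenset({0})
    while len(visited) < N:
        choices = [c for c in range(N) if c not in visited]
        if rng.random() < EPSILON:
            nxt = rng.choice(choices)
        else:   # minimizing cost, so pick the action with the lowest Q
            nxt = min(choices, key=lambda c: Q[(city, visited, c)])
        cost = travel_time(city, nxt, rng)        # sampled, not deterministic
        nv = visited | {nxt}
        rest = [c for c in range(N) if c not in nv]
        future = (min(Q[(nxt, nv, c)] for c in rest) if rest
                  else travel_time(nxt, 0, rng))  # close the tour at the depot
        Q[(city, visited, nxt)] += ALPHA * (cost + GAMMA * future
                                            - Q[(city, visited, nxt)])
        city, visited = nxt, nv
```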

Variable Selection of Feature Pattern using SVM-based Criterion with Q-Learning in Reinforcement Learning (SVM-기반 제약 조건과 강화학습의 Q-learning을 이용한 변별력이 확실한 특징 패턴 선택)

  • Kim, Chayoung
    • Journal of Internet Computing and Services
    • /
    • v.20 no.4
    • /
    • pp.21-27
    • /
    • 2019
  • Feature patterns gathered from RNA sequencing (RNA-seq) data are not all equally informative for identifying differential expression: some may be noisy, correlated, or irrelevant because of redundancy in big data sets. Variable selection of feature patterns aims at a differentially expressed gene set that is significantly relevant to a specific task; this issue is complex and important in many domains. In machine learning, feature selection has been studied with methods such as random forests, k-nearest neighbors, and the support vector machine (SVM). The SVM is one of the most well-known machine learning algorithms, and one SVM-based criterion is Support Vector Machine-Recursive Feature Elimination (SVM-RFE), which is utilized in this work. We propose a novel algorithm that combines SVM-RFE with the Q-learning of reinforcement learning for better variable selection of feature patterns. Comparing the proposed algorithm with the well-known SVM-RFE combined with Welch's t-test on published data shows that the SVM-RFE weight-vector criterion enhanced by Q-learning benefits from the off-policy, more exploratory scheme of Q-learning. (A sketch of a Q-learning-guided RFE loop appears below.)
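
One plausible reading of the combination, sketched with scikit-learn: a standard SVM-RFE loop in which an epsilon-greedy (hence off-policy, exploratory) Q-learner chooses the elimination step size, with cross-validated accuracy as the reward. The state/action design, the `ACTIONS` set, and all constants are assumptions, not the paper's specification.

```python
import random

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

ALPHA, GAMMA, EPSILON = 0.2, 0.9, 0.3
ACTIONS = (1, 2)   # assumed: how many lowest-weight features to drop per step

def rfe_q(X, y, target_k, episodes=20, seed=0):
    """SVM-RFE with a Q-learning-chosen elimination schedule (a sketch)."""
    rng = random.Random(seed)
    Q, best = {}, None
    for _ in range(episodes):
        feats = list(range(X.shape[1]))
        while len(feats) > target_k:
            state = len(feats)
            a = (rng.choice(ACTIONS) if rng.random() < EPSILON
                 else max(ACTIONS, key=lambda k: Q.get((state, k), 0.0)))
            a = min(a, len(feats) - target_k)
            svm = LinearSVC(dual=False).fit(X[:, feats], y)
            rank = np.argsort(np.abs(svm.coef_).sum(axis=0))  # RFE criterion
            for idx in sorted(rank[:a], reverse=True):        # drop weakest
                del feats[idx]
            reward = cross_val_score(LinearSVC(dual=False),
                                     X[:, feats], y, cv=3).mean()
            best_next = max(Q.get((len(feats), k), 0.0) for k in ACTIONS)
            old = Q.get((state, a), 0.0)
            Q[(state, a)] = old + ALPHA * (reward + GAMMA * best_next - old)
            if best is None or reward > best[0]:
                best = (reward, list(feats))
    return best   # (best CV accuracy seen, corresponding feature subset)
```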

Interaction and Flow as the Antecedents of e-Learner Satisfaction (이러닝 만족도 영향요인으로서의 상호작용과 몰입)

  • Moon, Chul-Woo;Kim, Jae-Hyoun
    • The Journal of Korean Association of Computer Education
    • /
    • v.14 no.3
    • /
    • pp.63-72
    • /
    • 2011
  • The satisfactory e-learning experience of working part-time adult students is a truly dynamic and multidimensional process that reflects learning needs and abilities. Special attention is given to understanding the roles of student-to-faculty interaction, student-to-student interaction, e-learning content and course structure, flow, periodic off-line class meetings, and synchronous Q&A sessions. Survey questions were developed and distributed to adult graduate students; some were asked to answer with the most interesting subjects or classes in mind, and others with the most difficult subjects in mind. A structural model was tested for each group. The path coefficients for the group considering difficult subjects were higher for the following paths: a) interaction between professors and students and satisfaction, b) content quality and flow, c) Q&A and interaction between professors and students, and d) Q&A and interaction among students. For the other paths, such as interaction among students and satisfaction, and content structure and flow, the coefficients for the group considering interesting subjects were higher. Implications for e-learning design are provided as well.


Dynamic Positioning of Robot Soccer Simulation Game Agents using Reinforcement learning

  • Kwon, Ki-Duk;Cho, Soo-Sin;Kim, In-Cheol
    • Proceedings of the Korea Intelligent Information Systems Society Conference
    • /
    • 2001.01a
    • /
    • pp.59-64
    • /
    • 2001
  • The robot soccer simulation game is a dynamic multi-agent environment. In this paper we suggest a new reinforcement learning approach to each agent's dynamic positioning in such an environment. Reinforcement learning is the branch of machine learning in which an agent learns, from indirect and delayed reward, an optimal policy for choosing sequences of actions that produce the greatest cumulative reward; it differs from supervised learning in that no input/output pairs are presented as training examples. Furthermore, model-free reinforcement learning algorithms such as Q-learning require neither defining nor learning a model of the surrounding environment, yet they can learn the optimal policy provided the agent can visit every state-action pair infinitely often. However, the biggest problem with monolithic reinforcement learning is that its straightforward applications do not scale up to more complex environments because of the intractably large state space. To address this problem, we suggest Adaptive Mediation-based Modular Q-Learning (AMMQL) as an improvement on the existing Modular Q-Learning (MQL). While simple modular Q-learning combines the results from each learning module in a fixed way, AMMQL combines them more flexibly, assigning each module a weight according to its contribution to the reward. Therefore, in addition to handling large state spaces effectively, AMMQL shows higher adaptability to environmental changes than pure MQL. This paper introduces the concept of AMMQL and presents the details of its application to the dynamic positioning of robot soccer agents. (A rough sketch of the adaptive mediation appears below.)

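A rough sketch of the adaptive mediation idea under stated assumptions: each module keeps its own Q-table over its own state abstraction, the mediator mixes module Q-values with weights, and the weight-adaptation rule shown (nudging weights by how each module's value estimate tracks received reward) is an illustrative stand-in for the paper's contribution-based rule.

```python
import random

ALPHA, GAMMA, ETA = 0.1, 0.9, 0.05   # ETA: assumed weight-adaptation rate

class Module:
    """One learning module: its own state abstraction and Q-table."""
    def __init__(self, abstract):
        self.abstract = abstract   # maps the global state to this module's view
        self.q = {}

    def value(self, state, action):
        return self.q.get((self.abstract(state), action), 0.0)

    def update(self, s, a, reward, s2, actions):
        key = (self.abstract(s), a)
        old = self.q.get(key, 0.0)
        best = max(self.value(s2, b) for b in actions)
        self.q[key] = old + ALPHA * (reward + GAMMA * best - old)

class AMMQLAgent:
    """Adaptive mediation: module Q-values are mixed with learned weights
    rather than the fixed combination of plain Modular Q-Learning."""
    def __init__(self, modules, actions):
        self.modules, self.actions = modules, actions
        self.w = [1.0 / len(modules)] * len(modules)

    def combined_q(self, state, action):
        return sum(w * m.value(state, action)
                   for w, m in zip(self.w, self.modules))

    def act(self, state, epsilon=0.1):
        if random.random() < epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.combined_q(state, a))

    def learn(self, s, a, reward, s2):
        for i, m in enumerate(self.modules):
            m.update(s, a, reward, s2, self.actions)
            # Assumed rule: grow the weight of modules whose value estimates
            # move with the rewards actually received.
            self.w[i] += ETA * reward * m.value(s, a)
        self.w = [max(w, 1e-6) for w in self.w]   # keep weights positive
        total = sum(self.w)
        self.w = [w / total for w in self.w]      # renormalize
```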

Dynamic CBDT : Extension of CBDT via Reinforcement Method of Q-learning (Dynamic CBDT : Q-learning의 강화기법을 응용한 CBDT 확장 기법)

  • Jin, Y.K.;Chang, H.S.
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2006.10b
    • /
    • pp.194-199
    • /
    • 2006
  • This paper proposes Dynamic CBDT, a new decision-making algorithm that extends Case-Based Decision Theory (CBDT), an algorithm for decision making under uncertainty, to sequences of dynamically linked decision problems by applying the reinforcement technique of Q-learning, a representative reinforcement learning method. The efficiency of Dynamic CBDT relative to the CBDT algorithm is confirmed through experiments on Tetris. (An illustrative sketch appears below.)

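Since only a short abstract is available here, the sketch below is an illustrative guess at the construction, not the authors' method: classical CBDT scores an act by similarity-weighted utilities of remembered cases, and the "dynamic" extension records each outcome with a Q-learning-style target (immediate reward plus discounted value of the best act in the successor problem). The similarity function is caller-supplied, and CBDT's aspiration level is omitted.

```python
GAMMA = 0.9   # assumed discount for the bootstrapped case utility

class DynamicCBDT:
    """Case-based decision theory with a Q-learning-flavoured case utility."""
    def __init__(self, similarity, actions):
        self.similarity = similarity   # sim(problem, past_problem) -> [0, 1]
        self.actions = actions
        self.memory = []               # (problem, action, stored_utility)

    def cbdt_value(self, problem, action):
        """Classical CBDT score: similarity-weighted sum of case utilities."""
        return sum(self.similarity(problem, p) * u
                   for p, a, u in self.memory if a == action)

    def act(self, problem):
        return max(self.actions, key=lambda a: self.cbdt_value(problem, a))

    def record(self, problem, action, reward, next_problem):
        """Store the case with a bootstrapped, Q-learning-style utility."""
        future = max(self.cbdt_value(next_problem, a) for a in self.actions)
        self.memory.append((problem, action, reward + GAMMA * future))
```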

A reinforcement learning-based method for the cooperative control of mobile robots (강화 학습에 의한 소형 자율 이동 로봇의 협동 알고리즘 구현)

  • Kim, Jae-Hee;Cho, Jae-Seung;Kwon, In-So
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 1997.10a
    • /
    • pp.648-651
    • /
    • 1997
  • This paper proposes methods for the cooperative control of multiple mobile robots and constructs a robot soccer system in which cooperation is implemented as a passing play between two robots. To play a soccer game, elementary actions such as shooting and moving were designed, and Q-learning, one of the popular methods for reinforcement learning, is used to determine which action to take. In simulation, learning succeeds when the ball and robots are deliberately arranged initially, so cooperative play can be accomplished. (A minimal tabular sketch appears below.)

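The action-selection layer described can be sketched as plain tabular Q-learning over the hand-designed elementary actions; the action names, the discretized state, and the constants are assumptions.

```python
import random

ACTIONS = ("move_to_ball", "shoot", "pass")   # assumed elementary actions
ALPHA, GAMMA, EPSILON = 0.2, 0.9, 0.1
Q = {}   # (discretized_state, action) -> value

def act(state):
    """Epsilon-greedy choice among the elementary actions."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q.get((state, a), 0.0))

def learn(state, action, reward, next_state):
    """One-step Q-learning update from the observed transition."""
    old = Q.get((state, action), 0.0)
    best_next = max(Q.get((next_state, a), 0.0) for a in ACTIONS)
    Q[(state, action)] = old + ALPHA * (reward + GAMMA * best_next - old)
```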