• Title/Summary/Keyword: Q learning

Search Results: 426

Traffic Control using Q-Learning Algorithm (Q 학습을 이용한 교통 제어 시스템)

  • Zheng, Zhang;Seung, Ji-Hoon;Kim, Tae-Yeong;Chong, Kil-To
    • Journal of the Korea Academia-Industrial cooperation Society / v.12 no.11 / pp.5135-5142 / 2011
  • A flexible mechanism is proposed in this paper to improve the dynamic response performance of a traffic flow control system in an urban area. The roads, vehicles, and traffic control systems are all modeled as intelligent systems, with a wireless communication network as the medium of communication between the vehicles and the roads. The necessary sensor networks are installed in and along the roads, and reinforcement learning is adopted as the core algorithm of the mechanism. A traffic policy can be planned online according to the updated road situation, based on all the information from the vehicles and the roads. This improves the flexibility of traffic flow and offers much more efficient use of the roads than a traditional traffic control system; the optimal intersection signals can be learned automatically online. An intersection control system is studied as an example of the mechanism using a Q-learning-based algorithm, and simulation results show that the proposed mechanism can improve traffic efficiency and reduce the waiting time at signal lights by more than 30% under various conditions compared with a traditional signaling system.
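
The Q-learning core referred to in this abstract can be sketched with a toy single-intersection model. Everything here (the two queue states, the two signal actions, the reward shaping, and all constants) is a hypothetical illustration of a tabular Q-learning update, not the paper's actual system:

```python
import random

# Toy intersection: state = which approach has the longer queue (0 or 1),
# action = which approach gets the green phase (0 or 1). Illustrative only.
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1
Q = {(s, a): 0.0 for s in (0, 1) for a in (0, 1)}

def step(state, action):
    # Reward +1 when the green phase serves the longer queue, else -1.
    reward = 1.0 if action == state else -1.0
    next_state = random.randint(0, 1)   # queues evolve randomly in this toy
    return reward, next_state

def train(episodes=2000, seed=0):
    random.seed(seed)
    s = 0
    for _ in range(episodes):
        # Epsilon-greedy action selection.
        a = random.randint(0, 1) if random.random() < EPS else max((0, 1), key=lambda x: Q[(s, x)])
        r, s2 = step(s, a)
        # Standard Q-learning update: Q(s,a) += alpha*(r + gamma*max_a' Q(s',a') - Q(s,a))
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, 0)], Q[(s2, 1)]) - Q[(s, a)])
        s = s2

train()
# Greedy policy learned from the Q-table: serve the longer queue.
policy = {s: max((0, 1), key=lambda a: Q[(s, a)]) for s in (0, 1)}
```

The same update rule drives the paper's online signal learning; only the state, action, and reward definitions differ.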

Traffic Offloading in Two-Tier Multi-Mode Small Cell Networks over Unlicensed Bands: A Hierarchical Learning Framework

  • Sun, Youming;Shao, Hongxiang;Liu, Xin;Zhang, Jian;Qiu, Junfei;Xu, Yuhua
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.11 / pp.4291-4310 / 2015
  • This paper investigates traffic offloading over unlicensed bands for two-tier multi-mode small cell networks. We formulate the problem as a Stackelberg game and apply a hierarchical learning framework to jointly maximize the utilities of both the macro base station (MBS) and the small base stations (SBSs). During the learning process, the MBS behaves as the leader and the SBSs as followers. The MBS adopts a pricing mechanism and first broadcasts the price information to all SBSs; each SBS then competes with the other SBSs and takes its best-response strategy to appropriately allocate its traffic load across the licensed and unlicensed bands, taking into account the traffic payment charged by the MBS. We then present a hierarchical Q-learning algorithm (HQL) to discover the Stackelberg equilibrium. Additionally, if extra information can be obtained via feedback, we propose an improved hierarchical Q-learning algorithm (IHQL) to speed up the SBSs' learning process. Finally, the convergence of the two proposed algorithms is analyzed. Numerical experiments validate the proposed schemes and show their effectiveness.
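
The leader–follower structure described above can be illustrated with a minimal stateless sketch in which the leader (MBS) learns a price by Q-value updates while the follower's best response is collapsed into a fixed function. The price set, the `follower_load` response curve, and the revenue model are all invented for illustration and are far simpler than the paper's HQL/IHQL algorithms:

```python
import random

# Leader action set: the traffic price broadcast by the MBS (illustrative).
PRICES = [1.0, 2.0, 3.0]
random.seed(1)

def follower_load(price):
    # Follower (SBS) best response: keep less traffic on the licensed band
    # as its price rises. Purely illustrative closed form.
    return max(0.0, 1.0 - 0.3 * price)

q_leader = {p: 0.0 for p in PRICES}
for _ in range(3000):
    # Epsilon-greedy price selection by the leader.
    p = random.choice(PRICES) if random.random() < 0.1 else max(PRICES, key=q_leader.get)
    load = follower_load(p)                        # follower reacts to the price
    reward = p * load                              # leader revenue = price * licensed load
    q_leader[p] += 0.1 * (reward - q_leader[p])    # stateless Q-value update

best_price = max(PRICES, key=q_leader.get)
```

The hierarchy comes from the ordering: the follower's reaction is computed inside the leader's learning loop, so the leader learns against best responses.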

The Effects of the Learning Cycle Model by Learner's Characteristics in Junior High School (중학교 과학수업에서 학습자 특성에 따른 순환학습 모형의 효과)

  • Jeong, Jin-Su;Chung, Wan-Ho
    • Journal of The Korean Association For Science Education / v.15 no.3 / pp.284-290 / 1995
  • This study examined the effects of the learning cycle model by learner characteristics such as I.Q., cognitive level, inquiry skills, cognitive style, activity, and reflectiveness. To see the effects of the learning cycle model, a nonequivalent control group pretest-posttest multiple treatment design was used. 99 middle school second-graders (female) were divided into two groups: one was selected as the experimental group (n=50), and the other served as the comparison group (n=49). During the eight-month period, the students in the experimental group were instructed according to the learning cycle model, while the students in the comparison group were instructed with traditional methods. Achievement data from a science achievement test were analyzed with an ANOVA technique. The results of the study are as follows: 1. Science knowledge achievement: for students at the lower level of activity, the learning cycle model is superior to the traditional approach. 2. Science inquiry skills: for students at the upper level of I.Q., cognitive level, inquiry skills, cognitive style, and reflectiveness, the learning cycle model is superior to the traditional approach. 3. Attitudes toward science: for students at the lower level of I.Q., cognitive level, inquiry skills, cognitive style, activity, and reflectiveness, the learning cycle model is superior to the traditional approach.


Applying Deep Reinforcement Learning to Improve Throughput and Reduce Collision Rate in IEEE 802.11 Networks

  • Ke, Chih-Heng;Astuti, Lia
    • KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.1 / pp.334-349 / 2022
  • The effectiveness of Wi-Fi networks is greatly influenced by the optimization of contention window (CW) parameters. Unfortunately, the conventional approach employed by IEEE 802.11 wireless networks does not scale well enough to sustain consistent performance as the number of stations grows, yet it remains the default channel-access method for single-user 802.11 transmissions. Recently, there has been a spike in attempts to enhance network performance using a machine learning (ML) technique known as reinforcement learning (RL), whose advantage is that it interacts with the surrounding environment and makes decisions based on its own experience. Deep RL (DRL) uses deep neural networks (DNNs) to deal with more complex environments (such as continuous state or action spaces) and to obtain optimal rewards. We therefore present a new CW control mechanism, termed the contention window threshold (CWThreshold), which uses the DRL principle to define the threshold value and learn optimal settings under various network scenarios. We demonstrate our proposed method, a smart exponential-threshold-linear backoff algorithm with a deep Q-learning network (SETL-DQN). The simulation results show that the proposed SETL-DQN algorithm can effectively improve throughput and reduce collision rates.
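
The "exponential-threshold-linear" backoff idea can be read as: double the CW up to a threshold, then grow it linearly. The sketch below fixes the threshold as a constant, whereas the paper learns it with a DQN; the CWmin/CWmax/threshold values are illustrative, not taken from the paper:

```python
# Exponential-threshold-linear CW growth rule (our reading of the SETL idea;
# in the paper the threshold is learned by a DQN, here it is fixed).
CW_MIN, CW_MAX, THRESHOLD = 16, 1024, 128

def next_cw(cw):
    # Double the contention window up to the threshold, then grow linearly.
    if cw < THRESHOLD:
        return min(cw * 2, THRESHOLD)
    return min(cw + CW_MIN, CW_MAX)

# Growth pattern from CW_MIN: 16 -> 32 -> 64 -> 128 -> 144 -> 160 -> ...
```

Exponential growth reacts quickly to bursts of collisions; switching to linear growth above the threshold avoids overshooting into very long waits.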

Extended Q-Learning under Multiple Tasks (복수의 부분 작업을 위한 확장된 Q-Learning)

  • 오도훈;윤소정;오경환
    • Proceedings of the Korean Information Science Society Conference / 2000.04b / pp.229-231 / 2000
  • Among the many learning methods, reinforcement learning, proposed relatively recently, has shown excellent learning ability in dynamic environments. Because of this strength, reinforcement learning is widely used in research on learning-based agents. However, results to date show that there is a limit to the difficulty of the tasks that agents built with reinforcement learning can solve. In particular, existing reinforcement learning methods show limitations when handling composite tasks made up of multiple subtasks. This paper analyzes why composite tasks composed of multiple subtasks are hard to handle and proposes a method for dealing with them. The proposed EQ-Learning extends Q-Learning, a representative reinforcement learning method, to resolve these problems: it learns a solution for each subtask and then finds an appropriate order in which to apply the learned results, thereby solving the composite task. To validate EQ-Learning, experiments were conducted on a maze problem composed of multiple subtasks in a grid space.


Application of Deep Recurrent Q Network with Dueling Architecture for Optimal Sepsis Treatment Policy

  • Do, Thanh-Cong;Yang, Hyung Jeong;Ho, Ngoc-Huynh
    • Smart Media Journal / v.10 no.2 / pp.48-54 / 2021
  • Sepsis is one of the leading causes of mortality globally, and it costs billions of dollars annually. However, treating septic patients is currently highly challenging, and more research is needed into a general treatment method for sepsis. Therefore, in this work, we propose a reinforcement learning method for learning optimal treatment strategies for septic patients. We model the patients' physiological time-series data as the input to a deep recurrent Q-network that learns reliable treatment policies. We evaluate our model using an off-policy evaluation method, and the experimental results indicate that it outperforms the physicians' policy, reducing patient mortality by up to 3.04%. Thus, our model can be used as a tool to reduce patient mortality by supporting clinicians in making dynamic decisions.
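
The dueling architecture named in this title splits the Q-function into a state value V(s) and per-action advantages A(s,a), recombined as Q(s,a) = V(s) + A(s,a) − mean over actions of A. A minimal sketch of that aggregation step (the numbers are made up to show the arithmetic, not model outputs):

```python
import numpy as np

# Dueling-head aggregation: Q(s,a) = V(s) + A(s,a) - mean_a' A(s,a').
# Subtracting the mean advantage makes the V/A decomposition identifiable.
def dueling_q(v, advantages):
    a = np.asarray(advantages, dtype=float)
    return v + a - a.mean()

q = dueling_q(2.0, [1.0, 0.0, -1.0])
```

In the full model, V and A are two output heads of the recurrent network; this function is only the final combination layer.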

Dynamic Computation Offloading Based on Q-Learning for UAV-Based Mobile Edge Computing

  • Shreya Khisa;Sangman Moh
    • Smart Media Journal / v.12 no.3 / pp.68-76 / 2023
  • Emerging mobile edge computing (MEC) can be used in battery-constrained Internet of Things (IoT) environments. The execution latency of IoT applications can be reduced by offloading computation-intensive tasks to an MEC server. Recently, the popularity of unmanned aerial vehicles (UAVs) has increased rapidly, and UAV-based MEC systems are receiving considerable attention. In this paper, we propose a dynamic computation offloading paradigm for UAV-based MEC systems, in which a UAV flies over an urban environment and provides edge services to IoT devices on the ground. Since most IoT devices are energy-constrained, we formulate our problem as a Markov decision process that takes the battery level of each IoT device into account. We also use model-free Q-learning for time-critical tasks to maximize the system utility. According to our performance study, the proposed scheme achieves desirable convergence properties and makes intelligent offloading decisions.
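
The battery-aware MDP formulation can be illustrated with a toy model: the state is a discrete battery level, the actions are local execution versus offloading, and draining the battery is penalized. All states, drains, rewards, and constants below are invented for illustration and are much simpler than the paper's model:

```python
import random

# Toy battery-aware offloading MDP. State: battery level 0..3.
# Action 0 = compute locally (drains 2 units), 1 = offload (drains 1 unit).
ALPHA, GAMMA, EPS = 0.2, 0.9, 0.1
Q = {(b, a): 0.0 for b in range(4) for a in (0, 1)}

def step(battery, action):
    drain = 2 if action == 0 else 1
    nb = max(0, battery - drain)
    reward = -1.0 if nb == 0 else 1.0   # penalize emptying the battery
    if nb == 0:
        nb = 3                          # recharge: the episode restarts
    return reward, nb

random.seed(2)
b = 3
for _ in range(5000):
    a = random.randint(0, 1) if random.random() < EPS else max((0, 1), key=lambda x: Q[(b, x)])
    r, nb = step(b, a)
    # Model-free Q-learning update over the battery-level state.
    Q[(b, a)] += ALPHA * (r + GAMMA * max(Q[(nb, 0)], Q[(nb, 1)]) - Q[(b, a)])
    b = nb
```

With this reward, the learned policy prefers offloading when the battery is low, which is the qualitative behavior the abstract describes.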

Q-Learning based Collision Avoidance for 802.11 Stations with Maximum Requirements

  • Chang Kyu Lee;Dong Hyun Lee;Junseok Kim;Xiaoying Lei;Seung Hyong Rhee
    • KSII Transactions on Internet and Information Systems (TIIS) / v.17 no.3 / pp.1035-1048 / 2023
  • The IEEE 802.11 WLAN adopts a random backoff algorithm as its collision avoidance mechanism, and it is well known that this contention-based algorithm may suffer performance degradation, especially in congested networks. In this paper, we design an efficient backoff algorithm that uses a reinforcement learning method to determine optimal backoff values. The mobile nodes share a common contention window (CW) in our scheme, and using a Q-learning algorithm, they can avoid collisions by finding and implicitly reserving their optimal time slot(s). In addition, we introduce a Frame Size Control (FSC) algorithm to minimize the possible degradation of aggregate throughput when the number of nodes exceeds the CW size. Our simulations show that the proposed backoff algorithm with the FSC method outperforms the 802.11 protocol regardless of traffic conditions, and an analytical model proves that our mechanism has a unique operating point that is fair and stable.
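
The idea of "implicitly reserving" a time slot via Q-learning can be sketched in a toy setting where one learning station shares the channel with a legacy station that always occupies slot 0. The slot set, rewards, and stateless update are illustrative and omit the paper's shared-CW and FSC mechanisms:

```python
import random

# One station learns which of three backoff slots to use; a legacy station
# always transmits in slot 0, so choosing it causes a collision. Toy setup.
SLOTS = (0, 1, 2)
q = {s: 0.0 for s in SLOTS}
random.seed(4)

for _ in range(1000):
    # Epsilon-greedy slot choice.
    s = random.choice(SLOTS) if random.random() < 0.1 else max(SLOTS, key=q.get)
    reward = -1.0 if s == 0 else 1.0    # slot 0 collides with the legacy node
    q[s] += 0.1 * (reward - q[s])       # stateless Q-value update

reserved = max(SLOTS, key=q.get)        # learned collision-free slot
```

Once each station's greedy choice settles on a distinct slot, the slots behave like reservations even though no explicit coordination takes place.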

A Case Study on the Effect of Online Cooperative Learning applied in Accounting Class (온라인 협력학습 회계수업 적용방안 및 효과에 관한 사례연구)

  • Song, Seungah
    • The Journal of the Korea Contents Association / v.22 no.4 / pp.535-546 / 2022
  • This study explored factors for improving academic achievement in non-face-to-face online education, based on survey results from a university's online cooperative learning Q&A. In the COVID-19 situation, with all classes moved online, both professors and learners can easily feel psychologically isolated. By sharing cases of classes that applied the online cooperative learning methodology, this study aims to suggest a direction for future education to teachers and learners. Previous studies on non-face-to-face online learning, online cooperative learning, and learning facilitation were reviewed, and an online Q&A method was adopted as the specific learning facilitation technique. The Q&A process gave learners opportunities to check their understanding, share knowledge, and communicate; the survey asked about evaluation-related factors such as guaranteeing the anonymity of questioners and answerers, a point-based incentive system, and absolute (criterion-referenced) grading. The survey analysis found these to be success factors of online cooperative learning. Although the intervention is a small change that can be applied in practice as online learning continues, sharing meaningful cases of teaching methodology can motivate and actively involve both professors and learners, and is expected to suggest methods and directions for improving skills together.

Flipped Learning teaching model design and application for the University's "Linear Algebra" ('선형대수학' 플립드러닝(Flipped Learning) 강의 모델 설계 및 적용)

  • Park, Kyung-Eun;Lee, Sang-Gu
    • Communications of Mathematical Education / v.30 no.1 / pp.1-22 / 2016
  • We conducted a full-scale literature survey and case survey of Flipped Learning class models for mathematics. The purpose of this study is to design and adopt a Flipped Learning 'Linear Algebra' class model that fits our needs. We applied the new model to 30 students at S University and then analyzed the activities and performance of the students in this course. Our Flipped Learning 'Linear Algebra' teaching model consists of three stages: in the first stage, students view an online lecture as homework and freely ask and answer questions on the Q&A board before class; in the second stage, in-class learning, the researcher resolves the students' Q&A and highlights the main ideas through the Point-Lecture; in the third stage, students pursue more advanced topics on the Q&A board and the researcher (or peers) finalizes the students' Q&A. According to the survey, the teaching model contributed not only to increasing students' participation and interest but also to improving their communication skills and self-directed learning skills, both in class and online. We used purposive sampling on the obtained data, and for the study's validity and reliability we used content validity and the alternate-form method. We found several meaningful results from this analysis.