• Title/Summary/Keyword: Q learning


A Naive Bayesian-based Model of the Opponent's Policy for Efficient Multiagent Reinforcement Learning (효율적인 멀티 에이전트 강화 학습을 위한 나이브 베이지만 기반 상대 정책 모델)

  • Kwon, Ki-Duk
    • Journal of Internet Computing and Services / v.9 no.6 / pp.165-177 / 2008
  • An important issue in multiagent reinforcement learning is how an agent should learn its optimal policy in a dynamic environment where other agents can influence its performance. Most previous work on multiagent reinforcement learning either applies single-agent reinforcement learning techniques without any extension or, even when it uses explicit models of other agents, requires unrealistic assumptions. This paper introduces a naive Bayesian policy model of the opponent agent and explains a multiagent reinforcement learning method that uses this model. Unlike previous work, the proposed method models the opponent's policy with a naive Bayesian model rather than modeling the opponent's Q function. Moreover, the method can improve learning efficiency because this model is simpler than richer but more time-consuming policy models such as finite state machines (FSMs) and Markov chains. The Cat and Mouse game is introduced as an adversarial multiagent environment, and the effectiveness of the proposed naive Bayesian policy model is analyzed through experiments that use this game as a test bed.
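To make the idea of a naive Bayesian opponent policy model concrete, the following is a minimal sketch, not the paper's implementation. It assumes the opponent's action is predicted from a small tuple of discrete state features that are treated as conditionally independent given the action. The class name, the feature encoding, and the Laplace smoothing constant `alpha` are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict


class NaiveBayesOpponentModel:
    """Estimates P(opponent action | state features) with naive Bayes.

    Hypothetical sketch: features are discrete values, and every feature
    is assumed to take one of `n_values` distinct values.
    """

    def __init__(self, actions, n_values, alpha=1.0):
        self.actions = list(actions)
        self.n_values = n_values      # distinct values per feature (assumed equal)
        self.alpha = alpha            # Laplace smoothing constant
        self.action_counts = defaultdict(int)
        self.feature_counts = defaultdict(int)  # key: (action, index, value)
        self.total = 0

    def observe(self, features, action):
        """Record one observed (state features, opponent action) pair."""
        self.action_counts[action] += 1
        self.total += 1
        for i, value in enumerate(features):
            self.feature_counts[(action, i, value)] += 1

    def predict(self, features):
        """Return a dict mapping each action to its estimated probability."""
        log_scores = {}
        for a in self.actions:
            prior = (self.action_counts[a] + self.alpha) / (
                self.total + self.alpha * len(self.actions))
            log_p = math.log(prior)
            for i, value in enumerate(features):
                num = self.feature_counts[(a, i, value)] + self.alpha
                den = self.action_counts[a] + self.alpha * self.n_values
                log_p += math.log(num / den)
            log_scores[a] = log_p
        # Normalize in a numerically stable way.
        top = max(log_scores.values())
        unnorm = {a: math.exp(s - top) for a, s in log_scores.items()}
        z = sum(unnorm.values())
        return {a: p / z for a, p in unnorm.items()}


# Usage: after repeatedly seeing the opponent move "up" in situation (0, 1),
# the model assigns most of the probability mass to "up" in that situation.
model = NaiveBayesOpponentModel(actions=["up", "down"], n_values=3)
for _ in range(5):
    model.observe((0, 1), "up")
dist = model.predict((0, 1))
```

A learning agent could use such a predicted distribution to weight its own action values by the opponent's likely moves; how the paper combines the two is not specified in the abstract.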


Optimal Scheduling of Satellite Tracking Antenna of GNSS System (다중위성 추적 안테나의 위성추적 최적 스케쥴링)

  • Ahn, Chae-Ik;Shin, Ho-Hyun;Kim, You-Dan;Jung, Seong-Kyun;Lee, Sang-Uk;Kim, Jae-Hoon
    • Journal of the Korean Society for Aeronautical & Space Sciences / v.36 no.7 / pp.666-673 / 2008
  • To construct an accurate radio satellite navigation system, efficient communication between each satellite and the ground station is very important. Through this communication, the orbit of each satellite can be corrected, and the operator uses the exchanged information to analyze satellite status. Because ground station resources are limited, the schedule of the antenna's azimuth and elevation angles should be optimized. In addition, a satellite in medium Earth orbit does not pass over the same point on the Earth's surface, owing to the Earth's rotation. Therefore, the antenna pass schedule must be updated at the proper moment. In this study, Q-learning, a form of model-free reinforcement learning, and a genetic algorithm are considered for finding the optimal antenna schedule. Numerical simulations are conducted to verify the optimality of the solution.
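As a reminder of what model-free Q-learning does, the sketch below shows the standard tabular update rule. The scheduling framing is purely hypothetical: states are taken to be time-slot indices and actions are satellite labels such as `"sat_A"`; the abstract does not describe the paper's actual state, action, or reward design.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


# Usage with hypothetical labels: in time slot 0 the antenna tracks "sat_A",
# receives a reward of 1.0, and moves on to time slot 1.
q = defaultdict(float)
new_value = q_update(q, state=0, action="sat_A", reward=1.0,
                     next_state=1, next_actions=["sat_A", "sat_B"])
```

No transition model appears anywhere in the update, which is what "model-free" refers to: the table is corrected only from sampled rewards and successor states.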

Subjectivity on Problem Based Learning(PBL) Experience of Freshmen in Nursing students (간호학과 신입생의 문제중심학습(PBL)의 경험에 관한 주관성연구)

  • Park, Ju-Young;Yang, Nam-Young
    • Journal of Digital Convergence / v.11 no.1 / pp.329-338 / 2013
  • Purpose: This study aimed to identify the types of subjectivity in the PBL experience of freshman nursing students. Method: This was an exploratory study using Q methodology. From a Q population of 102 items, 31 Q samples were selected, and Q sorting was performed by a P sample of 25 participants. After the Q sorting was completed on a nine-point scale, we interviewed the participants and documented their responses. The data were analyzed using the QUNAL program. Result: Four factors explained 71.6% of the total variance, and they were analyzed and categorized as four types. We named type 1 [positive pressure], type 2 [relational friendly], type 3 [creative benefit], and type 4 [participatory development]. Conclusion: For freshman nursing students, PBL was a valuable experience that was perceived from a variety of perspectives. These findings suggest that strategies for operating PBL efficiently should reflect the results above.

Nursing Students who have Experienced by Clinical practice Recognition type of Core Fundamental Nursing skills (임상실습을 경험한 간호대학생의 핵심기본간호술에 대한 인식유형)

  • Jeon, Mi-Kyung;Jung, Hyun-Jang
    • Journal of the Korea Convergence Society / v.9 no.3 / pp.297-305 / 2018
  • The purpose of this study is to identify, using Q methodology, how nursing students who have experienced clinical practice perceive core basic nursing skills. The subjects were 34 nursing students of graduate school who had experienced clinical practice, and 34 Q samples were used. The data were analyzed with the PC-QUAN program. Three types were identified: type 1, 'behavior-centered', recognizes that core basic nursing should be performed correctly according to the nursing situation; type 2, the 'future preparation type', concerns core nursing; and type 3, the 'dependent learning type', recognizes that sufficient learning in school is required for accurate nursing practice. Analysis of each type will contribute basic data for establishing an effective educational strategy.

A Subjectivity Study of Culinary Arts Major Students in Problem Based Learning(PBL) Program for Culinary Competition (조리전공 대학생의 요리경연대회 참가를 위한 문제중심학습(PBL) 적용사례연구)

  • Shin, Seoung-Hoon;Kim, Chan-Woo
    • The Journal of the Korea Contents Association / v.19 no.8 / pp.598-608 / 2019
  • This study analyzed the subjectivity of culinary arts students in a problem-based learning (PBL) program for a culinary competition. Q methodology was employed to find common characteristics among the students' opinions and to generate suggestions for the future. The study found four types: the Problem-Solving Ability Type (N=6), the Team Member Collaboration Important Type (N=8), the Self-Directed Learning Needed Type (N=3), and the Employment Preparation Type (N=2). The analysis shows that students perceive this PBL program as a means of developing problem-solving skills, understanding group collaboration, recognizing the importance of self-directed learning, and preparing to secure job opportunities. The study also suggests that the educator needs to act as a negotiator in collaboration among group members and to take an active approach to stimulating students' motivation to study.

Stealthy Behavior Simulations Based on Cognitive Data (인지 데이터 기반의 스텔스 행동 시뮬레이션)

  • Choi, Taeyeong;Na, Hyeon-Suk
    • Journal of Korea Game Society / v.16 no.2 / pp.27-40 / 2016
  • Predicting stealthy behaviors plays an important role in designing stealth games. It is, however, difficult to automate this task because human players interact with dynamic environments in real time. In this paper, we present a reinforcement learning (RL) method for simulating stealthy movements in dynamic environments, in which an integrated model of Q-learning and artificial neural networks (ANNs) is used as an action classifier. Experimental results show that our simulation agent responds sensitively to dynamic situations and is therefore useful for game level designers in determining various game parameters.

Design of Emotional Learning Controllers for AC Voltage and Circulating Current of Wind-Farm-Side Modular Multilevel Converters

  • Li, Keli;Liao, Yong;Liu, Ren;Zhang, Jimiao
    • Journal of Power Electronics / v.16 no.6 / pp.2294-2305 / 2016
  • The introduction of a high-voltage direct-current (HVDC) system based on a modular multilevel converter (MMC) for wind farm integration has stimulated studies on methods to control this type of converter. This research article focuses on the control of the AC voltage and circulating current for a wind-farm-side MMC (WFS-MMC). After theoretical analysis, emotional learning (EL) controllers are proposed for the controls. The EL controllers are derived from the learning mechanisms of the amygdala and orbitofrontal cortex which make the WFS-MMC insensitive to variance in system parameters, power change, and fault in the grid. The d-axis and q-axis currents are respectively considered for the d-axis and q-axis voltage controls to improve the performance of AC voltage control. The practicability of the proposed control is verified under various conditions with a point-to-point MMC-HVDC system. Simulation results show that the proposed method is superior to the traditional proportional-integral controller.

Opportunistic Spectrum Access with Discrete Feedback in Unknown and Dynamic Environment: A Multi-agent Learning Approach

  • Gao, Zhan;Chen, Junhong;Xu, Yuhua
    • KSII Transactions on Internet and Information Systems (TIIS) / v.9 no.10 / pp.3867-3886 / 2015
  • This article investigates the problem of opportunistic spectrum access in a dynamic environment in which the signal-to-noise ratio (SNR) is time-varying. Unlike existing work on continuous feedback, we consider more practical scenarios in which the transmitter receives an Acknowledgment (ACK) if the received SNR is larger than the required threshold, and otherwise a Non-Acknowledgment (NACK). That is, the feedback is discrete. Several applications with different threshold values are also considered in this work. The channel selection problem is formulated as a non-cooperative game, which is then proved to be a potential game and therefore has at least one pure-strategy Nash equilibrium. Following this, a multi-agent Q-learning algorithm is proposed to converge to Nash equilibria of the game. Furthermore, opportunistic spectrum access with multiple discrete feedbacks is also investigated. Finally, the simulation results verify that the proposed multi-agent Q-learning algorithm is applicable to both situations with binary feedback and multiple discrete feedbacks.
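The binary-feedback setting can be illustrated with a small simulation. This is a hedged sketch, not the algorithm proposed in the article: it assumes stateless epsilon-greedy Q-learners, a reward of 1 for ACK and 0 for NACK, and a toy SNR model (an exponential draw with mean 10 divided by the number of users sharing the channel). The names `ack_feedback`, `StatelessQAgent`, and `simulate`, and all parameter values, are hypothetical.

```python
import random


def ack_feedback(snr, threshold):
    """Binary feedback: 1 (ACK) if the SNR exceeds the threshold, else 0 (NACK)."""
    return 1 if snr > threshold else 0


class StatelessQAgent:
    """Keeps one Q-value per channel and learns only from ACK/NACK rewards."""

    def __init__(self, n_channels, rng, alpha=0.1, epsilon=0.1):
        self.q = [0.0] * n_channels
        self.rng = rng
        self.alpha = alpha        # learning rate
        self.epsilon = epsilon    # exploration probability

    def choose(self):
        """Epsilon-greedy channel selection with random tie-breaking."""
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q))
        best = max(self.q)
        return self.rng.choice([c for c, v in enumerate(self.q) if v == best])

    def learn(self, channel, reward):
        """Move the channel's Q-value toward the observed reward."""
        self.q[channel] += self.alpha * (reward - self.q[channel])


def simulate(n_agents=2, n_channels=3, threshold=5.0, rounds=2000, seed=0):
    """Run the toy game and return the trained agents."""
    rng = random.Random(seed)
    agents = [StatelessQAgent(n_channels, rng) for _ in range(n_agents)]
    for _ in range(rounds):
        picks = [agent.choose() for agent in agents]
        for agent, channel in zip(agents, picks):
            sharers = picks.count(channel)
            # Toy SNR model (assumption): random draw, degraded by sharing.
            snr = rng.expovariate(1 / 10.0) / sharers
            agent.learn(channel, ack_feedback(snr, threshold))
    return agents


agents = simulate(rounds=500)
```

Each agent observes only its own ACK/NACK outcome, never the SNR itself or the other agents' choices, which mirrors the discrete-feedback constraint described in the abstract. The sketch makes no convergence claim.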

The Effect of Input Variables Clustering on the Characteristics of Ensemble Machine Learning Model for Water Quality Prediction (입력자료 군집화에 따른 앙상블 머신러닝 모형의 수질예측 특성 연구)

  • Park, Jungsu
    • Journal of Korean Society on Water Environment / v.37 no.5 / pp.335-343 / 2021
  • Water quality prediction is essential for the proper management of water supply systems. Increased suspended sediment concentration (SSC) affects water supply systems in various ways, such as increased treatment cost, and consequently there have been various efforts to develop models for predicting SSC. However, SSC is affected by both the natural and the anthropogenic environment, which makes it challenging to predict. Recently, advanced machine learning models have increasingly been used for water quality prediction. This study developed an ensemble machine learning model to predict SSC using the XGBoost (XGB) algorithm. The observed discharge (Q) and SSC at two field monitoring stations were used to develop the model. The input variables were clustered into two groups with low and high ranges of Q using the k-means clustering algorithm. Each group of data was then used separately to optimize XGB (Model 1). The model performance was compared with that of the XGB model trained on the entire data set (Model 2). The models were evaluated by the root mean squared error-observation standard deviation ratio (RSR) and the root mean squared error. For Model 2, the RSR was 0.51 and 0.57 at the two monitoring stations, respectively, whereas Model 1 improved performance to RSR values of 0.46 and 0.55, respectively.
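The RSR metric used above divides the root mean squared error by the standard deviation of the observations, so 0 indicates a perfect fit and 1 means the model is no better than always predicting the observed mean. The sketch below computes it in plain Python; the function names and sample values are illustrative only and do not come from the study's data.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between two equal-length sequences."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def rsr(observed, predicted):
    """RMSE divided by the (population) standard deviation of the observations."""
    mean_obs = sum(observed) / len(observed)
    sd_obs = math.sqrt(
        sum((o - mean_obs) ** 2 for o in observed) / len(observed))
    return rmse(observed, predicted) / sd_obs


# Illustrative values only (not data from the study).
observed = [1.0, 2.0, 3.0, 4.0]
perfect = rsr(observed, observed)               # exact predictions
baseline = rsr(observed, [2.5, 2.5, 2.5, 2.5])  # always predict the mean
```

Because the same normalization is applied to every station, RSR values can be compared across sites with different SSC ranges, which plain RMSE does not allow.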