Search | Korea Science

Explicit Dynamic Coordination Reinforcement Learning Based on Utility

Si, Huaiwei;Tan, Guozhen;Yuan, Yifu;peng, Yanfei;Li, Jianping
- KSII Transactions on Internet and Information Systems (TIIS) / v.16 no.3 / pp.792-812 / 2022
Multi-agent systems often need coordination to learn a task more effectively. Although deep learning has addressed the problem of large state spaces, multi-agent learning remains infeasible because of the size of the joint action space. A large joint action space can be made sparse by an implicit or explicit coordination structure, which restricts agents to reasonable coordinated actions. Multi-agent systems are generally dynamic, so the relations among agents and the coordination structure are dynamic as well. An explicit coordination structure can therefore better represent the coordination relationships among agents and achieve better coordination between them. Inspired by the maximization of social group utility, we dynamically construct a factor graph as an explicit coordination structure that expresses the coordination relationships according to the utility among agents, and we estimate the joint action values based on the local utility transfer among factor graphs. We apply these techniques to a multiple intelligent vehicle system, where the state and action spaces are problematically large and the agents interact heavily. The results on the multiple intelligent vehicle system demonstrate the efficiency and effectiveness of the proposed methods.
https://doi.org/10.3837/tiis.2022.03.003
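
The idea of estimating a joint action value from local utilities on a factor graph can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes pairwise factors only, a fixed (not dynamically built) graph, and brute-force maximization, and the names `joint_value`, `best_joint_action`, and the utility tables are hypothetical.

```python
from itertools import product

def joint_value(factors, joint_action):
    """Sum the local utilities of all pairwise factors for one joint action.

    factors maps an agent pair (i, j) to a table keyed by (action_i, action_j).
    """
    return sum(
        table[(joint_action[i], joint_action[j])]
        for (i, j), table in factors.items()
    )

def best_joint_action(factors, n_agents, actions):
    """Enumerate every joint action and return the one with the highest value."""
    return max(
        product(actions, repeat=n_agents),
        key=lambda a: joint_value(factors, a),
    )

# Hypothetical utilities for three agents linked in a chain 0 - 1 - 2.
factors = {
    (0, 1): {(0, 0): 1, (0, 1): 0, (1, 0): 0, (1, 1): 2},
    (1, 2): {(0, 0): 3, (0, 1): 0, (1, 0): 0, (1, 1): 1},
}
best = best_joint_action(factors, n_agents=3, actions=(0, 1))
```

Enumeration grows exponentially with the number of agents, which is why sparse coordination structures are normally paired with message passing rather than a full search.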

Development of Semi-Active Control Algorithm Using Deep Q-Network (Deep Q-Network를 이용한 준능동 제어알고리즘 개발)

Kim, Hyun-Su;Kang, Joo-Won
- Journal of Korean Association for Spatial Structures / v.21 no.1 / pp.79-86 / 2021
The control performance of a smart tuned mass damper (TMD) depends mainly on its control algorithm. Many control strategies have been proposed for semi-active control devices, and machine learning has recently begun to be applied to the development of vibration control algorithms. In this study, reinforcement learning, one family of machine learning techniques, was used to develop a semi-active control algorithm for a smart TMD composed of a magnetorheological (MR) damper. An 11-story building structure with a smart TMD was selected to construct the reinforcement learning environment, and a time history analysis of this example structure under earthquake excitation was conducted in the reinforcement learning procedure. Among the various reinforcement learning algorithms, a deep Q-network (DQN) was used to build the learning agent. The command voltage sent to the MR damper is determined by the action produced by the DQN. Parametric studies on the hyper-parameters of the DQN were performed by numerical simulation. After an appropriate number of training iterations with proper hyper-parameters, a DQN model for controlling the seismic responses of the example structure with the smart TMD was obtained. The developed DQN model can effectively control the smart TMD to reduce the seismic responses of the example structure.
https://doi.org/10.9712/KASS.2021.21.1.79
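
The step in which a DQN action determines the command voltage might look like the sketch below. It assumes a discrete action set mapped to voltage levels and an epsilon-greedy policy; the values in `VOLTAGE_LEVELS` and the function name `select_voltage` are hypothetical, since the abstract does not give them.

```python
import random

# Hypothetical command voltages (V), one per discrete DQN action.
VOLTAGE_LEVELS = [0.0, 2.5, 5.0]

def select_voltage(q_values, epsilon=0.0, rng=random):
    """Pick an action epsilon-greedily and return its command voltage.

    q_values holds one estimated Q value per entry in VOLTAGE_LEVELS.
    """
    if rng.random() < epsilon:
        # Explore: choose a random action index.
        idx = rng.randrange(len(q_values))
    else:
        # Exploit: choose the action with the largest Q value.
        idx = max(range(len(q_values)), key=q_values.__getitem__)
    return VOLTAGE_LEVELS[idx]

# With epsilon = 0 the choice is purely greedy.
voltage = select_voltage([0.1, 0.7, 0.3])
```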

A Constructive Algorithm of Fuzzy Model for Nonlinear System Modeling (비선형 시스템 모델링을 위한 퍼지 모델 구성 알고리즘)

Choi, Jong-Soo
- Proceedings of the KIEE Conference / 1998.11b / pp.648-650 / 1998
This paper proposes a constructive algorithm that generates a Takagi-Sugeno type fuzzy model through sequential learning from a training data set. The proposed algorithm has a two-stage learning scheme that performs structure learning and parameter learning simultaneously. Structure learning constructs the fuzzy model using two growth criteria to assign new fuzzy rules for the given observation data. Parameter learning adjusts the parameters of the existing fuzzy rules using the LMS rule. To evaluate the proposed fuzzy modeling approach, a well-known benchmark is used in simulation and the results are compared with those of other modeling approaches.
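
For readers unfamiliar with the LMS rule mentioned above, one update step for a linear model can be sketched as below. This shows only the generic rule, w ← w + η·(y − w·x)·x, applied to a single linear consequent; how the paper combines it with rule firing strengths is not stated in the abstract, and `lms_step` is a hypothetical name.

```python
def lms_step(weights, x, target, lr=0.05):
    """Apply one least-mean-squares update and return (new_weights, error)."""
    prediction = sum(w * xi for w, xi in zip(weights, x))
    error = target - prediction
    # Move each weight along the input, scaled by the prediction error.
    new_weights = [w + lr * error * xi for w, xi in zip(weights, x)]
    return new_weights, error

# One step from zero weights on a single sample.
w, err = lms_step([0.0, 0.0], x=[1.0, 2.0], target=1.0, lr=0.1)
```

Repeating the step over the training samples drives the squared error down, provided the learning rate is small enough for the input scale.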

The Content Structure of the Navigation Course Using Learning Hierarchy (학습위계에 의한 항해교과의 내용 구조화)

Yoon, Hyun-Sang
- Journal of Fisheries and Marine Sciences Education / v.6 no.2 / pp.198-216 / 1994
How to improve instructional effect by reorganizing textbook content is a major concern of many education theorists and teachers. Much research on this problem shows that reorganizing textbook content improves learners' recall and problem-solving ability. The content of the current navigation textbook has a categorical structure as its basic framework, though a seemingly poor one. A categorical structure is known to provide learners with an inferior information-processing mechanism compared with a learning hierarchy content structure. Furthermore, the current content structure gives no consideration to navigation in practice, spatial contexts, or the sequence of events as a ship sails from one harbor to another. The learning hierarchy content structure has the advantage of giving learners more systematic and stronger knowledge networks than a categorical structure.

Solving Continuous Action/State Problem in Q-Learning Using Extended Rule Based Fuzzy Inference System

Kim, Min-Soeng;Lee, Ju-Jang
- Transactions on Control, Automation and Systems Engineering / v.3 no.3 / pp.170-175 / 2001
Q-learning is a kind of reinforcement learning in which the agent solves a given task based on rewards received from the environment. Most research on Q-learning has focused on discrete domains, although the environment with which the agent must interact is generally continuous. Methods are therefore needed that make Q-learning applicable to continuous problem domains. In this paper, an extended fuzzy rule is proposed so that it can incorporate Q-learning. The interpolation technique, which is widely used in memory-based learning, is adopted to represent the appropriate Q value for the current state-action pair in each extended fuzzy rule. The resulting structure, based on the fuzzy inference system, is capable of handling continuous states of the environment. The effectiveness of the proposed structure is shown through simulation on the cart-pole system.
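
As background for the discrete case that the paper extends, the standard tabular Q-learning update can be sketched as follows. This is the textbook rule only; the extended fuzzy rules and interpolation described in the abstract are not reproduced, and `q_update` and the state and action labels are hypothetical.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    q is a dict of dicts: q[state][action] -> estimated return.
    """
    next_actions = q.get(next_state, {})
    # Value of the best known action in the next state (0 if unseen).
    best_next = max(next_actions.values()) if next_actions else 0.0
    old = q.setdefault(state, {}).get(action, 0.0)
    q[state][action] = old + alpha * (reward + gamma * best_next - old)
    return q[state][action]

q_table = {}
first = q_update(q_table, "s0", "push_left", reward=1.0, next_state="s1")
```

A table like this needs a finite set of states and actions, which is the limitation that motivates continuous-domain extensions.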

The structural relationship among task value, self-efficacy, goal structure, and academic emotions for promoting self-regulated learning in e-learning course (이러닝 수업에서 대학생의 자기조절학습에 영향을 미치는 과제가치, 자기효능감, 수업 성취목표구조, 학업정서 간의 구조적 관계)

You, Ji-Won
- The Journal of Korean Association of Computer Education / v.15 no.4 / pp.61-77 / 2012
The purpose of this study was to examine the structural relationship among task value, self-efficacy, classroom goal structure, and academic emotions (enjoyment, fear, boredom) for promoting self-regulated learning in an e-learning course. The results showed that task value, self-efficacy, and classroom goal structure influenced academic emotions and self-regulated learning, and that enjoyment mediated the relationship between the exogenous variables and self-regulated learning. The findings offer implications for facilitating self-regulated learning while taking academic emotions into account.

A study on the interrelation between the structure of a plant, the structure of a neural network emulator, and the learning rate (플랜트구조와 신경망에뮬레이터의 구조 및 학습시간과의 관계)

Pae, Chang-Han;Lee, Kwang-Won
- Proceedings of the KIEE Conference / 1997.07b / pp.386-389 / 1997
Error backpropagation has been used in the bulk of practical applications of neural networks, where an emulator, a multilayered neural network, learns to identify the system's dynamic characteristics. There are, however, no concrete theoretical results on the relation between the structure of a plant, the structure of a multilayered neural network, and the learning rate. This paper investigates that relation. A simulation study shows that a plant signal with a short period and a fast sampling time is preferable for training the network emulator.

Genetic algorithm based deep learning neural network structure and hyperparameter optimization (유전 알고리즘 기반의 심층 학습 신경망 구조와 초모수 최적화)

Lee, Sanghyeop;Kang, Do-Young;Park, Jangsik
- Journal of Korea Multimedia Society / v.24 no.4 / pp.519-527 / 2021
Alzheimer's disease is one of the challenges of the coming aging era, and attempts are being made to diagnose and predict it through various biomarkers. While deep learning-based technologies have recently spread across the medical industry as powerful imaging technologies, empirical design is difficult because there are many deep learning neural network architectures and categorical hyperparameters, and the right choice depends on the problem and the data. In this paper, we show that a deep learning neural network structure and its hyperparameters can be optimized with a genetic algorithm, using a representative deep learning neural network architecture for Alzheimer's disease classification on amyloid brain images. The optimal network structure and hyperparameters were observed to be selected as the experimental values converged.
https://doi.org/10.9717/kmms.2020.24.4.519

A Matrix-Based Genetic Algorithm for Structure Learning of Bayesian Networks

Ko, Song;Kim, Dae-Won;Kang, Bo-Yeong
- International Journal of Fuzzy Logic and Intelligent Systems / v.11 no.3 / pp.135-142 / 2011
Unlike previous genetic algorithms for Bayesian structure learning, which use a sequence-based representation for a chromosome, we propose a genetic algorithm based on a matrix representation. Since a good chromosome representation helps in developing efficient genetic operators that maintain a functional link between parents and their offspring, we represent a chromosome as a matrix, a general and intuitive data structure for a directed acyclic graph (DAG), which is the structure of a Bayesian network. This matrix-based genetic algorithm enables more efficient genetic operators for structuring a Bayesian network: a probability matrix and a transpose-based mutation operator that inherit a structure with the correct edge direction and enhance the diversity of the offspring. To show the performance of the proposed method, we compared it with two well-known genetic algorithms using two Bayesian network scoring measures.
https://doi.org/10.5391/IJFIS.2011.11.3.135
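
The matrix representation of a chromosome can be illustrated with an adjacency matrix, where `adj[i][j] == 1` means an edge from node i to node j. The sketch below is an assumption-laden illustration rather than the paper's operators: `reverse_edge` swaps a single pair of transposed entries as a stand-in for a transpose-based mutation, and `is_dag` checks that the offspring is still acyclic. Both names are hypothetical.

```python
def is_dag(adj):
    """Return True if the adjacency matrix describes a directed acyclic graph."""
    n = len(adj)
    indegree = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    ready = [j for j in range(n) if indegree[j] == 0]
    visited = 0
    while ready:
        u = ready.pop()
        visited += 1
        for v in range(n):
            if adj[u][v]:
                indegree[v] -= 1
                if indegree[v] == 0:
                    ready.append(v)
    # Every node is removed exactly when no cycle exists.
    return visited == n

def reverse_edge(adj, i, j):
    """Return a copy of adj with entries (i, j) and (j, i) swapped."""
    child = [row[:] for row in adj]
    child[i][j], child[j][i] = child[j][i], child[i][j]
    return child

chain = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]  # 0 -> 1 -> 2
mutated = reverse_edge(chain, 0, 1)         # 1 -> 0 and 1 -> 2
```

A mutation can introduce a cycle, so an acyclicity check (or a repair step) is needed before an offspring is scored as a Bayesian network.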

Differentially Responsible Adaptive Critic Learning (DRACL) for the Self-Learning Control of Multiple-Input System (多入力 시스템의 자율학습제어를 위한 차등책임 적응비평학습)

Kim, Hyong-Suk
- Journal of the Korean Institute of Telematics and Electronics S / v.36S no.2 / pp.28-37 / 1999
A Differentially Responsible Adaptive Critic Learning (DRACL) technique is proposed for using reinforcement learning to learn control with multiple control inputs, as in robot systems. Reinforcement learning is a self-learning technique that learns a control skill from critic information obtained after a long series of control actions. Adaptive Critic Learning (ACL) is the representative reinforcement learning structure. ACL maximizes learning performance with two learning modules, the action module and the critic module, which exploit the infrequently obtained external critic value. A drawback of ACL is that its application is limited to single-input systems. In the proposed Differentially Responsible Action Dependent Adaptive Critic learning structure, the critic function is constructed as a function of the control input elements. The responsibility of each control action element is computed from the partial derivative of the critic function with respect to that element. The proposed learning structure was built with CMAC neural networks and simulated on the two-dimensional cart-pole system and a robot squatting problem. The simulation results are included.
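
The notion of assigning responsibility through partial derivatives of the critic can be sketched numerically. This is an illustration under assumptions, not the paper's CMAC-based implementation: `toy_critic` is a made-up differentiable critic, and `responsibilities` estimates each partial derivative with a central finite difference.

```python
def responsibilities(critic, u, h=1e-5):
    """Estimate d(critic)/d(u_k) for every control input element u_k."""
    grads = []
    for k in range(len(u)):
        up = list(u)
        down = list(u)
        up[k] += h
        down[k] -= h
        # Central difference approximation of the partial derivative.
        grads.append((critic(up) - critic(down)) / (2 * h))
    return grads

def toy_critic(u):
    """Hypothetical critic over two control inputs."""
    return -(u[0] ** 2 + 3.0 * u[1])

grads = responsibilities(toy_critic, [1.0, 2.0])
```

An input element with a larger-magnitude derivative has more influence on the critic value at that operating point, which is one way to read "differential responsibility".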

Search Result 2,166, Processing Time 0.024 seconds
