• Title/Summary/Keyword: Multi-learning System

Thompson sampling based path selection algorithm in multipath communication system (다중경로 통신 시스템에서 톰슨 샘플링을 이용한 경로 선택 기법)

  • Chung, Byung Chang
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.12 / pp.1960-1963 / 2021
  • In this paper, we propose a multi-play Thompson sampling algorithm for multipath communication systems. A multipath communication system offers advantages in communication capacity, robustness, survivability, and so on. It is important to select appropriate network paths according to the status of each individual path. However, it is hard to obtain quality information for all paths simultaneously. To address this issue, we apply Thompson sampling, which is popular in the machine learning field. We found some issues when the algorithm is applied directly to the proposed system and suggest some modifications. Through simulation, we verified that the proposed algorithm can utilize all of the network paths. In summary, the proposed algorithm can be applied as a path allocation method in multipath-based communication systems.
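
The abstract above does not give the algorithm's details, so the following is only a minimal sketch of plain multi-play Thompson sampling, not the authors' modified version. It assumes each path delivers a packet with a fixed, unknown success probability (a Bernoulli reward) and that `k` paths are used per round; the function name `multiplay_thompson` and all parameters are hypothetical.

```python
import random


def multiplay_thompson(success_probs, k, rounds, seed=0):
    """Plain multi-play Thompson sampling with Beta-Bernoulli posteriors.

    success_probs: assumed true per-path success probabilities (unknown to the learner).
    k: number of paths selected in each round.
    Returns (plays, alpha, beta), where plays[i] counts how often path i was used.
    """
    rng = random.Random(seed)
    n = len(success_probs)
    alpha = [1.0] * n  # Beta(1, 1) prior: one pseudo-success per path
    beta = [1.0] * n   # and one pseudo-failure per path
    plays = [0] * n

    for _ in range(rounds):
        # Draw one sample from each path's posterior.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(n)]
        # Use the k paths with the largest sampled quality.
        chosen = sorted(range(n), key=lambda i: samples[i], reverse=True)[:k]
        for i in chosen:
            plays[i] += 1
            # Observe success or failure only on the paths actually used.
            if rng.random() < success_probs[i]:
                alpha[i] += 1.0
            else:
                beta[i] += 1.0
    return plays, alpha, beta


if __name__ == "__main__":
    plays, _, _ = multiplay_thompson([0.9, 0.5, 0.2], k=2, rounds=500)
    print(plays)
```

Because feedback arrives only for the selected paths, the sampling step is what keeps occasionally trying paths whose quality is still uncertain.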

Application of reinforcement learning to hyper-redundant system: Acquisition of locomotion pattern of snake-like robot

  • Ito, K.;Matsuno, F.
    • Proceedings of the Korea Inteligent Information System Society Conference / 2001.01a / pp.65-70 / 2001
  • We consider a hyper-redundant system that consists of many uniform units. The hyper-redundant system has many degrees of freedom and can accomplish various tasks. Applying reinforcement learning to the hyper-redundant system is very attractive because it makes it possible to acquire various behaviors for various tasks automatically. In this paper we present a new reinforcement learning algorithm, "Q-learning with propagation of motion". The algorithm is designed for multi-agent systems that have strong connections. The proposed algorithm needs only one small Q-table even for a large-scale system, so it enables the hyper-redundant system to learn effective behavior. In this algorithm, only one leader agent learns its own behavior using its local information, and the motion of the leader is propagated to the other agents with a time delay. The reward of the leader agent is computed from the information of the whole system. Thus the effective behavior of the leader is learned and the effective behavior of the system is acquired. We apply the proposed algorithm to a snake-like hyper-redundant robot. The necessary condition for the system to be a Markov decision process is discussed, and a computer simulation of learning the locomotion is demonstrated. From the simulation results we find that the task of moving the robot to the desired point is learned and that a winding motion is acquired. We conclude that the proposed system and our analysis of the condition under which the system is a Markov decision process are valid.

A Learning Method of LQR Controller Using Jacobian (자코비안을 이용한 LQR 제어기 학습법)

  • Lim, Yoon-Kyu;Chung, Byeong-Mook
    • Journal of the Korean Society for Precision Engineering / v.22 no.8 s.173 / pp.34-41 / 2005
  • Generally, it is not easy to obtain a suitable controller for multivariable systems. If the modeling equation of the system can be found, an LQR controller can be obtained as an optimal solution. This paper suggests an LQR learning method that designs an LQR controller without the modeling equation. The proposed algorithm uses the same cost function of error and input energy that LQR uses, and the LQR controller is trained to reduce this function. In the training process, the Jacobian matrix, which indicates the converging direction of the controller, is used. The Jacobian describes the relationship between output variations and input variations and can be found approximately by simple experiments. Simulations of a multivariable hydrofoil catamaran confirm that the LQR controller can be trained using the approximate Jacobian matrix instead of the modeling equation, and that the resulting controller is no worse than the traditional LQR controller.
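
The abstract above does not state the cost function explicitly. As a reference point, a standard discrete-time quadratic cost on error and input energy, together with the Jacobian it mentions, can be written as follows. The symbols $C$, $e_k$, $u_k$, $Q$, $R$, $N$, and $y$ are assumed notation, not taken from the paper.

```latex
% Quadratic cost on tracking error e_k and control input u_k (assumed discrete-time form)
\[
C = \sum_{k=0}^{N} \left( e_k^{\top} Q \, e_k + u_k^{\top} R \, u_k \right),
\qquad Q \succeq 0, \quad R \succ 0
\]
% Jacobian: variation of output y_i per variation of input u_j,
% approximated by a finite difference measured in a simple experiment
\[
J_{ij} = \frac{\partial y_i}{\partial u_j} \approx \frac{\Delta y_i}{\Delta u_j}
\]
```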

Game Sprite Generator Using a Multi Discriminator GAN

  • Hong, Seungjin;Kim, Sookyun;Kang, Shinjin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.8 / pp.4255-4269 / 2019
  • This paper proposes an image generation method using a Multi Discriminator Generative Adversarial Net (MDGAN) as a next-generation 2D game sprite creation technique. The proposed GAN is an autoencoder-based model that receives three kinds of information (color, shape, and animation) and combines them into new images. The model consists of two encoders that extract color and shape from each image, and a decoder that takes all the values from the encoders and generates an animated image. We also suggest an image processing technique, applied during the learning process, that removes noise from the generated images. The resulting images show that 2D game sprites can be generated by independently learning the three image attributes of shape, color, and animation. The proposed system can increase the productivity of large-scale 2D image modification work during game development. The experimental results demonstrate that our MDGAN can be used for 2D sprite generation and modification with little manual cost.

An Automatic Cooperative coordination Model for the Multiagent System using Reinforcement Learning (강화학습을 이용한 멀티 에이전트 시스템의 자동 협력 조정 모델)

  • 정보윤;윤소정;오경환
    • Korean Journal of Cognitive Science / v.10 no.1 / pp.1-11 / 1999
  • Agent-based systems technology has generated a lot of excitement in recent years because of its promise as a new paradigm for conceptualizing, designing, and implementing software systems. In particular, there has been much research on multiagent systems because they fit distributed and open Internet environments. In a multiagent system, agents must cooperate with each other through a coordination procedure when conflicts between agents arise; such conflicts occur because each agent acts for its own purpose without coordination. However, previous coordination methods for multiagent systems cannot correctly solve the cooperation problem between agents that have different goals in a dynamic environment. In this paper, we solve the cooperation problem of multiple agents with multiple goals in a dynamic environment using an automatic cooperative coordination model based on reinforcement learning. We present two pursuit problems, which extend a traditional problem in the multiagent systems area to model the restrictions of multiple goals in a dynamic environment, and we verify the validity of the proposed model experimentally.

A Multi-Agent Simulation for the Electricity Spot Market

  • Oh, Hyungna
    • Proceedings of the Korea Inteligent Information System Society Conference / 2003.05a / pp.255-263 / 2003
  • A multi-agent system designed to represent newly deregulated electricity markets in the USA is aimed at testing the capability of the multi-agent model to replicate the observed price behavior in the wholesale market and at developing smart business intelligence that quickly searches for the optimum offer strategy in response to changes in the market environment. Simulation results show that, when demand is low, the optimum offer strategy is to withhold expensive generating units and submit relatively low offers, regardless of firm size. During a period of high demand, the optimum strategy for a large firm is either to withhold capacity or to speculate, while for a small firm it is to be a price taker. All in all, the offer pattern observed in the market is close to the optimum strategy. From the firm's perspective, demand-side participation as well as intense competition dramatically reduces the chance of high excess profit.

Variational Auto-Encoder Based Semi-supervised Learning Scheme for Learner Classification in Intelligent Tutoring System (지능형 교육 시스템의 학습자 분류를 위한 Variational Auto-Encoder 기반 준지도학습 기법)

  • Jung, Seungwon;Son, Minjae;Hwang, Eenjun
    • Journal of Korea Multimedia Society / v.22 no.11 / pp.1251-1258 / 2019
  • An intelligent tutoring system enables users to learn effectively by utilizing various artificial intelligence techniques. For instance, it can recommend a suitable curriculum or learning method to individual users based on their learning history. To do this effectively, users' characteristics need to be analyzed and classified based on various aspects such as interest, learning ability, and personality. Although data labeled with these characteristics are required for more accurate classification, it is not easy to acquire a sufficient amount of labeled data because of the labeling cost. Unlabeled data, on the other hand, need no labeling process, so a large amount of it can be collected and utilized. In this paper, we propose a semi-supervised learning method based on a feedback variational auto-encoder (FVAE), which uses both labeled and unlabeled data. FVAE is a variation of the variational auto-encoder (VAE) in which a multi-layer perceptron is added to give feedback. Using unlabeled data, we train the FVAE and take its encoder. We then extract features from labeled data using the encoder and train classifiers on the extracted features. In the experiments, FVAE-based semi-supervised learning outperformed the VAE-based method in terms of accuracy and F1 score.

Vibration Optimization Using Immune-GA Algorithm (면역-유전알고리즘을 이용한 진동최적화)

  • 최병근;양보석
    • Proceedings of the Korean Society for Noise and Vibration Engineering Conference / 1998.04a / pp.273-279 / 1998
  • An immune system has powerful abilities such as memory, recognition, and learning to respond to invading antigens, and has recently been applied to many engineering algorithms. In this paper, a combined optimization algorithm is proposed for multi-optimization problems by introducing into the genetic algorithm the capability of the immune system that controls the proliferation of clones. The optimizing ability of the proposed algorithm is examined using two multi-peak functions that have many local optima and by optimizing the unbalance response function of a rotor model.

Automatically Bending Process control for Shaft Straightening Machine (축교정기를 위한 자동굽힘공정제어기 설계)

  • 김승철
    • Proceedings of the Korean Society of Machine Tool Engineers Conference / 1998.10a / pp.54-59 / 1998
  • In order to minimize the straightness error of deflected shafts, an automatic bending process control system is designed, fabricated, and studied. A multi-step straightening process and a three-point bending process are developed for geometric adaptive straightness control. The load-deflection relationship, on-line identification of variations in material properties, and on-line springback prediction are studied for the three-point bending process. The selection of a loading point and supporting condition is derived from fuzzy inference and a fuzzy self-learning method in the multi-step straightening process. An automatic straightening machine is fabricated using the developed ideas. The validity of the proposed system is verified through experiments.

Nonlinear System Modeling Based on Multi-Backpropagation Neural Network (다중 역전파 신경회로망을 이용한 비선형 시스템의 모델링)

  • Baeg, Jae-Huyk;Lee, Jung-Moon
    • Journal of Industrial Technology / v.16 / pp.197-205 / 1996
  • In this paper, we propose a new neural architecture. We synthesize the architecture from a combination of structures known as MRCCN (Multi-resolution Radial-basis Competitive and Cooperative Network) and BPN (Backpropagation Network). The proposed neural network improves on the learning speed of MRCCN and the mapping capability of BPN. The ability and effectiveness of the proposed architecture in identifying a nonlinear dynamic system are demonstrated by computer simulation.
