Integrated Search | Korea Science

Generating Cooperative Behavior by Multi-Agent Profit Sharing on the Soccer Game

Miyazaki, Kazuteru;Terada, Takashi;Kobayashi, Hiroaki
- Proceedings of the Korean Institute of Intelligent Systems Conference / Korea Fuzzy Logic and Intelligent Systems Society, ISIS 2003 / pp.166-169 / 2003
Reinforcement learning is a kind of machine learning. It aims to adapt an agent to a given environment using rewards and penalties as clues. Q-learning [8], a representative reinforcement learning system, treats a reward and a penalty at the same time, which raises the problem of how to choose appropriate reward and penalty values. The Penalty Avoiding Rational Policy Making algorithm (PARP) [4] and Penalty Avoiding Profit Sharing (PAPS) [2] are reinforcement learning systems that treat a reward and a penalty independently. Although PAPS is a descendant of PARP, both tend to learn a locally optimal policy. To overcome this, in this paper we propose the Multi Best method (MB), which is PAPS combined with the multi-start method [5]. MB selects the best policy among several policies learned by PAPS agents. By applying Profit Sharing (PS), PAPS, and MB to a soccer game environment based on SoccerBots [9], we show that MB is the best solution for the soccer game environment.
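The selection step described in this abstract (train several learners independently, keep the best one) can be illustrated with a minimal multi-start sketch. This is not the authors' MB implementation: `multi_best`, `toy_train`, and `toy_evaluate` are hypothetical names, and a simple hill climber on a two-peak function stands in for a PAPS agent that may get stuck in a local optimum.

```python
import random
from typing import Callable, List, Tuple, TypeVar

Policy = TypeVar("Policy")


def multi_best(train: Callable[[int], Policy],
               evaluate: Callable[[Policy], float],
               seeds: List[int]) -> Tuple[Policy, float]:
    """Train one policy per seed and return the best-scoring one."""
    if not seeds:
        raise ValueError("at least one seed is required")
    candidates = [train(seed) for seed in seeds]
    scores = [evaluate(policy) for policy in candidates]
    best_index = max(range(len(candidates)), key=lambda i: scores[i])
    return candidates[best_index], scores[best_index]


def toy_evaluate(x: float) -> float:
    """Hypothetical objective with a local peak at x=-2 and a higher peak at x=3."""
    return max(1.0 - abs(x + 2.0), 2.0 - abs(x - 3.0))


def toy_train(seed: int) -> float:
    """Hypothetical learner: hill climbing from a seed-dependent random start."""
    rng = random.Random(seed)
    x = rng.uniform(-5.0, 5.0)
    step = 0.1
    for _ in range(200):
        best = max((x - step, x, x + step), key=toy_evaluate)
        if best == x:  # no neighbor improves: a (possibly local) optimum
            break
        x = best
    return x


seeds = [0, 1, 2, 3, 4]
best_policy, best_score = multi_best(toy_train, toy_evaluate, seeds)
```

A single run can stop at the lower peak; taking the best of several independent runs can never do worse than any one of them, which is the point of the multi-start idea.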

Multi-Object Tracking based on Reliability Assessment of Learning in Mobile Environment

한우리;김영섭;이용환
- Journal of the Semiconductor & Display Technology / Vol. 14, No. 3 / pp.73-77 / 2015
This paper proposes an object tracking system for mobile environments based on reliability assessment of learning. The proposed system uses markerless tracking and consists of four modules: recognition, tracking, detecting, and learning. The recognition module detects and identifies an object in the current frame by matching it against the database using LSH over SURF features, and then generates standard object information that has the best learning reliability. The tracking module uses this standard object information to evaluate and learn the object that is being tracked successfully. When the system fails to track, the detecting module finds the object using the best available knowledge among the learned object information. The experimental results show that the proposed system can recognize and track reliable objects on a mobile platform by using reliability assessment of learning.

Learning System for Scientific Experiments with Multi-touch Screen and Tangible User Interface

김준우;맹준희;주지영;임광혁
- The Journal of the Korea Contents Association / Vol. 10, No. 8 / pp.461-471 / 2010
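Recently, augmented reality technology, which combines the real world and the virtual world and presents the result as digital content, has been emerging. To maximize the effect of augmented reality, tangible user interfaces are applied, which let users interact with digital content in a way similar to manipulating real-world objects. In education in particular, these technologies are expected to enable new learning content that raises learners' interest and immersion and maximizes learning effectiveness. This paper proposes a learning system for scientific experiments that applies a multi-touch screen and a tangible user interface. To replace the manipulation of experimental tools on a conventional laboratory bench, the system uses an experiment table equipped with a large multi-touch screen and a tangible learning device that can recognize simple user gestures. Real scientific experiments sometimes require high cost, long time, or hazardous materials; the proposed system overcomes these factors while offering learners a variety of experiments in a realistic way.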
https://doi.org/10.5392/JKCA.2010.10.8.461

Multi-Agent Control Strategy using Reinforcement Learning

이형일
- Journal of Korea Multimedia Society / Vol. 6, No. 5 / pp.937-944 / 2003
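The most important problems in a multi-agent system are for multiple agents to achieve a goal through efficient coordination with one another and to avoid collisions with other agents. This paper proposes a new strategy for efficiently achieving the goal of the prey pursuit problem. The proposed control strategy uses reinforcement learning to control multiple agents and takes into account the distance relationships and spatial relationships among the agents.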

Implementation of a Sightseeing Multi-function Controller Using Neural Networks

Jae-Kyung, Lee;Jae-Hong, Yim
- Journal of information and communication convergence engineering / Vol. 21, No. 1 / pp.45-53 / 2023
This study constructs various scenarios required for landscape lighting, and designs and implements a large-capacity, general-purpose multifunctional controller to validate their operation. The controller consists of a drive and control unit, which controls the scenarios and colors of the LED modules, and an LED display unit. In addition, we design a control system based on neuro-control that represents the most appropriate color according to the input temperature, illuminance, and humidity, and evaluate it by computer simulation. The results and output colors show that, unlike existing crisp logic, neuro-control does not require storing many data inputs, owing to the characteristics of artificial intelligence: the desired value can be controlled by training on learning data.
https://doi.org/10.56977/jicce.2023.21.1.45

LSTM-based Early Fire Detection System using Small Amount Data

Seonhwa Kim;Kwangjae Lee
- Journal of the Semiconductor & Display Technology / Vol. 23, No. 1 / pp.110-116 / 2024
Despite continuous advances in science and technology, fire accidents have not decreased over time, so there is a constant need for a system that can detect fires accurately at an early stage. However, most existing fire detection systems detect a fire only in the early stage of combustion, when smoke is generated, so rapid fire prevention actions may be delayed. We therefore propose an early fire detection system that detects fires at a reasonable cost using LSTM, a deep learning model, with multi-gas sensors that have high selectivity in the early decomposition stage rather than the smoke generation stage. The system combines multiple gas sensors to achieve faster detection than traditional sensors. In addition, through sliding-window techniques and model light-weighting, it keeps the false alarm rate low while maintaining the same high accuracy as existing deep learning models. These results suggest that the proposed early fire detection system is a meaningful contribution to the disaster and engineering fields.
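The sliding-window step mentioned in this abstract can be sketched as follows. This is only an illustration of how a multi-sensor time series is commonly cut into fixed-length sequences for a recurrent model; the function name `sliding_windows`, the window length, the stride, and the sample readings are all assumptions, not values from the paper.

```python
from typing import List, Sequence


def sliding_windows(readings: Sequence[Sequence[float]],
                    window: int,
                    stride: int = 1) -> List[List[List[float]]]:
    """Split a (timesteps x sensors) series into overlapping windows.

    Each returned window has shape (window x sensors), which is the usual
    per-sample input layout for a sequence model such as an LSTM.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    last_start = len(readings) - window
    return [[list(row) for row in readings[start:start + window]]
            for start in range(0, last_start + 1, stride)]


# Hypothetical readings: 5 timesteps from 2 gas sensors.
readings = [
    [0.10, 0.20],
    [0.11, 0.21],
    [0.15, 0.30],
    [0.25, 0.45],
    [0.40, 0.70],
]
windows = sliding_windows(readings, window=3, stride=1)
```

With 5 timesteps, a window of 3, and a stride of 1, this yields 3 overlapping windows, so even a small recording produces several training sequences.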

Multi-agent Q-learning based Admission Control Mechanism in Heterogeneous Wireless Networks for Multiple Services

Chen, Jiamei;Xu, Yubin;Ma, Lin;Wang, Yao
- KSII Transactions on Internet and Information Systems (TIIS) / Vol. 7, No. 10 / pp.2376-2394 / 2013
To ensure both overall system capacity and users' QoS requirements in heterogeneous wireless networks, the admission control mechanism should be well designed. In this paper, a Multi-agent Q-learning based Admission Control Mechanism (MQACM) is proposed to handle new and handoff call access problems appropriately. MQACM obtains the optimal decision policy by using the Multi-agent Q-learning (MQ) method, an improved form of the single-agent Q-learning method. This paper introduces the MQ method to solve the admission control problem in heterogeneous wireless networks. In addition, different priorities are allocated to multiple services so that MQACM performs well even in congested network scenarios. Both analysis and simulation results show that the proposed method not only outperforms existing schemes in call blocking probability and handoff dropping probability, but also has better network universality and stability than other schemes.
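For orientation, the single-agent Q-learning update that the MQ method is said to build on can be sketched as below. This is the standard tabular rule, not the paper's multi-agent variant; the state names, action names, reward, learning rate, and discount factor are hypothetical.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Iterable, Tuple

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def q_update(q: QTable,
             state: Hashable,
             action: Hashable,
             reward: float,
             next_state: Hashable,
             next_actions: Iterable[Hashable],
             alpha: float = 0.5,
             gamma: float = 0.9) -> float:
    """Apply Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])
    return q[(state, action)]


# Hypothetical admission-control framing: states are load levels,
# actions are admit/reject decisions for an arriving call.
actions = ("admit", "reject")
q: QTable = defaultdict(float)
first = q_update(q, "low_load", "admit", 1.0, "high_load", actions)
second = q_update(q, "low_load", "admit", 1.0, "high_load", actions)
```

Starting from an all-zero table, the first update moves the value halfway toward the reward of 1.0, and the second moves it halfway again, giving 0.5 and then 0.75.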
https://doi.org/10.3837/tiis.2013.10.003

Feedback-Based Iterative Learning Control for MIMO LTI Systems

Doh, Tae-Yong;Ryoo, Jung-Rae
- International Journal of Control, Automation, and Systems / Vol. 6, No. 2 / pp.269-277 / 2008
This paper proposes a necessary and sufficient condition for convergence in the $L_2$-norm sense for a feedback-based iterative learning control (ILC) system including a multi-input multi-output (MIMO) linear time-invariant (LTI) plant. It is shown that the convergence conditions for a nominal plant and an uncertain plant are equal to the nominal performance condition and the robust performance condition in feedback control theory, respectively. Moreover, no additional effort is required to design an iterative learning controller because the performance weighting matrix is used as the iterative learning controller. By proving that the least upper bound of the $L_2$-norm of the remaining tracking error is less than that of the initial tracking error, this paper shows that the iterative learning controller combined with the feedback controller reduces the tracking error more effectively than the feedback controller alone. The validity of the proposed method is verified through computer simulations.

A Course Scheduling Multi-Agent System using Learning Evaluation Analysis

박재표;이광형;이종희;전문석
- The Journal of Korean Association of Computer Education / Vol. 7, No. 1 / pp.97-106 / 2004
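Recently, orders for courseware tailored to learners' needs have been increasing, and accordingly the need for efficient, automated educational agents in web-based education systems has been recognized. This paper proposes a learner-centered course scheduling multi-agent system that uses a weakness analysis algorithm. The proposed system first analyzes the learner's learning evaluation results and calculates the learner's learning achievement, then applies this achievement to the agent's schedule to provide a course suited to the learner. By following this course, the learner carries out active mastery learning through repeated study matched to their ability.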

Reinforcement Learning Control using Self-Organizing Map and Multi-layer Feed-Forward Neural Network

Lee, Jae-Kang;Kim, Il-Hwan
- Proceedings of the Institute of Control, Robotics and Systems Conference / Institute of Control, Robotics and Systems, ICCAS 2003 / pp.142-145 / 2003
Many control applications using neural networks need a priori information about the objective system, but exact information about the objective system is impossible to obtain in the real world. Several control methods have been proposed to solve this problem, and reinforcement learning control using neural networks is one of them. Reinforcement learning control basically does not need a priori information about the objective system. It uses as input data a reinforcement signal, obtained from the interaction between the objective system and the environment, and the observable states of the objective system. However, many methods take too much time to be applied in the real world, so we focus on faster learning. Two data types are used for reinforcement learning. One is the reinforcement signal, which has only two fixed scalar values, assigned to the success and failure states respectively. The other is the observable state data. A real-world system has infinitely many states, so the number of observable state data is also infinite, which requires too much learning time. We therefore reduce the number of observable states by classifying states with a Self-Organizing Map. We also use neural dynamic programming for controller design. An inverted pendulum on a cart is simulated. The failure signal is used as the reinforcement signal; it occurs when the pendulum angle or cart position deviates from the defined control range. The control objective is to keep the pole balanced and the cart centered. Four states are used as the state signal: the position and velocity of the cart and the angle and angular velocity of the pole. The learning controller is a serial connection of a Self-Organizing Map and two Multi-layer Feed-Forward Neural Networks.
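The two-valued failure signal described in this abstract can be sketched as a small function. The abstract does not give the control range or the two scalar values, so the limits below (12 degrees and 2.4 m, commonly used in cart-pole benchmarks) and the values -1.0 and 0.0 are assumptions chosen only for illustration.

```python
import math

# Assumed control range; the abstract does not state the actual limits.
ANGLE_LIMIT_RAD = math.radians(12.0)
POSITION_LIMIT_M = 2.4

# Assumed scalar values for the failure and success states.
FAILURE_VALUE = -1.0
SUCCESS_VALUE = 0.0


def failure_signal(cart_position: float, pole_angle: float) -> float:
    """Return the reinforcement signal for one cart-pole state.

    The signal is FAILURE_VALUE when the cart position or the pole angle
    leaves the assumed control range, and SUCCESS_VALUE otherwise.
    """
    failed = (abs(cart_position) > POSITION_LIMIT_M
              or abs(pole_angle) > ANGLE_LIMIT_RAD)
    return FAILURE_VALUE if failed else SUCCESS_VALUE


balanced = failure_signal(cart_position=0.0, pole_angle=0.0)
off_track = failure_signal(cart_position=3.0, pole_angle=0.0)
fallen = failure_signal(cart_position=0.0, pole_angle=0.5)
```

Because the signal carries only two values, all other information about the system has to come from the observable state data, which is why the abstract concentrates on reducing the number of states.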

625 search results, processing time 0.028 s
