• Title/Summary/Keyword: Q-system (Q-시스템)

Extended Q-Learning under Multiple Subtasks (복수의 부분작업을 처리할 수 있는 확정된 Q-Learning)

  • 오도훈;이현숙;오경환
    • Korean Journal of Cognitive Science
    • /
    • v.12 no.1_2
    • /
    • pp.25-34
    • /
    • 2001
  • The direction of earlier AI research, which focused on managing knowledge, is shifting toward building systems that can adapt to dynamically changing external environments. Among the many learning methods that form the basic capability of such systems, reinforcement learning, proposed relatively recently, is easy to apply to general cases and has shown excellent adaptability in dynamic environments. Owing to these strengths, reinforcement learning is widely used in agent research. However, results so far show that there is a limit to the difficulty of tasks that agents built with reinforcement learning can solve. In particular, conventional reinforcement learning methods show limitations when handling tasks composed of multiple subtasks. This paper analyzes why tasks composed of multiple subtasks are hard to handle and proposes a way to deal with them. The proposed EQ-Learning solves the problem by extending Q-Learning, a representative reinforcement learning method: it learns a solution for each subtask and then finds an appropriate ordering of those learned results to solve the whole task. To verify the validity of EQ-Learning, experiments were conducted on a maze problem in a grid space composed of multiple subtasks.
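
The core idea, as I read it from the abstract, is to train one Q-table per subtask and then chain the learned policies in an appropriate order. The sketch below only illustrates that decomposition on a toy corridor task; the environment, rewards, and the fixed subtask order are my assumptions, not the paper's EQ-Learning.

```python
# A minimal sketch (my reading of the abstract, not the paper's EQ-Learning) of
# learning one Q-table per subtask and then chaining the learned policies.
import random
from collections import defaultdict

def q_learning(env_step, start, is_done, actions, episodes=200,
               alpha=0.5, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning for a single subtask with its own transition/reward function."""
    Q = defaultdict(lambda: {a: 0.0 for a in actions})
    for _ in range(episodes):
        s = start
        for _ in range(200):                      # step cap keeps the sketch safe
            if is_done(s):
                break
            a = (random.choice(actions) if random.random() < epsilon
                 else max(Q[s], key=Q[s].get))
            s2, r = env_step(s, a)
            Q[s][a] += alpha * (r + gamma * max(Q[s2].values()) - Q[s][a])
            s = s2
    return Q

def run_greedy(Q, env_step, start, is_done, max_steps=100):
    """Execute one learned subtask policy greedily, returning the state it ends in."""
    s = start
    for _ in range(max_steps):
        if is_done(s):
            break
        s, _ = env_step(s, max(Q[s], key=Q[s].get))
    return s

# Toy 1-D corridor standing in for the grid maze: subtask 1 reaches cell 5,
# subtask 2 continues from cell 5 to cell 9.
ACTIONS = [-1, +1]
def make_step(goal):
    def step(s, a):
        s2 = max(0, min(9, s + a))
        return s2, (1.0 if s2 == goal else -0.01)
    return step

subtasks = [(0, 5), (5, 9)]                        # (start, goal) per subtask
tables = [q_learning(make_step(g), s0, lambda s, g=g: s == g, ACTIONS)
          for s0, g in subtasks]

# EQ-Learning also learns the proper ordering of the subtask policies;
# the order is fixed here to keep the sketch short.
state = 0
for (s0, g), Q in zip(subtasks, tables):
    state = run_greedy(Q, make_step(g), state, lambda s, g=g: s == g)
print("final state after chaining the subtask policies:", state)
```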

Y-HisOnto: A History Ontology Model for Q&A System (Y-HisOnto: Q&A 시스템에서의 활용을 위한 역사 온톨로지 모형)

  • Lee, In Keun;Jung, Jason J.;Hwang, Dosam
    • Annual Conference on Human and Language Technology
    • /
    • 2013.10a
    • /
    • pp.156-159
    • /
    • 2013
  • This paper proposes Y-HisOnto, a history ontology model based on an event ontology that can represent historical knowledge involving temporal concepts. The proposed model represents knowledge about historical events using n-ary relationships formed by combining fragmentary pieces of knowledge expressed with the binary relationships used in existing ontologies. Based on the proposed model, we build an event-centered knowledge ontology and perform logical ontology query experiments on event-related questions, confirming that the proposed ontology model can be used effectively in Q&A systems.
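
As a small illustration of the binary-versus-n-ary distinction the abstract relies on (my own sketch, not the Y-HisOnto schema), a reified event node keeps participants, place, and time together instead of scattering them across separate triples:

```python
# A small sketch (illustration only, not the Y-HisOnto schema) contrasting binary
# triples with an n-ary, event-centered representation that carries the event's time.
from dataclasses import dataclass

# Binary relationships: each triple holds one fragment of knowledge, and the
# temporal context is easily detached from the fact it qualifies.
triples = [
    ("Yi_Sun-sin", "commanded", "Battle_of_Hansan_Island"),
    ("Battle_of_Hansan_Island", "occurredIn", "1592"),
]

# N-ary relationship: the event is reified as one node whose properties
# (participants with roles, place, time) stay together, as in an event ontology.
@dataclass
class HistoricalEvent:
    label: str
    participants: dict          # role -> entity
    place: str
    time: str

event = HistoricalEvent(
    label="Battle_of_Hansan_Island",
    participants={"commander": "Yi_Sun-sin", "opponent": "Wakisaka_Yasuharu"},
    place="Hansan_Island",
    time="1592",
)
print(event)
```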

Performance Test of Payload Data Receiving Equipment for STSAT-2 (과학기술위성 2호 탑재체데이터 수신시스템의 성능 시험)

  • Lee, Jong-Ju;Seo, In-Ho;Lee, Chol;Oh, Chi-Wook;Kim, Kyung-Hee;Park, Sung-Ok
    • Journal of the Korean Society for Aeronautical & Space Sciences
    • /
    • v.35 no.4
    • /
    • pp.347-352
    • /
    • 2007
  • This paper describes the design and implementation of the proto-flight model (PFM) of the Data Receiving Equipment (DRE) for the Science and Technology Satellite 2 (STSAT-2) and the results of the integrated performance test. The DRE consists of an X-band receiver, the Data Combine Equipment (DCE), and the Receiving and Archiving Computer (RAC). The DCE comprises an I&Q data combiner and an ECL signal distributor; the RAC comprises the Data Receiving Card (DRC) and the STSAT-2 Receiving and Archiving Software (ST2RAS). The X-band receiver receives 10 Mbps QPSK I and Q satellite data and sends the data to the DCE. The DRC stores the I&Q combined data from the DCE to a RAID. The pre-processing program sorts and stores satellite status data and payload data. The performance of the DRE in the functional and space-environment tests satisfies the requirements of STSAT-2.

Area-Based Q-learning Algorithm to Search Target Object of Multiple Robots (다수 로봇의 목표물 탐색을 위한 Area-Based Q-learning 알고리즘)

  • Yoon, Han-Ul;Sim, Kwee-Bo
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.15 no.4
    • /
    • pp.406-411
    • /
    • 2005
  • In this paper, we present area-based Q-learning for searching for a target object using multiple robots. To search for the target in a Markovian space, the robots should recognize the surroundings where they are located and generate rules to act upon by themselves. Under area-based Q-learning, a robot first obtains six distances from itself to the environment using infrared sensors allocated hexagonally around it. Second, it calculates six areas from those distances and then takes an action, i.e., turns and moves toward where the widest space is guaranteed. After the action is taken, the Q-value of the state is updated by the corresponding update formula. We set up an experimental environment with five small mobile robots, obstacles, and a target object, and had the robots search for the target while navigating an unknown hallway in which obstacles were placed. At the end of this paper, we present the results of three algorithms: random search, area-based action making (ABAM), and hexagonal area-based Q-learning.
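
A rough sketch of the area-based action-making step described above; the triangular area formula for adjacent rays 60° apart is my assumption of how the six areas are computed, not the paper's exact definition.

```python
# A minimal sketch (assumptions noted above) of turning six hexagonally arranged
# range readings into six areas and heading toward the widest one.
import math

def areas_from_distances(d):
    """Triangle areas spanned by each pair of adjacent rays spaced 60 degrees apart."""
    assert len(d) == 6
    return [0.5 * d[i] * d[(i + 1) % 6] * math.sin(math.pi / 3) for i in range(6)]

def choose_heading(d):
    """Index (0..5) of the sector with the widest free area; the robot turns toward it."""
    a = areas_from_distances(d)
    return max(range(6), key=lambda i: a[i])

# Hypothetical infrared readings (metres) around the robot.
readings = [0.4, 1.2, 2.0, 1.8, 0.5, 0.3]
sector = choose_heading(readings)
print("turn toward sector", sector, "then update Q for the (state, action) pair as usual")
```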

Electrically tunable current-mode high-Q bandpass filter

  • Tongkulboriboon, Seangrawee;Petchakit, Wijittra;Kiranon, Wiwat
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 2005.06a
    • /
    • pp.237-240
    • /
    • 2005
  • A novel current-mode high-Q bandpass filter with an electronically tunable Q based on second-generation current-controlled conveyors (CCCIIs) is presented. The circuit offers the advantage of using few passive elements. The center frequency and pole Q can be adjusted independently via the DC bias currents of the CCCIIs. SPICE simulation shows that the results agree well with the theoretical analysis.
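
For context, a generic second-order bandpass response (not this paper's specific transfer function) shows what independent tuning of the center frequency and pole Q means; in CCCII-based realizations the intrinsic resistance is usually set by a DC bias current.

```latex
% Generic second-order bandpass form (illustrative only, not this paper's circuit):
\[
  H(s) \;=\; \frac{(\omega_0/Q)\,s}{s^{2} + (\omega_0/Q)\,s + \omega_0^{2}},
  \qquad
  R_x \;\approx\; \frac{V_T}{2 I_B}.
\]
% In a bipolar CCCII the intrinsic resistance R_x is set by the bias current I_B, so the
% pole frequency \omega_0 (which scales with 1/R_x) and the pole Q can be moved through
% separate bias currents.
```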

Nonlinear control for robot manipulator (로보트 매니퓰레이터에 대한 비선형 제어)

  • 이종용;이승원;이상효
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 1990.10a
    • /
    • pp.263-268
    • /
    • 1990
  • This paper deals with a manipulator with actuators described by the equation $\bar{D}(q)\,\dddot{q} = u - \bar{p}(q,\dot{q},\ddot{q})$ with control input $u$. We employ a simple method of control design that has two stages. First, a global linearization is performed to yield a decoupled, controllable linear system. Then a controller is designed for this linear system. We provide a rigorous analysis of the effect of uncertain dynamics, which we study using time-domain robustness results based on a Lyapunov equation and the total stability theorem. Using this approach, we simulate the performance of the controller on a robotic manipulator with actuators.
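
A compact reading of the two-stage design (my reconstruction from the abstract, not the paper's exact derivation): the global linearization cancels the nonlinear terms, leaving a decoupled triple-integrator for which a linear controller is then designed.

```latex
% Stage 1: global (feedback) linearization, assuming \bar{D}(q) is invertible.
\[
  u = \bar{p}(q,\dot{q},\ddot{q}) + \bar{D}(q)\,v
  \quad\Longrightarrow\quad
  \dddot{q} = v .
\]
% Stage 2: a linear controller for the decoupled triple-integrator system, e.g.
\[
  v = \dddot{q}_d - k_2(\ddot{q}-\ddot{q}_d) - k_1(\dot{q}-\dot{q}_d) - k_0(q-q_d),
\]
% with gains chosen so that s^3 + k_2 s^2 + k_1 s + k_0 is Hurwitz.
```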

Region-based Q-learning for Autonomous Mobile Robot Navigation (자율 이동 로봇의 주행을 위한 영역 기반 Q-learning)

  • 차종환;공성학;서일홍
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • 2000.10a
    • /
    • pp.174-174
    • /
    • 2000
  • Q-learning, based on discrete state and action spaces, is one of the most widely used reinforcement learning methods. However, it requires a lot of memory and much learning time to cover all actions of each state when applied to real mobile robot navigation with continuous state and action spaces. Region-based Q-learning is a reinforcement learning method that estimates the action values of a real state by using a triangular action distribution model and the relationship with neighboring states that were defined and learned beforehand. This paper proposes a new region-based Q-learning that assigns a reward only when the agent reaches the target and escapes locally optimal paths by adjusting the random action rate. When applied to mobile robot navigation, it uses less memory, the robot moves smoothly, and an optimal solution can be learned quickly. To show the validity of our method, computer simulations are presented.
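
A minimal sketch of the interpolation idea: the action values of a real (continuous) state are estimated from neighboring discrete states with a triangular weighting. The grid, widths, and Q-values below are placeholders, not the paper's setup.

```python
# A minimal sketch (placeholders, not the paper's implementation) of estimating the
# action values of a continuous state from neighboring discrete states with a
# triangular (hat-shaped) weighting, as in region-based Q-learning.
import numpy as np

grid = np.linspace(0.0, 1.0, 11)           # discrete reference states along one axis
Q = np.random.rand(len(grid), 4)           # hypothetical Q-table: 11 states x 4 actions

def triangular_weights(x, centers, width):
    """Hat-shaped weights: 1 at a center, falling linearly to 0 at +/- width."""
    w = np.maximum(0.0, 1.0 - np.abs(x - centers) / width)
    return w / w.sum() if w.sum() > 0 else w

def q_of_continuous_state(x):
    """Interpolate action values of the real (continuous) state x from its neighbors."""
    w = triangular_weights(x, grid, width=0.1)
    return w @ Q                            # weighted combination over neighboring states

print(q_of_continuous_state(0.37))          # estimated action values for state x = 0.37
```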

Lane Change Methodology for Autonomous Vehicles Based on Deep Reinforcement Learning (심층강화학습 기반 자율주행차량의 차로변경 방법론)

  • DaYoon Park;SangHoon Bae;Trinh Tuan Hung;Boogi Park;Bokyung Jung
    • The Journal of The Korea Institute of Intelligent Transport Systems
    • /
    • v.22 no.1
    • /
    • pp.276-290
    • /
    • 2023
  • Several efforts are currently underway in Korea with the goal of commercializing autonomous vehicles. Hence, various studies are emerging on autonomous vehicles that drive safely and quickly according to operating guidelines. This study examines the path search of an autonomous vehicle from a microscopic viewpoint and seeks to demonstrate the required efficiency by learning the lane-change behavior of the vehicle through Deep Q-Learning. The SUMO simulator was used for this purpose. The scenario was set to start in a random lane at the starting point and to make a right turn at the destination after changing to the third lane. The analysis compared simulated lane changes with and without Deep Q-Learning. With Deep Q-Learning applied, the average traffic speed improved by about 40% compared to the case without it, the average waiting time was reduced by about 2 seconds, and the average queue length by about 2.3 vehicles.
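
As an illustration of the decision step only (not the paper's network, state features, or SUMO interface), a small Q-network scores the discrete lane-change actions and an epsilon-greedy rule picks one:

```python
# A minimal sketch; the action set, state features, and network sizes are assumptions.
import random
import torch
import torch.nn as nn

ACTIONS = ["keep_lane", "change_left", "change_right"]   # hypothetical action set

q_net = nn.Sequential(                                   # small Q-network: state -> Q-values
    nn.Linear(5, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, len(ACTIONS)),
)

def select_action(state, epsilon=0.1):
    """Epsilon-greedy action selection over the lane-change actions."""
    if random.random() < epsilon:
        return random.randrange(len(ACTIONS))
    with torch.no_grad():
        q_values = q_net(torch.tensor(state, dtype=torch.float32))
        return int(q_values.argmax())

# Hypothetical state: [ego speed, lead gap, left-lane gap, right-lane gap, lane index]
state = [12.0, 35.0, 20.0, 50.0, 1.0]
print(ACTIONS[select_action(state)])
```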

Traffic Control using Q-Learning Algorithm (Q 학습을 이용한 교통 제어 시스템)

  • Zheng, Zhang;Seung, Ji-Hoon;Kim, Tae-Yeong;Chong, Kil-To
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.12 no.11
    • /
    • pp.5135-5142
    • /
    • 2011
  • A flexible mechanism is proposed in this paper to improve the dynamic response performance of a traffic flow control system in an urban area. The roads, vehicles, and traffic control systems are all modeled as intelligent systems, and a wireless communication network is used as the medium of communication between the vehicles and the roads. The necessary sensor networks are installed in the roads and on the roadside, and reinforcement learning is adopted as the core algorithm of the mechanism. A traffic policy can be planned online according to the updated situations on the roads, based on all the information from the vehicles and the roads. This improves the flexibility of traffic flow and offers much more efficient use of the roads than a traditional traffic control system. The optimal intersection signals can be learned automatically online. An intersection control system is studied as an example of the mechanism using a Q-learning based algorithm, and simulation results show that the proposed mechanism can improve traffic efficiency and reduce the waiting time at the signal light by more than 30% under various conditions compared to a traditional signaling system.
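
A minimal tabular sketch of the Q-learning core described above; the phase set, state discretization, and reward are placeholders rather than the paper's design.

```python
# A minimal sketch (placeholders noted above) of tabular Q-learning for choosing
# the next signal phase at a single intersection.
import random
from collections import defaultdict

PHASES = ["NS_green", "EW_green"]                 # hypothetical signal phases (actions)
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1             # learning rate, discount, exploration

Q = defaultdict(lambda: [0.0] * len(PHASES))      # Q[state][action]

def discretize(queues):
    """Hypothetical state: bucketed queue lengths on the NS and EW approaches."""
    return tuple(min(q // 5, 3) for q in queues)

def choose_phase(state):
    if random.random() < EPSILON:
        return random.randrange(len(PHASES))
    return max(range(len(PHASES)), key=lambda a: Q[state][a])

def update(state, action, reward, next_state):
    """Standard Q-learning update; the reward could be the negative total waiting time."""
    best_next = max(Q[next_state])
    Q[state][action] += ALPHA * (reward + GAMMA * best_next - Q[state][action])

# One illustrative interaction step with made-up queue measurements.
s = discretize((12, 3)); a = choose_phase(s)
update(s, a, reward=-15.0, next_state=discretize((8, 5)))
```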

Fuzzy Q-learning using Weighted Eligibility (가중 기여도를 이용한 퍼지 Q-learning)

  • 정석일;이연정
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2000.11a
    • /
    • pp.163-167
    • /
    • 2000
  • Eligibility is used to solve the credit-assignment problem, one of the important problems in reinforcement learning. The conventional eligibilities, accumulating eligibility and replacing eligibility, make ineffective use of the rewards acquired during the learning process, because only the executed action in a visited state is learned with them. Thus, we propose a new eligibility, called weighted eligibility, with which not only the executed action but also neighboring actions in a visited state are learned. The fuzzy Q-learning algorithm using the proposed eligibility is applied to a cart-pole balancing problem and shows improved learning speed.
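
To make the contrast concrete, the sketch below compares accumulating and replacing traces with a weighted trace that also credits neighboring actions; the distance-based weighting kernel is my assumption, not the paper's fuzzy formulation.

```python
# A minimal sketch (weighting kernel is an assumption) contrasting accumulating,
# replacing, and weighted eligibility traces over a discrete action set.
import numpy as np

n_states, n_actions = 5, 3

def visit_accumulating(e, s, a):
    e[s, a] += 1.0                           # conventional: credit only the executed action

def visit_replacing(e, s, a):
    e[s, :] = 0.0
    e[s, a] = 1.0                            # conventional: reset the state, mark the action

def visit_weighted(e, s, a, spread=0.5):
    # Weighted eligibility: the executed action gets full credit and neighboring
    # actions get partial credit that decays with their distance from a.
    for b in range(n_actions):
        e[s, b] = max(e[s, b], spread ** abs(b - a))

def td_update(Q, e, s, a, r, s_next, alpha=0.1, gamma=0.9, lam=0.8):
    """TD(lambda)-style update: the TD error is applied to every eligible pair."""
    delta = r + gamma * Q[s_next].max() - Q[s, a]
    Q += alpha * delta * e                   # all eligible (state, action) pairs learn
    e *= gamma * lam                         # decay the traces after the update

Q = np.zeros((n_states, n_actions))
e = np.zeros((n_states, n_actions))
visit_weighted(e, 2, 1)
td_update(Q, e, 2, 1, r=1.0, s_next=3)
print(Q[2])                                  # neighboring actions also received credit
```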
