• Title/Summary/Keyword: demonstration-based learning (데모-기반 학습)


Learning Relational Instance-Based Policies from User Demonstrations (사용자 데모를 이용한 관계적 개체 기반 정책 학습)

  • Park, Chan-Young;Kim, Hyun-Sik;Kim, In-Cheol
    • Journal of KIISE:Software and Applications / v.37 no.5 / pp.363-369 / 2010
  • Demonstration-based learning has the advantage that a user can easily teach a robot new task knowledge simply by demonstrating how to perform the task. However, many previous demonstration-based learning techniques used an attribute-value vector model to represent their state spaces and policies. Because of the limitations of this model, they suffered from both low learning efficiency and low reusability of the learned policy. In this paper, we present a new demonstration-based learning method that adopts the relational model in place of the attribute-value model. By applying relational instance-based learning to training examples extracted from records of user demonstrations, the method derives a relational instance-based policy that can easily be reused for other similar tasks in the same domain. A relational policy maps a context, represented as a (state, goal) pair, to a corresponding action to be executed. We give a detailed explanation of our demonstration-based relational policy learning method and then analyze its effectiveness through experiments using a robot simulator.
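The mapping from a (state, goal) context to an action can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the distance measure or the instance format, so the ground-atom tuples, the Jaccard distance, and the names `RelationalInstancePolicy`, `add_demonstration`, and `act` are all assumptions chosen for illustration.

```python
from typing import FrozenSet, Iterable, List, Tuple

Atom = Tuple[str, ...]        # a ground relational fact, e.g. ("on", "a", "b")
Facts = FrozenSet[Atom]


def jaccard_distance(x: Facts, y: Facts) -> float:
    """Return 1 - |x ∩ y| / |x ∪ y| (0.0 when both sets are empty)."""
    union = x | y
    if not union:
        return 0.0
    return 1.0 - len(x & y) / len(union)


class RelationalInstancePolicy:
    """Nearest-neighbour policy over stored (state, goal, action) instances.

    Assumption: similarity is the sum of Jaccard distances on the state
    and goal fact sets; the paper's actual metric is not stated here.
    """

    def __init__(self) -> None:
        self.instances: List[Tuple[Facts, Facts, Atom]] = []

    def add_demonstration(self, state: Iterable[Atom],
                          goal: Iterable[Atom], action: Atom) -> None:
        # Store one training example extracted from a demonstration record.
        self.instances.append((frozenset(state), frozenset(goal), action))

    def act(self, state: Iterable[Atom], goal: Iterable[Atom]) -> Atom:
        # Return the action of the stored instance closest to the context.
        if not self.instances:
            raise ValueError("no demonstrations stored")
        s, g = frozenset(state), frozenset(goal)
        nearest = min(
            self.instances,
            key=lambda inst: jaccard_distance(s, inst[0])
            + jaccard_distance(g, inst[1]),
        )
        return nearest[2]
```

Because the instances are sets of relational facts rather than fixed-length vectors, a query state containing extra objects can still be matched against stored demonstrations, which is one plausible reading of the reusability claim.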

Combining Imitation Learning and Reinforcement Learning for Visual-Language Navigation Agents (시각-언어 이동 에이전트를 위한 모방 학습과 강화 학습의 결합)

  • Oh, Suntaek;Kim, Incheol
    • Proceedings of the Korea Information Processing Society Conference / 2020.05a / pp.559-562 / 2020
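  • Vision-and-language navigation is a complex intelligence problem that requires both visual and language understanding. In this paper, we propose a new learning model for vision-and-language navigation agents. The model adopts hybrid learning that combines imitation learning based on demonstration data with reinforcement learning based on action rewards. The model can therefore mitigate, in a complementary way, both the tendency of imitation learning to be biased toward the demonstration data and the relatively low data efficiency of reinforcement learning. In addition, the proposed model includes loss normalization to account for the learning imbalance that can arise between the two kinds of learning. The proposed model also identifies a problem with the goal-based reward functions used in existing studies and uses a new optimal-path-based reward function designed to solve it. Through various experiments using the Matterport3D simulation environment and the R2R benchmark dataset, we demonstrate the high performance of the proposed model.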
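A combined imitation and reinforcement objective with loss normalization might look roughly like the sketch below. The abstract does not specify the loss terms or the normalization scheme, so everything here is an assumption: a negative log-likelihood imitation term, a REINFORCE-style policy-gradient term, and normalization by caller-supplied positive scales (for example, running averages of each loss magnitude). The function names are hypothetical.

```python
import math
from typing import Sequence


def imitation_loss(action_probs: Sequence[Sequence[float]],
                   expert_actions: Sequence[int]) -> float:
    """Mean negative log-likelihood of the demonstrated actions."""
    nll = [-math.log(p[a]) for p, a in zip(action_probs, expert_actions)]
    return sum(nll) / len(nll)


def reinforcement_loss(action_probs: Sequence[Sequence[float]],
                       sampled_actions: Sequence[int],
                       returns: Sequence[float]) -> float:
    """REINFORCE-style surrogate: mean of -log pi(a|s) * return."""
    terms = [-math.log(p[a]) * g
             for p, a, g in zip(action_probs, sampled_actions, returns)]
    return sum(terms) / len(terms)


def hybrid_loss(il_loss: float, rl_loss: float,
                il_scale: float, rl_scale: float,
                rl_weight: float = 0.5) -> float:
    """Weighted sum of the two losses after dividing each by its scale.

    Assumption: the scales are positive magnitudes supplied by the caller,
    so that neither loss dominates the update purely because of its range.
    """
    if il_scale <= 0 or rl_scale <= 0:
        raise ValueError("scales must be positive")
    return (1.0 - rl_weight) * il_loss / il_scale + rl_weight * rl_loss / rl_scale
```

Dividing each term by its own scale before weighting is one simple way to keep the two objectives on a comparable footing; the published model may balance them differently.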

Combining Imitation Learning with Reinforcement Learning for Efficient Manipulation Policy Acquisition (물체 조작 정책의 효율적 습득을 위한 모방 학습과 강화 학습의 결합)

  • Jung, EunJin;Lee, SangJoon;Kim, Incheol
    • Proceedings of the Korea Information Processing Society Conference / 2018.10a / pp.759-762 / 2018
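  • As intelligent service robots have recently begun to enter everyday human life, machine learning techniques that let a robot acquire for itself the knowledge needed to manipulate various objects effectively have attracted considerable attention. Traditionally, reinforcement learning and deep reinforcement learning techniques have been the main approaches applied to robot behavior learning, but most of them have several limitations in learning an optimal action policy in multi-dimensional continuous state and action spaces, as in object manipulation tasks. In this paper, we therefore propose an integrated framework of imitation learning and reinforcement learning that uses expert demonstration data to learn object manipulation behaviors more efficiently. To improve learning efficiency, this integrated framework builds on the existing GAIL learning scheme and newly introduces a PPO-based reinforcement learning stage, an extended reward function, and a demonstration selection strategy based on state similarity. Through various performance comparison experiments, we confirmed the superiority of PGAIL, the integrated learning framework proposed in this paper.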
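Two standard building blocks named in this abstract, a GAIL-style discriminator reward and the PPO clipped surrogate, can be sketched as follows. This shows only the generic textbook forms, not PGAIL itself: the extended reward function and the state-similarity demonstration selection are not described in the abstract and are not reproduced. The convention that the discriminator output is the probability that a state-action pair came from the expert is an assumption, as are the function names.

```python
import math


def gail_reward(d_prob: float, eps: float = 1e-8) -> float:
    """Surrogate reward -log(1 - D(s, a)).

    Assumption: D(s, a) in (0, 1) is the discriminator's estimate that the
    pair came from the expert, so expert-like behaviour earns more reward.
    """
    return -math.log(max(1.0 - d_prob, eps))


def ppo_clipped_objective(ratio: float, advantage: float,
                          clip_eps: float = 0.2) -> float:
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1-e, 1+e) * A)."""
    clipped_ratio = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
    return min(ratio * advantage, clipped_ratio * advantage)
```

In a GAIL-plus-PPO loop, advantages would be estimated from the discriminator-derived rewards and the policy updated by maximizing the clipped objective averaged over a batch.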

Hybrid Learning for Vision-and-Language Navigation Agents (시각-언어 이동 에이전트를 위한 복합 학습)

  • Oh, Suntaek;Kim, Incheol
    • KIPS Transactions on Software and Data Engineering / v.9 no.9 / pp.281-290 / 2020
  • The Vision-and-Language Navigation (VLN) task is a complex intelligence problem that requires both visual and language comprehension skills. In this paper, we propose a new learning model for vision-and-language navigation agents. The model adopts a hybrid learning approach that combines imitation learning based on demonstration data with reinforcement learning based on action rewards. The model can therefore address both the tendency of imitation learning to be biased toward the demonstration data and the relatively low data efficiency of reinforcement learning. In addition, the proposed model uses a novel path-based reward function designed to solve the problem of existing goal-based reward functions. We demonstrate the high performance of the proposed model through various experiments using the Matterport3D simulation environment and the R2R benchmark dataset.

A Study on Development and Use of a Demonstration-Based Architectural Design Class Operation Model for Improving Architectural Thinking Abilities of Under-Motivated Learners (건축설계 학습부진자들의 건축적 사고 개선을 위한 데모 기반 설계수업 운영모형 개발 및 활용 사례연구)

  • Lee, Do-Young;Chung, Hyun-Mi
    • Journal of the Architectural Institute of Korea Planning & Design / v.36 no.3 / pp.49-58 / 2020
  • Based on Merrill's instructional theory, this study sought to develop a demonstration-based architectural design class operation model for third-year undergraduate students taking a spring-semester design studio class. The model was designed and used specifically to improve the architectural thinking abilities of under-motivated learners. The learning effects of the model were examined using preliminary data obtained over three consecutive years, 2017 through 2019. A total of 52 students participated in the class and were observed by the instructor. Once developed, the model was continually updated and improved based on the results of each class operation. Five types of demonstration (demo) were used in the model. First, direct contact between the instructor and under-motivated learners turned out to be the most preferred demo (demo 4), while watching and listening to the in-class demo between the instructor and motivated learners (demo 3) ranked second. Under-motivated learners' trust in the instructor as a professional should be highly valued for improving their architectural thinking abilities. Second, direct help from motivated peers to under-motivated ones ranked third. The social attitudes of under-motivated learners toward accepting help from motivated peers determined the appropriateness of this particular demo. Third, a set of guidelines for operating the model in undergraduate design studio classes was developed and suggested.

A Study between Online Entrepreneurship Education and Entrepreneurship: Based on PBL(Problem-Based Learning) and Flipped Learning (기업가정신 온라인교육의 효과성 검증: 플립러닝 및 PBL 기반 기업가정신교육 적용 사례)

  • Nam, Jung Min
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.12 no.2 / pp.31-40 / 2017
  • This study validates the effectiveness of online entrepreneurship education based on PBL (Problem-Based Learning) and flipped learning. It shows that online entrepreneurship education based on PBL and flipped learning has a positive effect on personal entrepreneurship, the will to become an entrepreneur, and problem-solving skills. First, the results show that entrepreneurship education based on PBL and flipped learning improves entrepreneurship more than the previous learning method. Second, online education based on PBL and flipped learning positively affects the will to become an entrepreneur: the experimental group, which experienced problem-solving activities and flipped learning, showed a stronger will to become entrepreneurs than the control group, which followed the previous learning method. Lastly, entrepreneurship education based on PBL and flipped learning also has a positive effect on personal problem-solving skills. These results show that online entrepreneurship education based on PBL and flipped learning has significant positive impacts on entrepreneurship, the will to become an entrepreneur, and problem-solving skills.


Behavior Pattern Analysis System based on Temporal Histogram of Moving Object Coordinates. (이동 객체 좌표의 시간적 히스토그램 기반 행동패턴분석시스템)

  • Lee, Jae-kwang;Lee, Kyu-won
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.05a / pp.571-575 / 2015
  • This paper proposes a temporal histogram-based behavior pattern analysis algorithm to analyze the movement features of moving objects in images input in real time. To track and analyze moving objects, background learning must first be performed to separate the moving objects from the background. After background learning, moving objects are extracted, each object is identified by its center of gravity, and tracking is performed using coordinate correlation. The start frame, end frame, coordinate information, and size information of each tracked object are stored and managed in a linked list. The temporal histogram defines a movement feature pattern using the x and y coordinates along the time axis, and the coordinates of objects are compared to understand their movement features and behavior patterns. In a demo experiment, the behavior pattern analysis system based on the temporal histogram achieved a tracking rate above 95% while sustaining a processing speed of 45 to 50 fps.
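One way to read "temporal histogram of x, y coordinates along the time axis" is sketched below. It is an illustrative assumption rather than the paper's definition: the track's frame range is split into equal time bins, the mean x and y are taken in each bin, and two tracks are compared by the summed absolute difference of their bin means. The names `temporal_histogram` and `histogram_distance` are hypothetical.

```python
from typing import List, Sequence, Tuple

Track = Sequence[Tuple[int, float, float]]   # (frame index, x, y) per observation


def temporal_histogram(track: Track, num_bins: int) -> List[Tuple[float, float]]:
    """Mean (x, y) of a track in each of num_bins equal-width time bins.

    An empty bin repeats the previous bin's value; the first bin is never
    empty because the earliest frame always falls into it.
    """
    if not track or num_bins < 1:
        raise ValueError("need a non-empty track and at least one bin")
    start = min(frame for frame, _, _ in track)
    end = max(frame for frame, _, _ in track)
    span = end - start + 1
    sums = [[0.0, 0.0, 0] for _ in range(num_bins)]
    for frame, x, y in track:
        b = min((frame - start) * num_bins // span, num_bins - 1)
        sums[b][0] += x
        sums[b][1] += y
        sums[b][2] += 1
    hist: List[Tuple[float, float]] = []
    for sx, sy, n in sums:
        hist.append((sx / n, sy / n) if n else hist[-1])
    return hist


def histogram_distance(a: Sequence[Tuple[float, float]],
                       b: Sequence[Tuple[float, float]]) -> float:
    """Sum of |dx| + |dy| over corresponding bins of two histograms."""
    if len(a) != len(b):
        raise ValueError("histograms must have the same number of bins")
    return sum(abs(ax - bx) + abs(ay - by) for (ax, ay), (bx, by) in zip(a, b))
```

Because each track is reduced to a fixed number of bins, tracks of different durations can be compared directly, and a small distance suggests a similar movement pattern.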


Structure Design of Surveillance Location-Based UAV Motor Primitives (감시 위치 기반의 UAV 모터프리미티브의 구조 설계)

  • Kwak, Jeonghoon;Sung, Yunsick
    • KIPS Transactions on Software and Data Engineering / v.5 no.4 / pp.181-186 / 2016
  • Recently, research on surveillance systems has drawn attention because Unmanned Aerial Vehicles (UAVs) can monitor wide areas. When a wide area is monitored, having pilots control UAVs repeatedly raises the cost of operating them. If the monitoring path can be defined in advance, this cost problem can be solved by controlling UAVs autonomously along that path. The traditional approach generates multiple motor primitives based on the GPS locations of flown paths. However, because the generated motor primitives do not take into account the points the UAVs are to monitor, surveillance is not performed properly. This paper proposes a motor primitive structure that lets surveillance UAVs fly autonomously. Motor primitives are generated automatically by setting surveillance points that accurately denote the surveillance targets.