• Title/Abstract/Keywords: Continuous action space

Search results: 32 items (processing time: 0.03 s)

CONTINUOUS SHADOWING AND STABILITY FOR GROUP ACTIONS

  • Kim, Sang Jin
    • 대한수학회지
    • /
    • Vol. 56, No. 1
    • /
    • pp.53-65
    • /
    • 2019
  • Recently, Chung and Lee [2] introduced the notion of topological stability for a finitely generated group action, and proved a group action version of Walters's stability theorem. In this paper, we introduce the concepts of continuous shadowing and continuous inverse shadowing of a finitely generated group action on a compact metric space X with respect to various classes of admissible pseudo-orbits, and study the relationships between topological stability and the continuous shadowing and continuous inverse shadowing properties of group actions. Moreover, we introduce the notion of structural stability for a finitely generated group action, and we prove that an expansive action on a compact manifold is structurally stable if and only if it has the continuous inverse shadowing property.

Actor-Critic Algorithm with Transition Cost Estimation

  • Sergey, Denisov;Lee, Jee-Hyong
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • Vol. 16, No. 4
    • /
    • pp.270-275
    • /
    • 2016
  • We present an approach for accelerating the actor-critic algorithm for reinforcement learning with a continuous action space. The actor-critic algorithm has already proved its robustness to infinitely large action spaces in various high-dimensional environments. Despite that success, the main problem of the actor-critic algorithm remains the same: the speed of convergence to the optimal policy. In a high-dimensional state and action space, searching for the correct action in each state takes an enormously long time. Therefore, in this paper we suggest a search-accelerating function that improves the speed of convergence and reaches the optimal policy faster. In our method, we assume that actions may have their own preference distribution, independent of the state. Since at the beginning of learning the agent acts randomly in the environment, it is more efficient if actions are taken according to some heuristic function. We demonstrate that the heuristically accelerated actor-critic algorithm learns the optimal policy faster, using an Educational Process Mining dataset containing records of students' course learning processes and their grades.
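
  A state-independent action preference, as described above, can be sketched as a softmax over the actor's preferences biased by a heuristic prior. This is a minimal illustration only: the function name, the prior values, and the idea of decaying `beta` over training are assumptions of this sketch, not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def heuristic_softmax_action(policy_logits, action_prior, beta):
    # Blend the actor's action preferences with a state-independent prior.
    # beta > 0 weights the heuristic; decaying it toward 0 during training
    # (an assumption of this sketch) recovers the plain actor policy.
    logits = policy_logits + beta * np.log(action_prior + 1e-8)
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

# Usage: three actions, an indifferent actor, and a prior favoring action 2,
# so early exploration is steered toward the heuristically preferred action.
logits = np.zeros(3)
prior = np.array([0.1, 0.2, 0.7])
counts = np.bincount(
    [heuristic_softmax_action(logits, prior, beta=5.0) for _ in range(1000)],
    minlength=3,
)
```

  With a strong `beta`, the sampled actions concentrate on the action the prior prefers; as `beta` shrinks, sampling returns to the actor's own distribution.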

POINTWISE CONTINUOUS SHADOWING AND STABILITY IN GROUP ACTIONS

  • Dong, Meihua;Jung, Woochul;Lee, Keonhee
    • 충청수학회지
    • /
    • Vol. 32, No. 4
    • /
    • pp.509-524
    • /
    • 2019
  • Let Act(G, X) be the set of all continuous actions of a finitely generated group G on a compact metric space X. In this paper, we study the concepts of topologically stable points and continuous shadowable points of a group action T ∈ Act(G, X). We show that if T is expansive then the set of continuous shadowable points is contained in the set of topologically stable points.

Depth-Based Recognition System for Continuous Human Action Using Motion History Image and Histogram of Oriented Gradient with Spotter Model

  • 음혁민;이희진;윤창용
    • 한국지능시스템학회논문지
    • /
    • Vol. 26, No. 6
    • /
    • pp.471-476
    • /
    • 2016
  • This paper describes a system that recognizes continuous human actions from depth information using motion history images (MHI), histograms of oriented gradients (HOG), and a spotter model, and proposes the spotter model, which performs action spotting, to improve recognition performance in continuous action recognition. The system consists of preprocessing, human action and spotter modeling, and continuous human action recognition. In preprocessing, the Depth-MHI-HOG method is used for image segmentation and for extracting spatio-temporal template-based features; the extracted features are converted into sequences through the human action and spotter modeling stage. Using these sequences and hidden Markov models, a human action model for each defined action and the proposed spotter model are built. Continuous human action recognition then proceeds by action spotting, in which the spotter model separates meaningful actions from meaningless ones in a continuous action sequence, and by comparing the model probabilities for each meaningful action sequence. Experimental results verify that the proposed model effectively improves recognition performance in a continuous action recognition system.
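
  The recognition step, comparing per-action HMM likelihoods against a spotter model, can be sketched with a discrete forward algorithm. The two-state models and their toy parameters below are illustrative assumptions, not the trained Depth-MHI-HOG models of the paper.

```python
import numpy as np

def log_forward(obs, pi, A, B):
    # Log-likelihood of a discrete observation sequence under an HMM
    # (forward algorithm with per-step scaling for numerical stability).
    alpha = pi * B[:, obs[0]]
    ll = np.log(alpha.sum()); alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        ll += np.log(alpha.sum()); alpha /= alpha.sum()
    return ll

def recognize(obs, models):
    # Pick the model with the highest likelihood; a sequence that the
    # "spotter" model wins would be treated as meaningless motion.
    scores = {name: log_forward(obs, *m) for name, m in models.items()}
    return max(scores, key=scores.get)

# Toy models: "wave" strongly prefers symbol 0; the spotter is uniform,
# so it wins on sequences no action model explains well.
pi = np.array([0.5, 0.5])
A = np.array([[0.9, 0.1], [0.1, 0.9]])
models = {
    "wave":    (pi, A, np.array([[0.8, 0.2], [0.8, 0.2]])),
    "spotter": (pi, A, np.array([[0.5, 0.5], [0.5, 0.5]])),
}
```

  A run of symbol 0 is attributed to "wave", while a mixed sequence falls to the spotter, mirroring how the spotter model filters out meaningless segments before likelihood comparison.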

Region-based Q-learning for Autonomous Mobile Robot Navigation

  • 차종환;공성학;서일홍
    • 제어로봇시스템학회:학술대회논문집
    • /
    • Proceedings of the 15th Annual Conference, 2000
    • /
    • pp.174-174
    • /
    • 2000
  • Q-learning, based on discrete state and action spaces, is the most widely used reinforcement learning method. However, it requires a lot of memory and much time to learn all actions of each state when applied to real mobile robot navigation with continuous state and action spaces. Region-based Q-learning is a reinforcement learning method that estimates the action values of a real state by using a triangular-type action distribution model and the relationship with neighboring states that were defined and learned before. This paper proposes a new region-based Q-learning which uses a reward assigned only when the agent reaches the target, and escapes locally optimal paths by adjusting the random action rate. Applied to mobile robot navigation, it uses less memory, the robot moves smoothly, and the optimal solution is learned quickly. To show the validity of our method, computer simulations are illustrated.
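
  Two ingredients of the proposed method that transfer directly to code, a reward assigned only at the target and a shrinking random-action rate, can be sketched with plain tabular Q-learning on a toy corridor. The region-based triangular action-distribution model itself is omitted here; the grid size, learning rates, and epsilon schedule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny 1-D corridor: states 0..4, goal at state 4.
N_STATES, GOAL, ACTIONS = 5, 4, (-1, +1)
Q = np.zeros((N_STATES, len(ACTIONS)))
alpha, gamma, eps = 0.5, 0.9, 0.3

for episode in range(200):
    s = 0
    while s != GOAL:
        a = int(rng.integers(2)) if rng.random() < eps else int(Q[s].argmax())
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s2 == GOAL else 0.0       # reward only at the target
        Q[s, a] += alpha * (r + gamma * Q[s2].max() - Q[s, a])
        s = s2
    eps = max(0.05, eps * 0.99)              # shrink the random-action rate

# Read off the greedy path (bounded loop in case of an unlearned policy).
s, greedy_path = 0, [0]
for _ in range(2 * N_STATES):
    if s == GOAL:
        break
    s = min(max(s + ACTIONS[int(Q[s].argmax())], 0), N_STATES - 1)
    greedy_path.append(s)
```

  Because the reward is sparse, value information propagates backward from the goal over episodes; the decaying random-action rate keeps enough exploration early to escape the locally optimal "stay put" policy.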


Continuity of directional entropy for a class of $Z^2$-actions

  • Park, Kyewon-K.
    • 대한수학회지
    • /
    • Vol. 32, No. 3
    • /
    • pp.573-582
    • /
    • 1995
  • J. Milnor [Mi2] introduced the notion of directional entropy in his study of cellular automata. A cellular automaton map can be considered as a continuous map from the space $K^{Z^n}$ to itself which commutes with the translations of the lattice $Z^n$. Since the space $K^{Z^n}$ is compact, the map S is uniformly continuous. Hence S is a block map (a finite code) [He]. (S is said to have finite memory.) In the case n = 1, we have a shift map T on $K^Z$ and a block map S, and together they generate a $Z^2$-action.


A NOTE ON LIFTING TRANSFORMATION GROUPS

  • Cho, Sung Ki;Park, Choon Sung
    • Korean Journal of Mathematics
    • /
    • Vol. 5, No. 2
    • /
    • pp.169-176
    • /
    • 1997
  • The purpose of this note is to compare two known results related to the problem of lifting an action of a topological group G on a G-space X to a covering space of X.


Analysis of Features to Acquire Observation Information by Sex through Scanning Path Tracing - With the Object of Space in Cafe -

  • 최계영
    • 한국실내디자인학회논문집
    • /
    • Vol. 23, No. 5
    • /
    • pp.76-85
    • /
    • 2014
  • By analyzing the conscious and unconscious exploration contained in the information visitors acquire while viewing a space, we can identify which factors in the space they select as visual information before acting on it. This study, using a three-dimensional reproduction of a cafe visited for conversation, analyzed the process of acquiring spatial information by sex to identify features of the scanning path. The findings are as follows. First, the dominant scanning types were "Combination (50.5%) - Circulation (31.0%)" for males and "Horizontal (32.5%) - Combination (32.1%)" for females, showing a large difference by sex in the scanning paths arising while observing the space. Second, regarding continuous observation frequency by sex, both sexes showed increasing "Horizontal" and decreasing "Combination" scanning as the frequency of continuous observation increased, while "Circulation" scanning decreased for females but fluctuated for males. Third, the "Combination" scanning of males was strong at short observation times, with three consecutive observations defined as "attention concentration," and dispersed toward "Combination-Circulation" as the frequency of continuous observation increased; females began information acquisition with "Combination-Circulation" but showed strong "Horizontal" scanning during visual appreciation. These scanning features can be regarded as sex-specific characteristics of spatial information acquisition and are significant as a basic study enabling customized spatial design by sex.

Function Approximation for Reinforcement Learning using Fuzzy Clustering

  • 이영아;정경숙;정태충
    • 정보처리학회논문지B
    • /
    • Vol. 10B, No. 6
    • /
    • pp.587-592
    • /
    • 2003
  • Many real-world control problems suitable for reinforcement learning have continuous states or actions. For problems with continuous values, the state space becomes huge, so learning every state-action pair causes memory and time problems. To solve this, a function approximation method is needed that infers the value of a new state from similar learned states. In this paper, we propose Fuzzy Q-Map, a function approximation for 1-step Q-learning based on fuzzy clustering. Fuzzy Q-Map groups similar states using the membership degree of each cluster for the data, selects actions, and looks up Q-values. The center and Q-value of the winning fuzzy cluster are updated using the membership degree and the TD (temporal difference) error. Applied to the mountain-car problem, the proposed method showed fast convergence.
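
  The membership-weighted lookup and the winner-update rule described above can be sketched as follows. This is a simplified reading of the abstract: the fuzzy c-means style membership formula, the learning rate, and the exact winner-update rule are assumptions of this sketch.

```python
import numpy as np

def memberships(state, centers, m=2.0):
    # Fuzzy c-means style membership of `state` in each cluster center.
    d = np.linalg.norm(centers - state, axis=1) + 1e-8
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum()

def q_estimate(state, centers, q_table):
    # Q(s, .) as a membership-weighted mixture of the clusters' Q-values.
    return memberships(state, centers) @ q_table   # shape: (n_actions,)

def fuzzy_update(state, action, td_error, centers, q_table, lr=0.1):
    # Move the winning cluster's center toward the state and adjust its
    # Q-value by membership * TD error (assumed simplification).
    u = memberships(state, centers)
    w = int(u.argmax())                            # winner cluster
    centers[w] += lr * u[w] * (state - centers[w])
    q_table[w, action] += lr * u[w] * td_error
    return w

# Usage: a 2-D state (e.g. mountain-car position/velocity), 3 clusters,
# 3 actions; one positive TD error reinforces the winner's Q-value.
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
q_table = np.zeros((3, 3))
s = np.array([0.9, 0.1])
winner = fuzzy_update(s, action=2, td_error=1.0, centers=centers, q_table=q_table)
```

  Because every cluster contributes to `q_estimate` in proportion to its membership, nearby states share value information, which is what lets the approximation generalize across a continuous state space.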

Reinforcement Learning with Clustering for Function Approximation and Rule Extraction

  • 이영아;홍석미;정태충
    • 한국정보과학회논문지:소프트웨어및응용
    • /
    • Vol. 30, No. 11
    • /
    • pp.1054-1061
    • /
    • 2003
  • Q-Learning, a representative reinforcement learning algorithm, obtains an optimal policy by repeatedly experiencing all state-action pairs until their evaluation values converge. When the state space has many features, or the features are continuous, the state space grows exponentially; repeatedly experiencing every state and storing the Q-value of every state-action pair then becomes difficult in terms of time and memory. In this paper we introduce Q-Map, a new function approximation method that clusters similar states while learning online and repeatedly updates the clusters to adapt to new experience, yielding a categorized optimal policy. States that require fine control despite the clustering are supplemented by extracting them as rules. Experiments on a maze environment and the mountain-car problem with the proposed Q-Map yielded categorized knowledge that could easily be converted into rules, an explicit form of knowledge.
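
  The online clustering and rule extraction described above can be sketched with a small class: states farther than a threshold from every existing cluster spawn a new one, Q-values live per cluster, and the learned policy is read out as IF-THEN rules. The distance threshold, learning rates, and rule format are illustrative assumptions, not the paper's exact Q-Map.

```python
import numpy as np

class QMap:
    """Online clustering function approximation for Q-learning (sketch)."""

    def __init__(self, n_actions, radius=0.5, alpha=0.2, gamma=0.9):
        self.centers, self.Q = [], []
        self.n_actions, self.radius = n_actions, radius
        self.alpha, self.gamma = alpha, gamma

    def _nearest(self, s):
        if not self.centers:
            return None, np.inf
        d = [np.linalg.norm(c - s) for c in self.centers]
        i = int(np.argmin(d))
        return i, d[i]

    def cluster_of(self, s):
        # Return the matching cluster, creating one if the state is novel.
        i, d = self._nearest(s)
        if i is None or d > self.radius:
            self.centers.append(np.array(s, dtype=float))
            self.Q.append(np.zeros(self.n_actions))
            return len(self.centers) - 1
        return i

    def update(self, s, a, r, s2):
        # Standard 1-step Q-learning backup on cluster-level Q-values.
        i, j = self.cluster_of(s), self.cluster_of(s2)
        td = r + self.gamma * self.Q[j].max() - self.Q[i][a]
        self.Q[i][a] += self.alpha * td

    def rules(self):
        # Extract the learned policy as human-readable IF-THEN rules.
        return [f"IF state near {c.round(2).tolist()} THEN action {int(q.argmax())}"
                for c, q in zip(self.centers, self.Q)]

# Usage: two distant states create two clusters; one rewarded transition
# makes action 1 greedy in the first cluster's extracted rule.
qm = QMap(n_actions=2)
qm.update(np.array([0.0, 0.0]), 1, 1.0, np.array([1.0, 1.0]))
```

  The `rules()` readout is the "explicit knowledge" step: each cluster center becomes the condition of a rule and its greedy action the conclusion, so the compressed policy stays inspectable.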