• Title/Summary/Keyword: Gradient Ascent

Self-Organized Reinforcement Learning Using Fuzzy Inference for Stochastic Gradient Ascent Method

  • Wong, K.-K.;Katuki, Akio
    • Institute of Control, Robotics and Systems (ICROS): Conference Proceedings
    • /
    • 2001.10a
    • /
    • pp.96.3-96
    • /
    • 2001
  • In this paper, a self-organizing stochastic gradient ascent method that uses fuzzy inference is proposed. Fuzzy rules and fuzzy sets are added autonomously, as the occasion demands, according to the observed information, and two rules (or two fuzzy sets) that become similar to each other as learning progresses are unified. This unification reduces the number of parameters and the learning time. By using fuzzy inference and forming rules with an appropriate division of the state space, the proposed method makes it possible to construct a robust reinforcement learning system.
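
As a rough illustration of the unification step described in this abstract, the sketch below merges Gaussian fuzzy sets whose centers and widths have drifted close together, which is what shrinks the parameter count during learning. The similarity measure, the threshold, and the pairwise-averaging rule are assumptions made for the demo, not the paper's definitions.

```python
import numpy as np

def similarity(c1, s1, c2, s2):
    """Crude similarity between two Gaussian membership functions."""
    return float(np.exp(-((c1 - c2) ** 2) / (s1 ** 2 + s2 ** 2)))

def unify_fuzzy_sets(sets, threshold=0.9):
    """Merge (center, sigma) pairs whose similarity exceeds the threshold."""
    merged, used = [], set()
    for i, (c1, s1) in enumerate(sets):
        if i in used:
            continue
        for j in range(i + 1, len(sets)):
            if j in used:
                continue
            c2, s2 = sets[j]
            if similarity(c1, s1, c2, s2) > threshold:
                c1, s1 = (c1 + c2) / 2, (s1 + s2) / 2  # replace the pair by its average
                used.add(j)
        merged.append((c1, s1))
    return merged

# The two nearly identical sets are merged; the distant one survives unchanged.
print(unify_fuzzy_sets([(0.0, 1.0), (0.1, 1.0), (3.0, 0.5)]))
```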


A New Constant Modulus Algorithm based on Maximum Probability Criterion

  • Kim, Nam-Yong
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.34 no.2A
    • /
    • pp.85-90
    • /
    • 2009
  • In this paper, as an alternative to the MSE-based constant modulus algorithm, maximization of the probability that the equalizer output power equals the constant modulus of the transmitted symbols is introduced. The proposed algorithm, which applies the gradient ascent method to this maximum probability criterion, shows superior convergence and steady-state MSE performance, and its error samples exhibit more concentrated density functions in blind equalization environments. Simulation results indicate that the proposed training has a potential advantage over MSE training for the constant modulus approach to blind equalization.
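
A minimal sketch of this kind of criterion, assuming a Gaussian kernel placed on the modulus error and a noiseless BPSK toy channel (the paper's exact cost, kernel width, and step size are not given here): the tap vector ascends the gradient of the kernel value at zero error, so the error density concentrates at the constant modulus.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L, sigma, lr = 2000, 5, 0.5, 0.01
R2 = 1.0                                   # constant modulus of BPSK symbols
s = rng.choice([-1.0, 1.0], size=N)        # transmitted symbols
h = np.array([1.0, 0.4, -0.2])             # assumed channel impulse response
x = np.convolve(s, h)[:N]                  # received signal (noise omitted)
w = np.zeros(L); w[0] = 1.0                # equalizer taps, spike initialization

for n in range(L, N):
    u = x[n - L:n][::-1]                   # regressor (most recent sample first)
    y = w @ u                              # equalizer output
    e = y * y - R2                         # modulus error
    k = np.exp(-e * e / (2 * sigma ** 2))  # Gaussian kernel evaluated at e
    # Ascend dk/dw = k * (-e / sigma^2) * 2*y*u so that the error density
    # piles up at zero, i.e., output power concentrates at R2.
    w += lr * (-k * e / sigma ** 2) * 2 * y * u

print("equalizer taps:", np.round(w, 3))
```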

On Robust Principal Component Analysis using Neural Networks (신경망을 이용한 로버스트 주성분 분석에 관한 연구)

  • Kim, Sang-Min;Oh, Kwang-Sik;Park, Hee-Joo
    • Journal of the Korean Data and Information Science Society
    • /
    • v.7 no.1
    • /
    • pp.113-118
    • /
    • 1996
  • Principal component analysis (PCA) is an essential technique for data compression and feature extraction, and has been widely used in statistical data analysis, communication theory, pattern recognition, and image processing. Oja (1992) found that a linear neuron with a constrained Hebbian learning rule can extract the principal component by using the stochastic gradient ascent method. In practice, real data often contain outliers, which significantly deteriorate the performance of PCA algorithms. In order to make PCA robust, Xu & Yuille (1995) applied statistical physics to the problem of robust principal component analysis (RPCA). Devlin et al. (1981) obtained principal components by using techniques such as M-estimation. The purpose of this paper is to investigate, from the statistical point of view, how Xu & Yuille's (1995) RPCA works under the same simulation conditions as in Devlin et al. (1981).
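
Oja's constrained Hebbian rule cited in this abstract has a standard one-neuron form; the sketch below runs it on synthetic 2-D data whose principal axis is known, so the result can be checked by eye. The data model and learning rate are assumptions made for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 2-D data whose principal axis is (1, 1)/sqrt(2).
X = rng.normal(size=(5000, 2)) * np.array([2.0, 0.5])   # axis variances 4 and 0.25
R = np.array([[1.0, 1.0], [-1.0, 1.0]]) / np.sqrt(2.0)  # 45-degree rotation
X = X @ R

w = rng.normal(size=2)
lr = 0.01
for x in X:
    y = w @ x
    w += lr * y * (x - y * w)   # Hebbian term y*x; the decay y^2*w keeps ||w|| near 1

print("learned direction:", np.round(w / np.linalg.norm(w), 3))
print("true direction:   ", np.round(np.array([1.0, 1.0]) / np.sqrt(2.0), 3))
```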


Antenna Placement Designs for Distributed Antenna Systems with Multiple-Antenna Ports (다중 안테나 포트를 장착한 분산 안테나 시스템에서의 안테나 설계 방법)

  • Lee, Changhee;Park, Eunsung;Lee, Inkyu
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.37A no.10
    • /
    • pp.865-875
    • /
    • 2012
  • In this paper, we optimize antenna locations for a distributed antenna system (DAS) whose distributed antenna (DA) ports are equipped with multiple antennas, under a per-DA-port power constraint. Maximum ratio transmission and scaled zero-forcing beamforming are employed for single-user and multi-user DAS, respectively. Instead of maximizing the cell-average ergodic sum rate, we focus on a lower bound of the expected signal-to-noise ratio (SNR) for the single-cell scenario and of the expected signal-to-leakage ratio (SLR) for the two-cell scenario to determine antenna locations. For the single-cell case, optimization of the SNR criterion yields a closed-form solution, in contrast to conventional iterative algorithms. A gradient ascent algorithm is then proposed to solve the SLR criterion for the two-cell scenario. Simulation results show that a DAS with antenna locations obtained from the proposed algorithms achieves capacity gains over traditional centralized antenna systems.
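
To make the placement-by-gradient-ascent idea concrete, here is a deliberately simplified sketch that ascends a numerical gradient of an average log-SNR surrogate over sampled user positions. The path-loss exponent, the cell model, the objective, and the step size are all assumptions for the demo, not the paper's SLR bound.

```python
import numpy as np

rng = np.random.default_rng(0)
users = rng.uniform(-1, 1, size=(500, 2))   # sampled user positions in the cell
alpha = 3.7                                 # assumed path-loss exponent

def objective(antennas):
    # Nearest DA port serves each user; crude expected log-SNR surrogate
    # -alpha * E[log d_min], which grows as antennas approach the users.
    d = np.linalg.norm(users[:, None, :] - antennas[None, :, :], axis=2)
    return -alpha * np.mean(np.log(d.min(axis=1)))

def num_grad(antennas, eps=1e-4):
    g = np.zeros_like(antennas)
    for idx in np.ndindex(*antennas.shape):  # central finite differences
        a = antennas.copy(); a[idx] += eps
        b = antennas.copy(); b[idx] -= eps
        g[idx] = (objective(a) - objective(b)) / (2 * eps)
    return g

ants = rng.uniform(-0.1, 0.1, size=(3, 2))  # three DA ports, poor initial spots
for _ in range(300):
    ants += 1e-3 * num_grad(ants)           # gradient ascent on the placement
print("optimized DA port locations:\n", np.round(ants, 3))
```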

Localization and a Distributed Local Optimal Solution Algorithm for a Class of Multi-Agent Markov Decision Processes

  • Chang, Hyeong-Soo
    • International Journal of Control, Automation, and Systems
    • /
    • v.1 no.3
    • /
    • pp.358-367
    • /
    • 2003
  • We consider discrete-time factorial Markov decision processes (MDPs) in a multiple-decision-maker environment, for the infinite-horizon average reward criterion, with a general joint reward structure but a factorial joint state transition structure. We introduce a "localization" concept whereby the global MDP is localized for each agent such that each agent needs to consider only a local MDP defined on its own state and action spaces. Based on this, we present a gradient-ascent-like iterative distributed algorithm that converges to a local optimal solution of the global MDP. The solution is an autonomous joint policy in that each agent's decision is based only on its local state.
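
The following toy sketch conveys the flavor of such localization: two agents share a global reward, but each keeps a softmax policy over only its own state and action, and each performs a REINFORCE-style local gradient ascent step. The environment, parameterization, and update rule are illustrative assumptions, not the paper's algorithm or its convergence construction.

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 2, 2
theta = [np.zeros((n_states, n_actions)) for _ in range(2)]  # one local policy per agent

def act(th, s):
    p = np.exp(th[s] - th[s].max()); p /= p.sum()            # softmax over local actions
    return int(rng.choice(n_actions, p=p)), p

s = [0, 0]
baseline, lr = 0.0, 0.05
for t in range(20000):
    pairs = [act(theta[i], s[i]) for i in range(2)]
    acts = [a for a, _ in pairs]
    # The joint reward couples the agents, but transitions stay factored:
    # each agent's next state depends only on its own state and action.
    r = 1.0 if acts[0] == acts[1] == s[0] else 0.0
    baseline += 0.01 * (r - baseline)
    for i in range(2):
        a, p = pairs[i]
        g = -p.copy(); g[a] += 1.0                  # d log pi / d theta[s]
        theta[i][s[i]] += lr * (r - baseline) * g   # purely local ascent step
    s = acts                                        # factored transition

print("agent 0 logits per state:\n", np.round(theta[0], 2))
```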

Performance Comparison of Crawling Robots Trained by Reinforcement Learning Methods (강화학습에 의해 학습된 기는 로봇의 성능 비교)

  • Park, Ju-Yeong;Jeong, Gyu-Baek;Mun, Yeong-Jun
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2007.04a
    • /
    • pp.33-36
    • /
    • 2007
  • Recently, interest in reinforcement learning has grown sharply in the artificial intelligence field, both domestically and abroad. Recent trends in reinforcement learning show development in three main directions: value function-based methods, policy search methods, and actor-critic methods. In this paper, we apply the RLS-NAC (recursive least-squares based natural actor-critic) algorithm, a variant of the NAC (natural actor-critic) techniques in the third category, with various trace-decay coefficients to Kimura's crawling robot controlled by real-valued control inputs, and compare its performance with that obtained by training with the conventional SGA (stochastic gradient ascent) algorithm.
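
For reference, the SGA baseline mentioned here is usually written as a policy gradient update with a decayed eligibility trace; the sketch below shows that skeleton on a one-dimensional toy task. The task, reward, and all constants are assumptions for the demo, not Kimura's robot.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(2)                # policy parameters: [state weight, bias]
trace = np.zeros(2)                # eligibility trace
baseline = 0.0
gamma, lr, sigma = 0.9, 0.01, 0.3  # trace-decay factor, step size, exploration

s = 0.0
for t in range(5000):
    mu = theta[0] * s + theta[1]                       # policy mean
    a = mu + sigma * rng.normal()                      # real-valued action
    elig = (a - mu) / sigma ** 2 * np.array([s, 1.0])  # d log pi / d theta
    s = float(np.clip(s + 0.1 * a, -2.0, 2.0))         # toy dynamics
    r = -(s - 1.0) ** 2                                # reward: hold s near 1
    baseline += 0.01 * (r - baseline)
    trace = gamma * trace + elig                       # decayed eligibility
    theta += lr * (r - baseline) * trace               # SGA update

print("learned policy parameters:", np.round(theta, 3))
```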


Robot Control via SGA-based Reinforcement Learning Algorithms (SGA 기반 강화학습 알고리즘을 이용한 로봇 제어)

  • 박주영;김종호;신호근
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2004.10a
    • /
    • pp.63-66
    • /
    • 2004
  • The SGA (stochastic gradient ascent) algorithm is one of the most important tools in the area of reinforcement learning, and has been applied to a wide range of practical problems. In particular, this learning method was successfully applied by Kimura et al. [1] to the control of a simple creeping robot that has a finite number of control input choices. In this paper, we consider the application of the SGA algorithm to Kimura's robot control problem for the case where the control input is not confined to a finite set but can be chosen from an infinite subset of the real numbers. We also developed a MATLAB-based robot animation program, which vividly demonstrated the effectiveness of the training algorithms.
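
The continuous-action case described here is commonly handled with a Gaussian policy whose mean and spread are both adapted by SGA; the sketch below uses a one-step toy reward to show the score-function updates. The task and all constants are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
mu, log_std, baseline, lr = 0.0, 0.0, 0.0, 0.02
for t in range(5000):
    std = float(np.exp(log_std))
    a = mu + std * rng.normal()      # control drawn from the whole real line
    r = -(a - 0.5) ** 2              # toy reward peaked at a = 0.5
    baseline += 0.05 * (r - baseline)
    adv = r - baseline
    g_mu = (a - mu) / std ** 2                  # score function for the mean
    g_log = (a - mu) ** 2 / std ** 2 - 1.0      # score function for log std
    mu += lr * adv * g_mu                       # stochastic gradient ascent
    log_std += lr * adv * g_log

print("learned mean and std:", round(mu, 3), round(float(np.exp(log_std)), 3))
```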


Suppressing Artefacts in the ECG by Independent Component Analysis (독립성분 분석기법에 의한 심전도 신호의 왜곡 보정)

  • Kim, Jeong-Hwan;Kim, Kyeong-Seop;Kim, Hyun-Tae;Lee, Jeong-Whan
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.62 no.6
    • /
    • pp.825-832
    • /
    • 2013
  • In this study, Independent Component Analysis (ICA) algorithms are suggested to extract the original ECG component from a mixed signal contaminated with unwanted frequency components, especially 60 Hz power-line disturbances. With this aim, we implement a novel method to suppress the baseline-wandering disturbances and power-line artefacts contained in patch-electrode sensory ECG data by separating the unmixed signal, finding the optimal weight W based on the kurtosis value. By applying brute-force and gradient ascent search algorithms to find W, we conclude that the unwanted frequency components, especially in ambulatory ECG data, can be eliminated by Independent Component Analysis.
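
A minimal sketch of a kurtosis-driven search for W, assuming whitened data and a synthetic two-source mixture standing in for ECG plus a 60 Hz-like artefact (the study's actual pipeline and preprocessing are not reproduced here): the weight vector ascends the gradient of |kurtosis| and is renormalized after each step.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
s1 = np.sign(np.sin(2 * np.pi * 60 * np.arange(n) / 1000.0))  # 60 Hz-like artefact
s2 = rng.laplace(size=n)                                      # super-Gaussian stand-in for ECG
X = np.array([[0.8, 0.6], [0.5, -0.7]]) @ np.vstack([s1, s2]) # observed mixtures

# Whitening: decorrelate and normalize the mixtures.
X = X - X.mean(axis=1, keepdims=True)
d, E = np.linalg.eigh(np.cov(X))
Z = np.diag(d ** -0.5) @ E.T @ X

w = rng.normal(size=2); w /= np.linalg.norm(w)
lr = 0.1
for _ in range(200):
    y = w @ Z
    kurt = np.mean(y ** 4) - 3.0                  # excess kurtosis of the output
    grad = 4 * np.sign(kurt) * (Z @ y ** 3) / n   # ascend |kurt| on whitened data
    w += lr * grad
    w /= np.linalg.norm(w)                        # keep unit norm

print("unmixing weight:", np.round(w, 3))
```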

GAN-based camouflage pattern generation parameter optimization system for improving assimilation rate with environment (야생 환경과의 동화율 개선을 위한 GAN 알고리즘 기반 위장 패턴 생성 파라미터 최적화 시스템)

  • Park, JunHyeok;Park, Seungmin;Cho, Dae-Soo
    • Proceedings of the Korean Society of Computer Information Conference
    • /
    • 2022.07a
    • /
    • pp.511-512
    • /
    • 2022
  • Animal markings play an important role in survival against natural predators, depending on the habitat, because one of their functions is camouflage that evades predators' eyes in natural and wild environments. In this paper, we propose a GAN-based camouflage pattern generation model to improve on existing camouflage patterns. Existing models simply use color and blur the contours of the camouflage pattern to obscure human observation. To overcome this simplicity, the proposed model uses Deep Dream, an application technique of the GAN algorithm, to adjust the filter values of a specific layer via gradient ascent, so that distinct patterns can be generated for desired regions. By blending in not only color but also animal markings that serve a camouflage function, the model aims to generate camouflage patterns with a higher rate of assimilation into natural and wild environments.
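
The gradient ascent step described here, amplifying a chosen filter's response by adjusting the input, can be sketched without any trained network: below, a hand-written 3x3 edge filter stands in for a layer filter, and the image pixels ascend the gradient of the mean squared response. This is an illustrative stand-in, not the proposed GAN/Deep Dream pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.uniform(0.4, 0.6, size=(32, 32))   # start from near-gray noise
filt = np.array([[1., 0., -1.],
                 [2., 0., -2.],
                 [1., 0., -1.]])             # vertical-edge filter (assumed)

def response_and_grad(image):
    """Mean squared filter response and its gradient w.r.t. the image."""
    H, W = image.shape
    act = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            act[i, j] = np.sum(image[i:i + 3, j:j + 3] * filt)
    grad = np.zeros_like(image)
    for i in range(H - 2):                   # backprop of mean(act ** 2)
        for j in range(W - 2):
            grad[i:i + 3, j:j + 3] += 2 * act[i, j] * filt / act.size
    return np.mean(act ** 2), grad

for step in range(100):
    loss, g = response_and_grad(img)
    # Normalized gradient ascent on the pixels, as in Deep-Dream-style updates.
    img = np.clip(img + 0.1 * g / (np.abs(g).max() + 1e-8), 0.0, 1.0)

print("final mean squared response:", round(float(loss), 4))
```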


Actor-Critic Reinforcement Learning System with Time-Varying Parameters

  • Obayashi, Masanao;Umesako, Kosuke;Oda, Tazusa;Kobayashi, Kunikazu;Kuremoto, Takashi
    • Institute of Control, Robotics and Systems (ICROS): Conference Proceedings
    • /
    • 2003.10a
    • /
    • pp.138-141
    • /
    • 2003
  • Recently, reinforcement learning has attracted the attention of many researchers because of its simple and flexible learning ability in arbitrary environments. Many reinforcement learning methods have been proposed so far, such as Q-learning, actor-critic, and the stochastic gradient ascent method. A reinforcement learning system is able to adapt to changes of the environment through its interaction with it. However, when the environment changes periodically, such a system is not able to adapt to the changes well. In this paper, we propose a reinforcement learning system that adapts to periodic changes of the environment by introducing time-varying parameters to be adjusted. A simulation study of a maze problem with an aisle that opens and closes periodically shows that the proposed method works well, whereas the conventional method with constant adjustable parameters does not.
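
One way to picture the time-varying-parameter idea is to give the policy a sinusoidal component whose offset, amplitude, and phase are all learned by policy gradient, so the agent can lock onto a periodically switching reward. The bandit-style environment, the known period, and the parameterization below are assumptions for the demo, not the paper's maze task.

```python
import numpy as np

rng = np.random.default_rng(0)
theta0, A, phi = 0.0, 0.1, 0.0          # offset, amplitude, phase (all learned)
omega, lr = 2 * np.pi / 100.0, 0.05     # assume the environment period is known
baseline = 0.0

for t in range(30000):
    z = theta0 + A * np.sin(omega * t + phi)   # time-varying policy logit
    p = 1.0 / (1.0 + np.exp(-z))               # P(action = 1)
    a = rng.random() < p
    good = (t % 100) < 50                      # environment favors 1, then 0
    r = 1.0 if a == good else 0.0
    baseline += 0.01 * (r - baseline)
    adv = r - baseline
    g_z = (1.0 - p) if a else -p               # d log pi / d z for a Bernoulli policy
    # Chain rule through z to each time-varying parameter (gradient ascent):
    theta0 += lr * adv * g_z
    A += lr * adv * g_z * np.sin(omega * t + phi)
    phi += lr * adv * g_z * A * np.cos(omega * t + phi)

print("theta0, A, phi:", round(theta0, 2), round(A, 2), round(phi, 2))
```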
