• Title/Abstract/Keywords: Stochastic Learning

Search results: 141 (processing time: 0.035 s)

Basin-Wide Multi-Reservoir Operation Using Reinforcement Learning

  • 이진희;심명필
    • Korea Water Resources Association: Conference Proceedings
    • /
    • Proceedings of the Korea Water Resources Association 2006 Conference
    • /
    • pp.354-359
    • /
    • 2006
  • The analysis of large-scale water resources systems is often complicated by the presence of multiple reservoirs and diversions, the uncertainty of unregulated inflows and demands, and conflicting objectives. Reinforcement learning is presented herein as a new approach to solving the challenging problem of stochastic optimization of multi-reservoir systems. The Q-Learning method, one of the reinforcement learning algorithms, is used to generate integrated monthly operation rules for the Keum River basin in Korea. The Q-Learning model is evaluated by comparing it with implicit stochastic dynamic programming and sampling stochastic dynamic programming approaches. The evaluation of the stochastic basin-wide operational models considered several options relating to the choice of hydrologic state and discount factors, as well as various stochastic dynamic programming models. The Q-Learning model outperforms the other models in handling the uncertainty of inflows.
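
The Q-Learning method named in this abstract is built around a single update rule, which can be sketched as follows. This is a minimal tabular sketch under stated assumptions, not the authors' reservoir model: the state names (`low`, `high`), the action names (`hold`, `release`), the reward, and the values of `alpha` and `gamma` are all hypothetical placeholders.

```python
import random


def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value."""
    best_next = max(q[next_state].values())   # greedy value of the next state
    td_target = reward + gamma * best_next    # discounted one-step target
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]


def epsilon_greedy(q, state, epsilon, rng):
    """Pick a random action with probability epsilon, otherwise the greedy one."""
    actions = list(q[state])
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[state][a])


# Hypothetical toy setting: storage levels as states, release decisions as actions.
states = ["low", "high"]
actions = ["hold", "release"]
q_table = {s: {a: 0.0 for a in actions} for s in states}

rng = random.Random(0)
first = q_learning_update(q_table, "high", "release", reward=1.0,
                          next_state="low", alpha=0.5)
chosen = epsilon_greedy(q_table, "high", epsilon=0.0, rng=rng)
```

The discount factor `gamma` plays the role of the discount factors the abstract lists among the evaluated options; the hydrologic state would enter through the keys of `q_table`.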


STOCHASTIC GRADIENT METHODS FOR L2-WASSERSTEIN LEAST SQUARES PROBLEM OF GAUSSIAN MEASURES

  • YUN, SANGWOON;SUN, XIANG;CHOI, JUNG-IL
    • Journal of the Korean Society for Industrial and Applied Mathematics
    • /
    • Vol. 25, No. 4
    • /
    • pp.162-172
    • /
    • 2021
  • This paper proposes stochastic methods to find an approximate solution to the L2-Wasserstein least squares problem of Gaussian measures. The variable of the problem lies in a set of positive definite matrices. The first proposed method is a classical stochastic gradient method combined with projection, and the second is a variance-reduced method with projection. Their global convergence is analyzed using the framework of proximal stochastic gradient methods. The convergence of the classical stochastic gradient method with projection is established under a diminishing learning-rate rule, in which the learning rate decreases as the epoch increases, whereas that of the variance-reduced method with projection can be established with a constant learning rate. The numerical results show that the proposed algorithms, with a proper learning rate, outperform a gradient projection method.
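
The first method in this abstract (stochastic gradient steps, a projection, and a learning rate that shrinks with the epoch) can be illustrated on a scalar special case. The sketch below assumes zero-mean one-dimensional Gaussians, for which the squared L2-Wasserstein distance between N(0, v) and N(0, v_i) is (sqrt(v) - sqrt(v_i))^2, and replaces the positive definite constraint with v >= `floor`. The function name, `lr0`, `floor`, and the sample variances are hypothetical; the paper's matrix-valued algorithm is not reproduced here.

```python
import math
import random


def projected_sgd_variance(variances, epochs=2000, lr0=2.0, floor=1e-6, seed=0):
    """Minimise the mean of (sqrt(v) - sqrt(v_i))^2 over v >= floor."""
    rng = random.Random(seed)
    samples = list(variances)
    v = samples[0]                        # start from one of the data points
    for epoch in range(epochs):
        lr = lr0 / (epoch + 1)            # diminishing rule: shrinks every epoch
        rng.shuffle(samples)
        for v_i in samples:
            # d/dv of (sqrt(v) - sqrt(v_i))^2
            grad = 1.0 - math.sqrt(v_i / v)
            # gradient step followed by projection onto [floor, inf)
            v = max(v - lr * grad, floor)
    return v


data = [1.0, 4.0, 9.0]
estimate = projected_sgd_variance(data)
# Closed-form minimiser in this scalar case: (mean of standard deviations)^2
target = (sum(math.sqrt(x) for x in data) / len(data)) ** 2
```

In this scalar setting the minimiser has a closed form (`target`), so `estimate` can be checked against it directly; no such shortcut is assumed for the matrix problem the paper addresses.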

A Learning based Algorithm for Traveling Salesman Problem

  • 임준묵;배성민;서재준
    • Journal of the Korean Institute of Industrial Engineers
    • /
    • Vol. 32, No. 1
    • /
    • pp.61-73
    • /
    • 2006
  • This paper deals with the traveling salesman problem (TSP) with stochastic travel times. In practice, the travel time between demand points changes with the day and the time of day because of traffic interference and congestion. Since almost all previous studies focus on the TSP with deterministic travel times, it is difficult to apply their results directly to logistics problems, many of which are strongly tied to stochastic conditions such as stochastic travel times. An efficient solution method for the TSP with stochastic travel times is therefore needed. Previous research shows that the Q-learning technique can deal with stochastic environments and that a neural network can be used to compute the Q-values of the Q-learning algorithm. In this paper, we propose an algorithm for the TSP with stochastic travel times that integrates Q-learning and a neural network, and we evaluate its validity through computational experiments. From the simulation results, we conclude that the route obtained by the proposed algorithm gives relatively more reliable travel times in logistics settings with stochastic travel times.

A Study on a Stochastic Nonlinear System Control Using Hyperbolic Quotient Competitive Learning Neural Networks

  • 석진욱;조성원;최경삼
    • Korean Institute of Intelligent Systems: Conference Proceedings
    • /
    • Proceedings of the Korea Fuzzy Logic and Intelligent Systems Society 1998 Fall Conference
    • /
    • pp.346-352
    • /
    • 1998
  • In this paper, we give a geometric condition for a stochastic nonlinear system and propose a control method for such a system using neural networks. Since competitive learning neural networks were developed based on the stochastic approximation method, they can be regarded as a stochastic recursive filter algorithm. In addition, we provide a filtering and control condition for a stochastic nonlinear system, called the perfect filtering condition, from the viewpoint of stochastic geometry. A stochastic nonlinear system satisfying the perfect filtering condition is decoupled into a deterministic part and a purely semimartingale part. Hence, such a system can be controlled by conventional control laws and various intelligent control laws. Computer simulation shows that a stochastic nonlinear system satisfying the perfect filtering condition is controllable, and that the proposed neural controller is more efficient than the conventional LQG controller and the canonical LQ-Neural controller.


Investigations on Dynamic Trading Strategy Utilizing Stochastic Optimal Control and Machine Learning

  • 박주영;양동수;박경욱
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • Vol. 23, No. 4
    • /
    • pp.348-353
    • /
    • 2013
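  • Recently, control theory, including stochastic optimal control, and various machine-learning-based artificial intelligence methodologies have been establishing themselves as major tools in financial engineering. This paper briefly reviews several recent papers that apply stochastic optimal control theory to pairs trading strategies for mean-reverting markets and to trend-following trading strategies, and then considers a strategy that applies stochastic optimal control theory and machine learning techniques together in order to obtain a more flexible and accessible tool. Illustrative simulations show that the strategy considered in this paper can provide encouraging results when applied to real financial market data.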

On-chip Learning Algorithm in Stochastic Pulse Neural Network

  • 김응수;조덕연;박태진
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • Vol. 10, No. 3
    • /
    • pp.270-279
    • /
    • 2000
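  • This paper describes an on-chip learning algorithm for neural networks that use stochastic pulse arithmetic. Stochastic pulse arithmetic refers to computing with numbers that are represented by the probability of 1s and 0s occurring in an arbitrary pulse train. Applying this stochastic arithmetic to neural networks has two advantages: it reduces the hardware implementation area, and its stochastic nature allows the network to escape local minima and reach the global optimum. This study also derives, from the backpropagation learning algorithm, an on-chip learning algorithm that can learn within the chip. To verify the derived algorithm, a nonlinear pattern separation problem was simulated. The algorithm was also applied to printed and handwritten digit recognition, with good results.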
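
The pulse-train number representation described in this abstract can be illustrated with a short software simulation. The sketch assumes the common unipolar encoding, in which a value in [0, 1] is the probability that a bit in the stream is 1, so that a bitwise AND of two independent streams approximates multiplication. Whether the paper uses exactly this encoding is an assumption, and all function names here are hypothetical.

```python
import random


def pulse_stream(p, length, rng):
    """Encode probability p as a random bit stream whose share of 1s approximates p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]


def decode(stream):
    """Recover the encoded value as the fraction of 1s in the stream."""
    return sum(stream) / len(stream)


def stochastic_multiply(a, b):
    """Bitwise AND of two independent streams: P(1) = P(a = 1) * P(b = 1)."""
    return [x & y for x, y in zip(a, b)]


rng = random.Random(42)
n_bits = 20000
x = pulse_stream(0.5, n_bits, rng)   # e.g. an input activation
w = pulse_stream(0.4, n_bits, rng)   # e.g. a weight magnitude
product_estimate = decode(stochastic_multiply(x, w))   # close to 0.5 * 0.4
```

A single AND gate standing in for a multiplier is one way to see the reduced hardware area the abstract mentions; the price is an estimate whose accuracy grows only with the stream length.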


Self-Organized Reinforcement Learning Using Fuzzy Inference for Stochastic Gradient Ascent Method

  • K, K.-Wong;Akio, Katuki
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
    • Institute of Control, Robotics and Systems, ICCAS 2001
    • /
    • pp.96.3-96
    • /
    • 2001
  • This paper proposes a self-organized stochastic gradient ascent method that uses fuzzy inference. Fuzzy rules and fuzzy sets are added autonomously, as the occasion demands, according to the observed information, and two rules (or two fuzzy sets) that become similar to each other as learning progresses are unified. This unification reduces the number of parameters and the learning time. By using fuzzy inference and constructing rules with an appropriate state division, the proposed method makes it possible to build a robust reinforcement learning system.


Stochastic MAC-layer Interference Model for Opportunistic Spectrum Access: A Weighted Graphical Game Approach

  • Zhao, Qian;Shen, Liang;Ding, Cheng
    • Journal of Communications and Networks
    • /
    • Vol. 18, No. 3
    • /
    • pp.411-419
    • /
    • 2016
  • This article investigates the problem of distributed channel selection in opportunistic spectrum access networks from the perspective of interference minimization. The traditional physical (PHY)-layer interference model is intended for information-theoretic analysis. When practical multiple access mechanisms are considered, the recently developed binary medium access control (MAC)-layer interference model from previous work is more useful; in it, the interference experienced by a user is defined as the number of competing users. However, the binary model is not accurate in mathematical analysis and yields poor achievable performance. Therefore, we propose a real-valued model, called the stochastic MAC-layer interference model, in which the utility of a player is defined as a function of the aggregate weight of the stochastic interference from competing neighbors. The distributed channel selection problem under the stochastic MAC-layer interference model is then formulated as a weighted stochastic MAC-layer interference minimization game, and we prove that the game is an exact potential game with at least one pure-strategy Nash equilibrium. Using the proposed stochastic learning-automata based uncoupled algorithm with heterogeneous learning parameter (SLA-H), convergence to a suboptimal solution is achieved on average, and this result is verified in simulation. Moreover, the simulation results also show that the proposed stochastic model achieves higher throughput and faster convergence than the binary one.
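
The stochastic learning automaton mentioned in this abstract can be illustrated with the classical linear reward-inaction rule for a single user. This is a generic sketch, not the paper's SLA-H algorithm: the per-channel success probabilities in `payoffs`, the `step` size, and the function names are hypothetical. A heterogeneous variant would presumably give each user its own `step`, but that reading of the abstract is an assumption.

```python
import random


def lri_update(probs, chosen, reward, step):
    """Linear reward-inaction: move probability mass toward the chosen action.

    The reward is assumed to lie in [0, 1]; a zero reward leaves probs unchanged.
    """
    return [
        p + step * reward * ((1.0 if i == chosen else 0.0) - p)
        for i, p in enumerate(probs)
    ]


def run_automaton(payoffs, step=0.05, rounds=3000, seed=1):
    """Repeatedly sample a channel, observe a random 0/1 reward, and update."""
    rng = random.Random(seed)
    probs = [1.0 / len(payoffs)] * len(payoffs)   # start from a uniform strategy
    for _ in range(rounds):
        chosen = rng.choices(range(len(payoffs)), weights=probs)[0]
        reward = 1.0 if rng.random() < payoffs[chosen] else 0.0
        probs = lri_update(probs, chosen, reward, step)
    return probs


one_step = lri_update([0.5, 0.5], chosen=0, reward=1.0, step=0.1)
final_probs = run_automaton([0.2, 0.8, 0.4])
```

The update uses only the user's own action and reward, which is what makes such schemes uncoupled; the mixed strategy stays a valid probability vector after every step.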

A variance learning neural network for confidence estimation

  • 조영빈;권대갑;이경래
    • Institute of Control, Robotics and Systems: Conference Proceedings
    • /
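    • Institute of Control, Robotics and Systems, Proceedings of the 1996 Korea Automatic Control Conference (Domestic Session); Pohang University of Science and Technology, Pohang; 24-26 Oct. 1996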
    • /
    • pp.1173-1176
    • /
    • 1996
  • Multilayer feedforward networks may be applied to identify the deterministic relationship between input and output data. When the results from the network require a high level of assurance, consideration of the stochastic relationship between the data may be very important. The variance is one of the useful parameters for representing the stochastic relationship. This paper presents a new algorithm that allows a multilayer feedforward network to learn the variance of dispersed data without preliminary calculation of the variance. In this paper, the network with this learning algorithm is named the variance learning neural network (VALEAN). Computer simulation examples are used to demonstrate and evaluate VALEAN.


A Variance Learning Neural Network for Confidence Estimation

  • 조영빈;권대갑
    • Journal of the Korean Society for Precision Engineering
    • /
    • Vol. 14, No. 6
    • /
    • pp.121-127
    • /
    • 1997
  • Multilayer feedforward networks may be applied to identify the deterministic relationship between input and output data. When the results from the network require a high level of assurance, consideration of the stochastic relationship between the input and output data may be very important. Variance is one of the effective parameters for dealing with the stochastic relationship. This paper presents a new algorithm that allows a multilayer feedforward network to learn the variance of dispersed data without preliminary calculation of the variance. In this paper, the network with this learning algorithm is named the variance learning neural network (VALEAN). Computer simulation examples are used to demonstrate and evaluate VALEAN.
