• Title/Summary/Keyword: stochastic DP


Basin-Wide Multi-Reservoir Operation Using Reinforcement Learning (강화학습법을 이용한 유역통합 저수지군 운영)

  • Lee, Jin-Hee;Shim, Myung-Pil
    • Proceedings of the Korea Water Resources Association Conference
    • /
    • 2006.05a
    • /
    • pp.354-359
    • /
    • 2006
  • The analysis of large-scale water resources systems is often complicated by the presence of multiple reservoirs and diversions, the uncertainty of unregulated inflows and demands, and conflicting objectives. Reinforcement learning is presented herein as a new approach to the challenging problem of stochastic optimization of multi-reservoir systems. The Q-Learning method, one of the reinforcement learning algorithms, is used to generate integrated monthly operation rules for the Keum River basin in Korea. The Q-Learning model is evaluated by comparing it with implicit stochastic dynamic programming and sampling stochastic dynamic programming approaches. The evaluation of the stochastic basin-wide operational models considered several options for the choice of hydrologic state variables and discount factors, as well as various stochastic dynamic programming models. The Q-Learning model outperforms the other models in handling inflow uncertainty. (A minimal Q-Learning sketch follows this entry.)

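To make the approach above concrete, here is a minimal tabular Q-Learning sketch for a single-reservoir monthly operation problem. The storage/release discretization, inflow distribution, and demand-deviation reward are illustrative assumptions, not the Keum River configuration or the hydrologic state variables used in the paper.

```python
import numpy as np

# Minimal tabular Q-Learning sketch for a single-reservoir monthly operation
# problem. The discretization, reward, and inflow model below are illustrative
# assumptions, not the basin-wide setup of the paper.

rng = np.random.default_rng(0)

N_STORAGE = 10          # discretized storage levels (state)
N_RELEASE = 5           # discretized release decisions (action)
MONTHS = 12
DEMAND = 2              # target release (in discrete units)
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

# Q-table indexed by (month, storage level, release decision)
Q = np.zeros((MONTHS, N_STORAGE, N_RELEASE))

def step(month, storage, release):
    """One monthly transition: stochastic inflow, mass balance, shortage penalty."""
    inflow = rng.integers(0, 4)                      # assumed inflow distribution
    actual = min(release, storage)                   # cannot release more than stored
    next_storage = min(storage - actual + inflow, N_STORAGE - 1)
    reward = -abs(actual - DEMAND)                   # penalize deviation from demand
    return (month + 1) % MONTHS, next_storage, reward

for episode in range(20000):
    month, storage = 0, N_STORAGE // 2
    for _ in range(MONTHS):
        # epsilon-greedy action selection
        if rng.random() < EPS:
            a = int(rng.integers(N_RELEASE))
        else:
            a = int(np.argmax(Q[month, storage]))
        next_month, next_storage, r = step(month, storage, a)
        # Q-Learning update toward the off-policy bootstrapped target
        target = r + GAMMA * np.max(Q[next_month, next_storage])
        Q[month, storage, a] += ALPHA * (target - Q[month, storage, a])
        month, storage = next_month, next_storage

# Greedy monthly operation rule extracted from the learned Q-table
policy = np.argmax(Q, axis=2)
print(policy)
```

The greedy rule extracted at the end plays the role of the monthly operation rule the abstract refers to; an actual basin-wide model would couple multiple reservoirs, include a hydrologic state variable, and use calibrated inflow statistics.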

Development of Stochastic Markov Process Model for Maintenance of Armor Units of Rubble-Mound Breakwaters (경사제 피복재의 유지관리를 위한 추계학적 Markov 확률모형의 개발)

  • Lee, Cheol-Eung
    • Journal of Korean Society of Coastal and Ocean Engineers
    • /
    • v.25 no.2
    • /
    • pp.52-62
    • /
    • 2013
  • A stochastic Markov process (MP) model has been developed for evaluating the probability of failure of the armor units of rubble-mound breakwaters as a function of time. The MP model is formulated by combining a counting or renewal process (CP/RP) for load occurrences with a damage process (DP) for cumulative damage events, and is applied to the armor units of rubble-mound breakwaters. Transition probabilities are estimated by Monte Carlo simulation (MCS) based on a definition of the damage level of armor units, and they satisfy the constraints required from probabilistic and physical viewpoints. The time-dependent probabilities of failure are also compared and examined as they vary with the return period and the safety factor, the key variables in the design of armor units of rubble-mound breakwaters. In particular, the effect of prior damage levels on the subsequent probabilities of failure can be quantified. Finally, two methodologies are proposed in this study for directly evaluating the repair times needed for the maintenance of armor units, and several simulation results, including cost analyses, are presented. (A Monte Carlo sketch of the transition-matrix estimation follows this entry.)
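As a companion to the abstract above, the following sketch estimates a Markov transition matrix over discrete damage levels by Monte Carlo simulation and propagates it to obtain time-dependent failure probabilities. The Poisson load-counting model, per-event damage increments, and number of damage states are illustrative assumptions, not the armor-unit damage formulation of the paper.

```python
import numpy as np

# Sketch of a Markov-chain deterioration model whose transition probabilities
# are estimated by Monte Carlo simulation, in the spirit of the CP/RP + DP
# formulation described above. The damage-increment model and state count are
# illustrative assumptions only.

rng = np.random.default_rng(1)

STATES = 5                      # discrete damage levels; the last state is failure
N_SIM = 100_000                 # Monte Carlo samples per starting state

def simulate_one_step(state):
    """Assumed damage accumulation over one inspection interval."""
    n_storms = rng.poisson(1.5)                  # counting process for load events
    increment = rng.binomial(n_storms, 0.2)      # cumulative damage in this interval
    return min(state + increment, STATES - 1)

# Estimate the transition matrix P[i, j] by Monte Carlo simulation
P = np.zeros((STATES, STATES))
for i in range(STATES - 1):
    for _ in range(N_SIM):
        P[i, simulate_one_step(i)] += 1
    P[i] /= N_SIM
P[-1, -1] = 1.0                                  # failure is an absorbing state

# Probability of failure over time, starting from the undamaged state
state_dist = np.zeros(STATES)
state_dist[0] = 1.0
for year in range(1, 21):
    state_dist = state_dist @ P
    print(year, state_dist[-1])                  # P(failure) after `year` intervals
```

Starting the propagation from a higher initial damage level shows quantitatively how prior damage raises the subsequent probability of failure, which is the effect the abstract highlights.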

Application of Recent Approximate Dynamic Programming Methods for Navigation Problems (주행문제를 위한 최신 근사적 동적계획법의 적용)

  • Min, Dae-Hong;Jung, Keun-Woo;Kwon, Ki-Young;Park, Joo-Young
    • Journal of the Korean Institute of Intelligent Systems
    • /
    • v.21 no.6
    • /
    • pp.737-742
    • /
    • 2011
  • Navigation problems include the task of determining the control input, under various constraints, for systems such as mobile robots subject to uncertain disturbances. Such tasks can be modeled as constrained stochastic control problems. To solve these control problems, one may try to utilize dynamic programming (DP) methods, which rely on the concept of an optimal value function. However, in most real-world problems this approach faces many difficulties: for example, the exact system model may not be known, computing the optimal control policy may be intractable, and a huge amount of computing resources may be needed. As a strategy to overcome the difficulties of DP, one can utilize approximate dynamic programming (ADP) methods, which find suboptimal control policies by resorting to approximate value functions. In this paper, we apply recently proposed ADP methods to a class of navigation problems with complex constraints and observe the resulting performance characteristics. (A sketch of one ADP variant follows below.)
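A minimal sketch of one ADP variant, approximate value iteration with a linear value-function approximation, is given below for a toy stochastic navigation task. The corridor environment, disturbance model, and features are assumptions for illustration and are not the constrained navigation problems or the specific ADP methods studied in the paper.

```python
import numpy as np

# Minimal approximate value-iteration (ADP) sketch for a stochastic navigation
# task: the optimal value function is approximated by a linear model over
# hand-picked features instead of a lookup table. Environment and features are
# illustrative assumptions only.

rng = np.random.default_rng(2)

SIZE = 20                        # 1-D corridor; the goal is the rightmost cell
GOAL = SIZE - 1
ACTIONS = (-1, +1)               # move left / move right
GAMMA, NOISE = 0.95, 0.2         # discount factor, disturbance probability

def features(s):
    """Polynomial features of the normalized position."""
    x = s / GOAL
    return np.array([1.0, x, x * x])

def transition(s, a):
    """Stochastic dynamics: the commanded move is flipped with probability NOISE."""
    move = -a if rng.random() < NOISE else a
    return int(np.clip(s + move, 0, GOAL))

def q_estimate(s, a, w, n=30):
    """Monte Carlo estimate of the one-step backup under the fitted value function."""
    total = 0.0
    for _ in range(n):
        s2 = transition(s, a)
        reward = 0.0 if s2 == GOAL else -1.0
        total += reward + GAMMA * features(s2) @ w
    return total / n

w = np.zeros(3)                  # weights of the approximate value function
for sweep in range(100):
    # Back up every state with the current approximation, then refit the weights.
    X = np.array([features(s) for s in range(SIZE)])
    y = np.array([max(q_estimate(s, a, w) for a in ACTIONS) for s in range(SIZE)])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)

# Suboptimal (greedy) policy induced by the fitted value function
policy = [max(ACTIONS, key=lambda a: q_estimate(s, a, w)) for s in range(SIZE)]
print(policy)
```

The refitting step is what distinguishes ADP from exact DP here: instead of storing a value per state, the sweep regresses the backed-up targets onto a small feature vector, which is the kind of approximation that keeps large or continuous state spaces tractable.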