• Title/Summary/Keyword: Policy Optimization

Optimization of CHP Operation with Heat and Electricity Constraints

  • Nguyen, Minh Y; Choi, Nack-Hyun; Aziza, Aziza; Yoon, Yong-Tae
    • Proceedings of the KIEE Conference / 2008.11a / pp.457-459 / 2008
  • This paper presents the optimization of a CHP (combined heat and power) plant operating in a deregulated market. A boiler is added as a separate heat source, which allows more flexible and efficient operation of the plant. The objective of the optimization is to maximize profit over a 24-hour period by making unit commitment decisions, collectively called the "optimal policy". Dynamic programming is introduced as an effective and efficient solution method, and an example is solved to illustrate the optimal policy for such a CHP-and-boiler plant.
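
  A minimal dynamic-programming sketch of the 24-hour unit-commitment idea described above: hourly CHP on/off decisions are optimized backward in time, with a boiler covering any residual heat demand. All prices, demands, cost coefficients, and the start-up cost are hypothetical placeholders, not values from the paper.

    HOURS = 24
    elec_price = [60 if 8 <= h < 20 else 35 for h in range(HOURS)]        # $/MWh (assumed)
    heat_demand = [30 if h < 6 or h >= 18 else 15 for h in range(HOURS)]  # MWth (assumed)

    CHP_ELEC, CHP_HEAT, CHP_COST = 40, 35, 1800   # hourly CHP output and running cost (assumed)
    BOILER_COST_PER_MWTH = 25                     # boiler marginal heat cost (assumed)
    STARTUP_COST = 500                            # CHP start-up cost (assumed)

    def stage_profit(h, chp_on):
        """Profit of hour h given the CHP commitment decision."""
        if chp_on:
            residual_heat = max(0.0, heat_demand[h] - CHP_HEAT)
            return CHP_ELEC * elec_price[h] - CHP_COST - residual_heat * BOILER_COST_PER_MWTH
        return -heat_demand[h] * BOILER_COST_PER_MWTH

    # value[h][state]: best profit from hour h onward, given the CHP state entering hour h.
    value = [[0.0, 0.0] for _ in range(HOURS + 1)]
    policy = [[0, 0] for _ in range(HOURS)]
    for h in range(HOURS - 1, -1, -1):
        for prev_on in (0, 1):
            best, best_u = None, 0
            for u in (0, 1):
                startup = STARTUP_COST if (u == 1 and prev_on == 0) else 0.0
                v = stage_profit(h, u) - startup + value[h + 1][u]
                if best is None or v > best:
                    best, best_u = v, u
            value[h][prev_on], policy[h][prev_on] = best, best_u

    # Recover the optimal commitment schedule, assuming the CHP starts the day off.
    state, schedule = 0, []
    for h in range(HOURS):
        state = policy[h][state]
        schedule.append(state)
    print("optimal 24-h commitment:", schedule, "profit:", round(value[0][0], 1))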

Joint Optimization of Age Replacement and Spare Provisioning Policy

  • Lim, Sung-Uk; Park, Young-Taek
    • Journal of Korean Society for Quality Management / v.40 no.1 / pp.88-91 / 2012
  • Joint optimization of preventive age replacement and inventory policy is considered in this paper. There are three decision variables in the problem: (i) preventive replacement age of the operating unit, (ii) order quantity per order and (iii) reorder point for spare replenishment. Preventive replacement age and order quantity are jointly determined so as to minimize the expected cost rate, and then the reorder point for meeting a desired service level is found. A numerical example is included to explain the joint optimization model.
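
  As a rough illustration of the first stage of the joint model above, the sketch below numerically minimizes the classical age-replacement cost rate; the Weibull lifetime, cost figures, and grid search are illustrative assumptions rather than the paper's exact formulation, and the reorder point for the desired service level would be chosen in a second stage.

    import math

    BETA, ETA = 2.5, 1000.0      # Weibull shape and scale of the unit lifetime (assumed)
    CP, CF = 100.0, 600.0        # preventive vs. failure replacement cost (assumed)

    def reliability(t):
        return math.exp(-((t / ETA) ** BETA))

    def cost_rate(T, steps=2000):
        """Expected cost per unit time when replacing preventively at age T."""
        dt = T / steps
        # Expected cycle length = integral of R(t) over [0, T] (trapezoidal rule).
        mean_cycle = sum(0.5 * (reliability(i * dt) + reliability((i + 1) * dt)) * dt
                         for i in range(steps))
        expected_cost = CP * reliability(T) + CF * (1.0 - reliability(T))
        return expected_cost / mean_cycle

    # Crude one-dimensional search for the replacement age minimizing the cost rate.
    best_T = min(range(50, 2001, 10), key=cost_rate)
    print("approx. optimal replacement age:", best_T, "cost rate:", round(cost_rate(best_T), 4))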

Pareto-Based Multi-Objective Optimization for Two-Block Class-Based Storage Warehouse Design

  • Sooksaksun, Natanaree
    • Industrial Engineering and Management Systems / v.11 no.4 / pp.331-338 / 2012
  • This research proposes a Pareto-based multi-objective optimization approach to class-based storage warehouse design, considering a two-block warehouse that operates under the class-based storage policy in a low-level, picker-to-part, narrow-aisle warehousing system. A mathematical model is formulated to determine the number of aisles, the aisle length, and the portion of each pick aisle allocated to each product class so as to minimize travel distance and maximize usable storage space. A solution approach based on multiple-objective particle swarm optimization is proposed to find the Pareto front of the problem. Numerical examples show how to apply the proposed algorithm, and their results show that it can provide design alternatives for conflicting warehouse design decisions.
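
  The sketch below shows the Pareto-dominance bookkeeping at the core of such a multi-objective approach: travel distance is minimized and usable storage space is maximized, so one design dominates another when it is no worse in both objectives and strictly better in at least one. The candidate designs are random placeholders, not outputs of the paper's warehouse model or its particle swarm optimizer.

    import random

    def dominates(a, b):
        """a, b = (travel_distance, usable_space); distance minimized, space maximized."""
        no_worse = a[0] <= b[0] and a[1] >= b[1]
        strictly_better = a[0] < b[0] or a[1] > b[1]
        return no_worse and strictly_better

    def pareto_front(candidates):
        return [c for c in candidates
                if not any(dominates(other, c) for other in candidates if other is not c)]

    designs = [(random.uniform(100, 300), random.uniform(50, 120)) for _ in range(50)]
    front = sorted(pareto_front(designs))
    print(f"{len(front)} nondominated designs out of {len(designs)}")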

A Study on Asset Allocation Using Proximal Policy Optimization

  • Lee, Woo Sik
    • Journal of the Korean Society of Industry Convergence / v.25 no.4_2 / pp.645-653 / 2022
  • Recently, deep reinforcement learning has been applied in a variety of domains, such as games, robotics, autonomous vehicles, and data-center cooling systems. Reinforcement learning enables automated asset allocation without the need for continuous monitoring, since the agent is free to learn its own policy. The purpose of this paper is to carry out an empirical analysis of the performance of asset allocation strategies, comparing conventional Mean-Variance Optimization (MVO) with Proximal Policy Optimization (PPO). According to the findings, PPO outperformed both its benchmark index and MVO. This paper demonstrates how dynamic asset allocation can benefit from a reinforcement learning algorithm.
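
  A minimal NumPy sketch of the clipped surrogate objective that PPO optimizes, which is the core of the asset allocation agent described above; the policy network, the portfolio environment, and the toy numbers below are placeholders, not the paper's setup.

    import numpy as np

    def ppo_clip_loss(log_probs_new, log_probs_old, advantages, clip_eps=0.2):
        """Negative clipped surrogate objective (to be minimized by a gradient step)."""
        ratio = np.exp(log_probs_new - log_probs_old)                 # pi_new / pi_old
        unclipped = ratio * advantages
        clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
        return -np.mean(np.minimum(unclipped, clipped))

    # Toy numbers: log-probabilities of sampled allocation actions under the old and
    # updated policies, together with their advantage estimates.
    old = np.log(np.array([0.25, 0.40, 0.10, 0.30]))
    new = np.log(np.array([0.30, 0.35, 0.15, 0.25]))
    adv = np.array([1.2, -0.4, 0.8, 0.1])
    print("clipped surrogate loss:", ppo_clip_loss(new, old, adv))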

A Maintenance Design of Connected-(r, s)-out-of-(m, n):F System Using Simulated Annealing

  • Lee, Sangheon; Kang, Youngtai; Shin, Dongyeul
    • Journal of Korean Institute of Industrial Engineers / v.34 no.1 / pp.98-107 / 2008
  • The purpose of this paper is to present an optimization scheme that minimizes the expected cost per unit time. The study considers a linear connected-(r, s)-out-of-(m, n):F lattice system whose components are arranged like the elements of an (m, n) matrix. All components are assumed to be identical, s-independent, and in state 1 (operating) or 0 (failed). The system fails whenever at least one connected (r, s) submatrix of failed components occurs. To find the optimal threshold for maintenance intervention, a simulated annealing (SA) algorithm is used for the cost optimization procedure, and the expected cost per unit time is obtained by Monte Carlo simulation. A sensitivity analysis with respect to the different cost parameters is also performed. A maintenance model is constructed to minimize cost under the full equipment policy by comparing the full equipment policy with a preventive maintenance policy; the full equipment cycle and unit cost rate are obtained by the simulated annealing algorithm. The SA algorithm appears to converge quickly for multi-component systems, making it well suited to this optimization decision problem.
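
  A sketch of the structure function that such a Monte Carlo cost evaluation relies on: a linear connected-(r, s)-out-of-(m, n):F system fails when some r x s block of adjacent components is entirely failed. The grid size, failure probability, and trial count are illustrative assumptions; the paper's SA search over maintenance thresholds would sit on top of an estimator like this.

    import random

    def system_failed(grid, r, s):
        """grid[i][j] == 0 means component (i, j) has failed."""
        m, n = len(grid), len(grid[0])
        for i in range(m - r + 1):
            for j in range(n - s + 1):
                if all(grid[i + di][j + dj] == 0 for di in range(r) for dj in range(s)):
                    return True
        return False

    def mc_failure_prob(m, n, r, s, p_fail, trials=10000):
        """Monte Carlo estimate of system failure probability with i.i.d. components."""
        failures = 0
        for _ in range(trials):
            grid = [[0 if random.random() < p_fail else 1 for _ in range(n)] for _ in range(m)]
            failures += system_failed(grid, r, s)
        return failures / trials

    print("estimated failure probability:", mc_failure_prob(m=5, n=6, r=2, s=2, p_fail=0.2))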

A Study about the Usefulness of Reinforcement Learning in Business Simulation Games using PPO Algorithm

  • Liang, Yi-Hong; Kang, Sin-Jin; Cho, Sung Hyun
    • Journal of Korea Game Society / v.19 no.6 / pp.61-70 / 2019
  • In this paper, we apply reinforcement learning to a business simulation game to check whether game agents can autonomously achieve a given goal. The PPO (Proximal Policy Optimization) algorithm is applied in the Unity Machine Learning (ML) Agents environment, and the game agent is designed to find a way to play automatically. Five game scenario simulation experiments were conducted to verify the usefulness of the approach. As a result, it was confirmed that the game agent achieves the goal through learning despite changes to the environment variables in the game.
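
  A schematic sketch of the kind of scenario experiment described above: a trained policy is rolled out under several environment settings and the goal-achievement rate is recorded. The toy environment and the fixed pricing rule standing in for a PPO-trained policy are hypothetical, not the Unity ML-Agents setup used in the paper.

    import random

    class ToyBusinessEnv:
        """Trivial stand-in: the agent must grow 'capital' past a goal within 50 steps."""
        def __init__(self, demand_level):
            self.demand_level = demand_level

        def run_episode(self, policy, goal=150.0, horizon=50):
            capital = 100.0
            for _ in range(horizon):
                price = policy(capital)
                demand = max(0.0, self.demand_level + random.gauss(0.0, 1.0) - price)
                capital += price * demand - 20.0          # revenue minus a fixed operating cost
            return capital >= goal

    def fixed_price_policy(capital):
        # Placeholder for a policy trained with PPO; here just a constant pricing rule.
        return 5.0

    scenarios = {"low demand": 8.0, "base demand": 10.0, "high demand": 13.0}
    for name, level in scenarios.items():
        wins = sum(ToyBusinessEnv(level).run_episode(fixed_price_policy) for _ in range(100))
        print(f"{name}: goal reached in {wins}/100 episodes")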

Optimal Policy for (s, S) Inventory System Characterized by Renewal Arrival Process of Demand through Simulation Sensitivity Analysis

  • 권치명
    • Journal of the Korea Society for Simulation / v.12 no.3 / pp.31-40 / 2003
  • This paper studies an optimal policy for a class of (s, S) inventory control systems in which demands follow a renewal arrival process. To minimize the average cost over a simulation period, we apply a stochastic optimization algorithm that uses the gradients with respect to the parameters s and S. We obtain the gradients of the objective function with respect to the order-up-to level S and the reorder point s via a combined perturbation method, which alternates between infinitesimal perturbation analysis and smoothed perturbation analysis according to whether an ordering-event change occurs. The estimates of the optimal s and S obtained from our simulations are quite accurate, which we attribute to the low-noise gradient estimates produced by the regenerative simulation and their effect on the search procedure of the stochastic optimization algorithm. Future work includes extending the model to more general inventory systems with respect to the demand distribution, backlogging policy, lead time, and demand inter-arrival times, as well as improving the efficiency of the stochastic optimization algorithm's search for improving values of (s, S).
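
  A crude sketch of the simulation-based search described above: the (s, S) policy is simulated with renewal demands and the average cost is driven down by finite-difference stochastic approximation. The finite-difference step is a simple stand-in for the paper's infinitesimal/smoothed perturbation analysis gradients, and all cost coefficients, the demand distribution, and the step sizes are assumed.

    import random

    def average_cost(s, S, periods=4000, h=1.0, k=25.0, b=5.0, seed=0):
        """Average holding + ordering + backlog cost per period under an (s, S) policy."""
        rng = random.Random(seed)
        level, total = S, 0.0
        for _ in range(periods):
            level -= rng.expovariate(1.0 / 10.0)      # renewal demand (assumed exponential)
            total += h * max(level, 0.0) + b * max(-level, 0.0)
            if level < s:                             # order up to S at the reorder point
                total += k
                level = S
        return total / periods

    s, S = 5.0, 60.0
    for it in range(100):                             # crude finite-difference iteration
        step, delta = 20.0 / (it + 10), 1.0
        g_s = (average_cost(s + delta, S, seed=it) - average_cost(s - delta, S, seed=it)) / (2 * delta)
        g_S = (average_cost(s, S + delta, seed=it) - average_cost(s, S - delta, seed=it)) / (2 * delta)
        s, S = s - step * g_s, max(s + 1.0, S - step * g_S)
    print("estimated policy: s =", round(s, 1), " S =", round(S, 1))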

Policy implication of nuclear energy's potential for energy optimization and CO2 mitigation: A case study of Fujian, China

  • Peng, Lihong; Zhang, Yi; Li, Feng; Wang, Qian; Chen, Xiaochou; Yu, Ang
    • Nuclear Engineering and Technology / v.51 no.4 / pp.1154-1162 / 2019
  • China is undertaking an energy transition from fossil fuels to clean energy in order to meet its CO2 intensity (CI) reduction commitments. After hydropower, nuclear energy has the greatest potential, based on a comparison with the rest of the world and an analysis of the government's energy consumption (EC) plan. This paper establishes a model for forecasting the CI response to energy policy, based on national and provincial EC plans, and applies it to Fujian Province to predict its CI from 2016 to 2020. The results show that CI declines by 43%-53% relative to 2005 across five scenarios of economic growth in 2020. Furthermore, Fujian will achieve the national goals ahead of schedule because EC is controlled and the nuclear energy ratio increases to 16.4% (the share of non-fossil sources in primary energy being 26.7%). Finally, the development of nuclear energy in China and worldwide is analyzed, and several policies for energy optimization and CI reduction are proposed.

A Study on the Application of PIDO Technique for the Maintenance Policy Optimization Considering the Performance-Based Logistics Support System

  • Ju, Hyun-Jun; Lee, Jae-Chon
    • Journal of the Korea Academia-Industrial cooperation Society / v.15 no.2 / pp.632-637 / 2014
  • In this paper, the concept of performance-based logistics (PBL) support for weapon systems is discussed, and an enhancement is studied in which PBL development begins in the Engineering & Manufacturing Development (EMD) phase, before the operational phase, with multiple performance indices considered together. For such a complex system, a genetic algorithm should be considered to solve the maintenance policy optimization. In particular, the requirements of a repair level analysis model are developed to reflect the PBL concept. To decide the maintenance policy prior to the operational phase in accordance with customer requirements, the PIDO (Process Integration and Design Optimization) technique, which is useful for choosing performance indices and changing constraints, was used. It was verified that the genetic algorithms of PIDO tools such as PIAnO and ModelCenter can be applied to optimize the maintenance policy.
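
  A toy genetic-algorithm sketch of the repair-level / maintenance-policy search that PIDO tools automate in the study above: each gene selects a maintenance option for one item (0 = discard, 1 = field repair, 2 = depot repair), and fitness is a hypothetical cost penalized for exceeding a downtime budget. All figures are illustrative and are not taken from the paper or from PIAnO/ModelCenter.

    import random

    N_ITEMS = 8
    COST = [(90, 40, 25), (60, 35, 20), (120, 55, 30), (80, 45, 22),
            (70, 30, 18), (100, 50, 28), (65, 33, 19), (110, 52, 27)]   # cost per option (assumed)
    DOWNTIME = (1.0, 3.0, 6.0)                                          # days per option (assumed)
    MAX_TOTAL_DOWNTIME = 30.0

    def fitness(chromosome):
        cost = sum(COST[i][g] for i, g in enumerate(chromosome))
        downtime = sum(DOWNTIME[g] for g in chromosome)
        penalty = 50.0 * max(0.0, downtime - MAX_TOTAL_DOWNTIME)
        return cost + penalty                          # lower is better

    def evolve(pop_size=30, generations=60, p_mut=0.1):
        pop = [[random.randrange(3) for _ in range(N_ITEMS)] for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=fitness)
            parents = pop[:pop_size // 2]              # truncation selection
            children = []
            while len(children) < pop_size - len(parents):
                a, b = random.sample(parents, 2)
                cut = random.randrange(1, N_ITEMS)
                child = a[:cut] + b[cut:]              # one-point crossover
                if random.random() < p_mut:
                    child[random.randrange(N_ITEMS)] = random.randrange(3)
                children.append(child)
            pop = parents + children
        return min(pop, key=fitness)

    best = evolve()
    print("best maintenance options:", best, "cost:", fitness(best))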

Learning Optimal Trajectory Generation for a Low-Cost Redundant Manipulator Using Deep Deterministic Policy Gradient (DDPG)

  • Lee, Seunghyeon; Jin, Seongho; Hwang, Seonghyeon; Lee, Inho
    • The Journal of Korea Robotics Society / v.17 no.1 / pp.58-67 / 2022
  • In this paper, we propose an approach that addresses the workspace inaccuracy of low-cost redundant manipulators built with low-cost encoders and low-stiffness links. Manipulators manufactured with such components can run into workspace inaccuracy issues, and trajectory generation based on conventional forward/inverse kinematics that ignores these issues introduces the risk of end-effector fluctuations. Hence, we propose a trajectory generation method, optimized with the DDPG (Deep Deterministic Policy Gradient) algorithm, for low-cost redundant manipulators reaching a target position in Euclidean space. The DDPG reward is designed to minimize the distance to the target along with the Jacobian condition number. The training environment uses randomly generated joint configurations with an error rate in a simulator that implements real-world physics; the test environment is a real robot experiment that demonstrates our approach.
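
  A sketch of the reward shaping described above: the agent is penalized both for the remaining Euclidean distance to the target and for the condition number of the manipulator Jacobian, since an ill-conditioned Jacobian signals a near-singular, fluctuation-prone pose. A planar 3-link arm stands in for the paper's low-cost redundant manipulator, and the link lengths and weights are assumed rather than taken from the DDPG setup in the paper.

    import numpy as np

    LINKS = np.array([0.3, 0.25, 0.2])        # link lengths in meters (assumed)

    def forward_kinematics(q):
        angles = np.cumsum(q)
        return np.array([np.sum(LINKS * np.cos(angles)), np.sum(LINKS * np.sin(angles))])

    def jacobian(q):
        angles = np.cumsum(q)
        J = np.zeros((2, 3))
        for j in range(3):
            # Column j sums the contributions of links j..end (standard planar-arm Jacobian).
            J[0, j] = -np.sum(LINKS[j:] * np.sin(angles[j:]))
            J[1, j] = np.sum(LINKS[j:] * np.cos(angles[j:]))
        return J

    def reward(q, target, w_dist=1.0, w_cond=0.01):
        dist = np.linalg.norm(forward_kinematics(q) - target)
        cond = np.linalg.cond(jacobian(q))            # ratio of largest to smallest singular value
        return -(w_dist * dist + w_cond * cond)

    q = np.array([0.4, -0.3, 0.6])
    print("shaped reward:", reward(q, target=np.array([0.5, 0.2])))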