• Title/Abstract/Keywords: Policy Iteration

Search results: 18 items (processing time: 0.024 s)

Markovian Approach of Inspection Policy in a Serial Manufacturing System

  • 정영배;황의철
    • 산업경영시스템학회지 / Vol. 11, No. 17 / pp. 81-85 / 1988
  • This paper presents a model that considers combinations of rework, repair, replacement, and scrapping. A policy-iteration method is proposed for determining the inspection policy of a serial manufacturing system whose repair, scrap, and inspection costs upon failure can be formulated by a Markovian approach. Policy iteration stops when the new inspection policy is the same as the previous one. A numerical example is presented.

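The stopping rule described in this abstract (iterate until the new policy equals the previous one) can be sketched for a generic finite, discounted-cost Markov decision process. This is a minimal illustration, not the authors' model: the two states, the two actions, the transition probabilities in `P`, the costs in `C`, and the discount factor `gamma` are all hypothetical values chosen for the example.

```python
def policy_iteration(P, C, gamma=0.9, tol=1e-10):
    """Generic policy iteration for a finite discounted-cost MDP.

    P[a][s][t] is the probability of moving from state s to t under action a;
    C[a][s] is the expected one-step cost of taking action a in state s.
    """
    n_actions, n_states = len(P), len(P[0])
    policy = [0] * n_states  # arbitrary starting policy
    while True:
        # Policy evaluation: iterate V <- C_pi + gamma * P_pi V until it settles.
        V = [0.0] * n_states
        while True:
            V_new = [
                C[policy[s]][s]
                + gamma * sum(P[policy[s]][s][t] * V[t] for t in range(n_states))
                for s in range(n_states)
            ]
            delta = max(abs(x - y) for x, y in zip(V_new, V))
            V = V_new
            if delta < tol:
                break
        # Policy improvement: pick the cheapest action in each state,
        # keeping the current action on ties so the loop cannot cycle.
        new_policy = []
        for s in range(n_states):
            q = [
                C[a][s] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            best = policy[s]
            for a in range(n_actions):
                if q[a] < q[best] - 1e-12:
                    best = a
            new_policy.append(best)
        # Stop when the new policy is the same as the previous policy.
        if new_policy == policy:
            return policy, V
        policy = new_policy


# Hypothetical example: state 0 = working, state 1 = failed;
# action 0 = do nothing, action 1 = inspect and repair.
P = [
    [[0.8, 0.2], [0.0, 1.0]],  # do nothing
    [[1.0, 0.0], [1.0, 0.0]],  # inspect and repair
]
C = [
    [0.0, 10.0],  # do nothing
    [2.0, 5.0],   # inspect and repair
]
policy, V = policy_iteration(P, C)
```

With these made-up numbers the loop ends on the policy "do nothing while working, repair when failed". Keeping the current action on ties is a design choice that makes the equality test a reliable stopping condition.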

Design of an Adaptive Robust Controller Based on Explorized Policy Iteration for the Stabilization of Multimachine Power Systems

  • 전태윤;박진배
    • 제어로봇시스템학회논문지 / Vol. 20, No. 11 / pp. 1118-1124 / 2014
  • This paper proposes a novel controller design scheme for multimachine power systems based on explorized policy iteration. Power systems have several uncertainties in their system dynamics due to the various effects of interconnections between generators. To address this, the proposed method solves the LQR (Linear Quadratic Regulation) problem of isolated subsystems without knowledge of the system matrix or the interconnection parameters of the multimachine power system. By selecting proper performance indices, it guarantees stability and convergence to the LQ optimal control. To implement the proposed scheme, a least-squares-based online method is also investigated in terms of PE (Persistency of Excitation), interconnection parameters, and exploration signals. Finally, the performance and effectiveness of the proposed algorithm are demonstrated by numerical simulations of three-machine power systems with governor controllers.

Dynamic Task Scheduling Via Policy Iteration Scheduling Approach for Cloud Computing

  • Hu, Bin;Xie, Ning;Zhao, Tingting;Zhang, Xiaotong
    • KSII Transactions on Internet and Information Systems (TIIS) / Vol. 11, No. 3 / pp. 1265-1278 / 2017
  • Dynamic task scheduling is one of the most popular research topics in the cloud computing field. The cloud scheduler dynamically provides VM resources to variable cloud tasks with different scheduling strategies in cloud computing. In this study, we utilized a valid model to describe the dynamic changes of both computing facilities (such as hardware updating) and request task queuing. We built a novel approach called Policy Iteration Scheduling (PIS) to globally optimize the independent task scheduling scheme and minimize the total execution time of priority tasks. We performed experiments with randomly generated cloud task sets and varied the performance of VM resources using Poisson distributions. The results show that PIS outperforms other popular schedulers in a typical cloud computing environment.

Explorized Policy Iteration for Continuous-Time Linear Systems

  • 이재영;전태윤;최윤호;박진배
    • 전기학회논문지 / Vol. 61, No. 3 / pp. 451-458 / 2012
  • This paper addresses the problem that policy iteration (PI) for continuous-time (CT) systems requires exploration of the state space, known as persistency of excitation in the adaptive control community, and proposes a PI scheme explorized by an additional probing signal to solve it. The proposed PI method efficiently finds the related CT linear quadratic (LQ) optimal control online without knowing the system matrix A, and guarantees stability and convergence to the LQ optimal control; this is proven in the paper in the presence of the probing signal. A design method for the probing signal is also presented to balance exploration of the state space against control performance. Finally, several simulation results are provided to verify the effectiveness of the proposed explorized PI method.
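For orientation, the evaluation and improvement steps that underlie PI for the CT LQ problem can be sketched on a scalar system. Note the swap: this is the classical model-based iteration (often attributed to Kleinman), which uses the system parameter `a` directly. It is not the paper's online, probing-signal-based scheme, which avoids knowing the system matrix. The function name and all numerical values are assumptions made for the example.

```python
import math


def lq_policy_iteration_scalar(a, b, q, r, k0, iters=20):
    """Model-based PI for dx/dt = a*x + b*u with cost integral of q*x^2 + r*u^2.

    The control law is u = -k*x. The initial gain k0 must be stabilizing,
    i.e. a - b*k0 < 0, otherwise the evaluated cost is not finite.
    """
    if a - b * k0 >= 0:
        raise ValueError("initial gain must be stabilizing")
    k = k0
    p = None
    for _ in range(iters):
        # Policy evaluation: scalar Lyapunov equation
        #   2*(a - b*k)*p + q + r*k^2 = 0
        p = (q + r * k * k) / (2.0 * (b * k - a))
        # Policy improvement: k = b*p / r
        k = b * p / r
    return p, k


# Hypothetical unstable plant with a stabilizing initial gain.
p, k = lq_policy_iteration_scalar(a=1.0, b=1.0, q=1.0, r=1.0, k0=3.0)

# Closed-form solution of the scalar algebraic Riccati equation for comparison.
p_star = 1.0 * (1.0 + math.sqrt(1.0 + 1.0)) / 1.0
```

In this scalar case the iterates should approach the positive root of the Riccati equation `2*a*p - (b*p)**2 / r + q = 0`, stored in `p_star` above.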

Policy Iteration Algorithm Based Fault Tolerant Tracking Control: An Implementation on Reconfigurable Manipulators

  • Li, Yuanchun;Xia, Hongbing;Zhao, Bo
    • Journal of Electrical Engineering and Technology / Vol. 13, No. 4 / pp. 1740-1751 / 2018
  • This paper proposes a novel fault tolerant tracking control (FTTC) scheme for a class of nonlinear systems with actuator failures, based on the policy iteration (PI) algorithm and an adaptive fault observer. The actuator failure estimated by the adaptive fault observer is used to construct an improved performance index function that reflects the failure, regulation, and control simultaneously. With this performance index function, the FTTC problem can be transformed into an optimal control problem. The fault tolerant tracking controller is composed of a desired controller and an approximated optimal feedback controller. The desired controller is developed to maintain the desired tracking performance at steady state, and the approximated optimal feedback controller is designed to stabilize the tracking error dynamics in an optimal manner. By establishing a critic neural network, the PI algorithm is used to solve the Hamilton-Jacobi-Bellman equation, from which the approximated optimal feedback controller is derived. Based on the Lyapunov technique, the uniform ultimate boundedness of the closed-loop system is proven. The proposed FTTC scheme is applied to reconfigurable manipulators with two degrees of freedom to test its effectiveness via numerical simulation.

Seamless Mobility of Heterogeneous Networks Based on Markov Decision Process

  • Preethi, G.A.;Chandrasekar, C.
    • Journal of Information Processing Systems / Vol. 11, No. 4 / pp. 616-629 / 2015
  • A mobile terminal can expect a number of handoffs within its call duration. During a mobile call, when a mobile node moves from one cell to another, it should connect to another access point within its range. If its own network lacks support, it must change over to another base station. When moving to another network, quality of service parameters need to be considered. In our study, we used the Markov decision process approach for a seamless handoff, as it gives the optimum results for selecting a network when compared to other multiple attribute decision making processes. We used the network cost function for selecting the network for handoff and the connection reward function, which is based on the values of the quality of service parameters. We also examined the constant bit rate and transmission control protocol packet delivery ratio. We used the policy iteration algorithm to determine the optimal policy. Our enhanced handoff algorithm outperforms previous multiple attribute decision making methods.

Operating Room Reservation Problem Considering Patient Priority: Modified Value Iteration Method with Binary Search

  • 민대기
    • 산업공학 / Vol. 24, No. 4 / pp. 274-280 / 2011
  • Delayed access to surgery may lead to deterioration in the patient's condition, poor clinical outcomes, an increased probability of emergency admission, or even death. The purpose of this work is to decide the number of patients selected from a waiting list and to schedule them in accordance with the operating room capacity in the next period. We formulate the problem as an infinite horizon Markov Decision Process (MDP), which attempts to strike a balance between patient waiting times and overtime work. Structural properties of the proposed model are investigated to facilitate the solution procedure. The proposed procedure modifies the conventional value iteration method by combining it with the binary search technique. An example of the optimal policy is provided, and computational results are given to show that the proposed procedure improves computational efficiency.

Machine Maintenance Policy Using Partially Observable Markov Decision Process

  • Pak, Pyoung Ki;Kim, Dong Won;Jeong, Byung Ho
    • 품질경영학회지 / Vol. 16, No. 2 / pp. 1-9 / 1988
  • This paper considers a machine maintenance problem. The machine's condition is partially known by observing the machine's output products. This problem is formulated as an infinite horizon partially observable Markov decision process to find an optimal maintenance policy. However, even though the optimal policy of the model exists, finding it is very time consuming. Thus, the aim of this study is to find an $\varepsilon$-optimal stationary policy minimizing the expected discounted total cost of the system. The $\varepsilon$-optimal policy is found by using a modified version of the well-known policy iteration algorithm. A numerical example is also shown.


Partially Observable Markov Decision Process with Lagged Information over Infinite Horizon

  • Jeong, Byong-Ho;Kim, Soung-Hie
    • 한국경영과학회지 / Vol. 16, No. 1 / pp. 135-146 / 1991
  • This paper presents an infinite horizon model of a Partially Observable Markov Decision Process with lagged information. The lagged information is an uncertain, delayed observation of the process under control. Even though the optimal policy of the model exists, finding it is very time consuming. Thus, the aim of this study is to find an $\varepsilon$-optimal stationary policy minimizing the expected discounted total cost of the model. The $\varepsilon$-optimal policy is found by using a modified version of the well-known policy iteration algorithm. The modification focuses on the value determination routine of the algorithm. Some properties of the approximation functions for the expected discounted cost of a stationary policy are presented. The expected discounted cost of a stationary policy is approximated based on these properties. A numerical example is also shown.


Digital signal change through artificial intelligence machine learning method comparison and learning

  • 이덕균;박지은
    • 디지털융복합연구 / Vol. 17, No. 10 / pp. 251-258 / 2019
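  • In the coming era, a variety of products using artificial intelligence (AI) will be created in a variety of fields. In such an era, understanding how AI learning methods work and applying them correctly is a matter of considerable importance. This paper introduces the AI learning methods known to date. AI learning is based on the fixed-point iteration method from mathematics. Methods built on it include GD (Gradient Descent), which adjusts the convergence speed; Momentum, which accumulates past updates; and Adam (Adaptive Moment Estimation), which suitably combines these methods. This paper explains the advantages and disadvantages of each method. In particular, the Adam method includes an adjustment capability and can therefore regulate the strength of machine learning. The paper also analyzes how these methods affect digital signals. These changes in digital signals during learning will serve as a basis for accurate use and sound judgment in future work and research using AI.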
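The three update rules named in this abstract can be compared on a toy problem. The sketch below uses their commonly published forms; the objective f(x) = (x - 3)^2, the step sizes, and the step counts are assumptions made for illustration and are not taken from the paper. Each routine is a fixed-point iteration in the sense that the minimizer is left unchanged by the update.

```python
import math


def grad(x):
    """Gradient of the illustrative objective f(x) = (x - 3)^2."""
    return 2.0 * (x - 3.0)


def gradient_descent(x, lr=0.1, steps=500):
    # Fixed-point iteration of the map x -> x - lr * grad(x).
    for _ in range(steps):
        x -= lr * grad(x)
    return x


def momentum(x, lr=0.1, beta=0.9, steps=500):
    # The velocity v accumulates past gradients with decay factor beta.
    v = 0.0
    for _ in range(steps):
        v = beta * v + grad(x)
        x -= lr * v
    return x


def adam(x, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    # m: running mean of gradients; v: running mean of squared gradients.
    m = 0.0
    v = 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)  # bias correction for the zero start
        v_hat = v / (1.0 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


x_gd = gradient_descent(0.0)
x_mom = momentum(0.0)
x_adam = adam(0.0)
```

Started from x = 0 with these settings, each routine should end close to the minimizer x = 3. How quickly and how smoothly each one gets there depends strongly on the chosen hyperparameters.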