• Title/Summary/Keyword: Markov decision process

Markov Chain based Packet Scheduling in Wireless Heterogeneous Networks

  • Mansouri, Wahida Ali; Othman, Salwa Hamda; Asklany, Somia
    • International Journal of Computer Science & Network Security, v.22 no.3, pp.1-8, 2022
  • Supporting real-time flows with delay and throughput constraints is an important challenge for future wireless networks. In this paper, we develop an optimal scheduling scheme that chooses which packets to transmit. The transmission strategy is based on an observable Markov decision process. The novelty of the work lies in a priority-based probabilistic packet scheduling strategy for efficient packet transmission, which helps provide guaranteed service to real-time traffic in heterogeneous wireless networks. The proposed scheduler improves overall end-to-end delay, decreases the packet loss ratio, and reduces blocking probability even in a congested network.
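To make the formulation concrete, here is a minimal sketch of value iteration on a toy scheduling MDP; the state space, transition probabilities, and rewards below are invented stand-ins, not the paper's model.

```python
# Value iteration on a toy priority-scheduling MDP (all numbers illustrative).
import numpy as np

n_states, n_actions, gamma = 4, 2, 0.9       # states: queue-occupancy levels; actions: which queue to serve
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a] -> next-state distribution
R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))             # reward for serving a queue in a state

V = np.zeros(n_states)
for _ in range(500):                          # iterate the Bellman backup to a fixed point
    Q = R + gamma * (P @ V)                   # Q[s, a] = R[s, a] + gamma * sum_s' P[s, a, s'] V[s']
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < 1e-8:
        break
    V = V_new

print("queue to serve in each state:", Q.argmax(axis=1))  # greedy scheduling policy
```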

Design of Markov Decision Process Based Dialogue Manager (마르코프 의사결정 과정에 기반한 대화 관리자 설계)

  • Choi, Joon-Ki; Eun, Ji-Hyun; Chang, Du-Seong; Kim, Hyun-Jeong; Koo, Myong-Wan
    • Proceedings of the KSPS conference, 2006.11a, pp.14-18, 2006
  • The role of a dialogue manager is to select proper actions based on the observed environment and the inferred user intention. This paper presents a stochastic dialogue manager based on a Markov decision process (MDP). To build a mixed-initiative dialogue manager, we used the accumulated user utterances, the previous act of the dialogue manager, and domain-dependent knowledge as the input to the MDP. We also used a dialogue corpus to train an automatically optimized MDP policy with a reinforcement learning algorithm. States with unique and intuitive actions were removed from the MDP design using domain knowledge. The dialogue manager incorporates natural language understanding and a response generator to provide short-message-based remote control of home-networked appliances.
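As an illustration of corpus-trained policy optimization for a dialogue MDP, here is a toy tabular Q-learning loop; the dialogue states, system acts, and rewards are hypothetical stand-ins for the paper's corpus and domain knowledge.

```python
# Toy Q-learning for a dialogue-manager MDP (states/acts/rewards invented).
import random
from collections import defaultdict

states  = ["greeting", "device_named", "command_given", "confirmed"]  # hypothetical dialogue states
actions = ["ask_device", "ask_command", "confirm", "execute"]         # hypothetical system acts

def simulate_turn(state, action):
    """Stand-in for replaying the dialogue corpus: returns (next_state, reward)."""
    good = {"greeting": "ask_device", "device_named": "ask_command",
            "command_given": "confirm", "confirmed": "execute"}
    if good[state] == action:
        nxt = states[min(states.index(state) + 1, len(states) - 1)]
        return nxt, (10.0 if action == "execute" else 0.0)  # reward only on task success
    return state, -1.0                                      # penalty for a wasted turn

Q = defaultdict(float)
alpha, gamma, eps = 0.5, 0.95, 0.2
for _ in range(5000):
    s = "greeting"
    for _ in range(10):                                     # cap dialogue length
        a = random.choice(actions) if random.random() < eps \
            else max(actions, key=lambda a_: Q[s, a_])
        s2, r = simulate_turn(s, a)
        Q[s, a] += alpha * (r + gamma * max(Q[s2, a_] for a_ in actions) - Q[s, a])
        s = s2

print({s: max(actions, key=lambda a_: Q[s, a_]) for s in states})  # learned dialogue policy
```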


A Joint Allocation Algorithm of Computing and Communication Resources Based on Reinforcement Learning in MEC System

  • Liu, Qinghua; Li, Qingping
    • Journal of Information Processing Systems, v.17 no.4, pp.721-736, 2021
  • For a mobile edge computing (MEC) system supporting a dense network, a joint allocation algorithm of computing and communication resources based on reinforcement learning is proposed. The energy consumption of task execution is defined as the maximum energy any single user's task consumes in the system. Considering the constraints of task offloading, power allocation, transmission rate, and computing resource allocation, the joint task offloading and resource allocation problem is modeled as minimizing this maximum task-execution energy. As a mixed-integer nonlinear program, it is difficult to solve directly with traditional optimization methods, so a reinforcement learning algorithm is used instead. The Markov decision process and the theoretical basis of reinforcement learning are then introduced to ground the simulation experiments. Based on the reinforcement learning algorithm and the joint allocation of communication resources, the data-task offloading and power control strategy is jointly optimized for each terminal device, and local computing and task offloading models are built. The simulation results show that the total task computation cost of the proposed algorithm is 5%-10% lower than that of the two comparison algorithms under the same task input, and more than 5% lower than that of the two newer comparison algorithms.
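The sketch below conveys the flavor of the learning step as a contextual-bandit simplification: per channel state, it learns whether local execution or offloading consumes less energy. All constants (task size, transmit power, channel rates) are invented for illustration and do not come from the paper.

```python
# Contextual-bandit simplification of the offloading decision (constants invented).
import random

K, F_LOCAL = 1e-27, 1.0e9                 # effective switched capacitance, local CPU freq (Hz)
CYCLES, BITS, P_TX = 1.0e9, 1.0e6, 0.5    # hypothetical task size and transmit power (W)

def energy(action, rate_bps):
    if action == "local":
        return K * F_LOCAL**2 * CYCLES    # E = k * f^2 * C for local execution
    return P_TX * BITS / rate_bps         # E = p * (bits / rate) when offloading

rates = {"bad": 1e5, "good": 1e7}         # two channel states (bit/s)
actions = ("local", "offload")
Q = {(s, a): 0.0 for s in rates for a in actions}
alpha, eps = 0.1, 0.2
for _ in range(20000):
    s = random.choice(list(rates))        # channel state observed this step
    a = random.choice(actions) if random.random() < eps \
        else max(actions, key=lambda a_: Q[(s, a_)])
    r = -energy(a, rates[s])              # reward = negative energy consumption
    Q[(s, a)] += alpha * (r - Q[(s, a)])  # bandit-style incremental update

print({s: max(actions, key=lambda a_: Q[(s, a_)]) for s in rates})
```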

Parcel Locker Locations and Dynamic Vehicle Routing Problem with Traffic Congestion (교통 체증을 고려한 물품 보관함 위치 및 동적 차량 경로 문제)

  • Chaehyun Kim; Gitae Kim
    • Journal of Korean Society of Industrial and Systems Engineering, v.47 no.2, pp.168-175, 2024
  • Due to the complexity of urban areas, the city vehicle routing problem is difficult: parking availability, road conditions, and traffic congestion all increase transportation costs and delivery times. One effective remedy is the use of parcel lockers located near customer sites, where products are stored for customers to pick up. When a vehicle delivers products to a designated parcel locker, customers in the vicinity pick up their products from that locker. Identifying optimal locations for these parcel lockers has recently become an important research issue. This paper addresses the parcel locker location problem in the context of urban traffic congestion. Considering dynamic environmental factors, we propose a Markov decision process model for the city vehicle routing problem. To better reflect real situations, we use optimal paths for the distances between nodes. Numerical results demonstrate the viability of our model and solution strategy.
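The "optimal paths between nodes" step can be illustrated with Dijkstra's algorithm over a road graph whose edge weights are travel times inflated by congestion factors; the graph, node names, and multipliers below are hypothetical.

```python
# Dijkstra shortest paths with congestion-weighted travel times (graph invented).
import heapq

def dijkstra(adj, src):
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue                              # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

base = {("depot", "locker1"): 4.0, ("locker1", "locker2"): 3.0, ("depot", "locker2"): 9.0}
congestion = {("depot", "locker2"): 1.8}          # time-of-day multiplier on a congested arc
adj = {}
for (u, v), w in base.items():
    w *= congestion.get((u, v), 1.0)
    adj.setdefault(u, []).append((v, w))          # undirected road segments
    adj.setdefault(v, []).append((u, w))

print(dijkstra(adj, "depot"))                     # travel times usable as MDP transition costs
```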

Determination of Ship Collision Avoidance Path using Deep Deterministic Policy Gradient Algorithm (심층 결정론적 정책 경사법을 이용한 선박 충돌 회피 경로 결정)

  • Kim, Dong-Ham; Lee, Sung-Uk; Nam, Jong-Ho; Furukawa, Yoshitaka
    • Journal of the Society of Naval Architects of Korea, v.56 no.1, pp.58-65, 2019
  • The stability, reliability, and efficiency of a smart ship are important issues, as interest in autonomous ships has recently been high. An automatic collision avoidance system is an essential function of an autonomous ship: it detects the possibility of collision and automatically takes avoidance actions that balance economy and safety. To construct an automatic collision avoidance system using reinforcement learning, this work mathematically formulates the sequential decision problem of ship collision as a Markov decision process (MDP). A reinforcement learning environment is constructed from the ship maneuvering equations, and the three key components of the MDP (state, action, and reward) are defined. The state uses parameters of the relationship between own ship and target ship, the action is the perpendicular distance away from the target course, and the reward is a function considering safety and economy. To solve the sequential decision problem, the Deep Deterministic Policy Gradient (DDPG) algorithm, which handles continuous action spaces and searches for an optimal action policy, is used. The collision avoidance system is then tested in an assumed 90° crossing encounter situation and yields satisfactory results.
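A hedged sketch of the reward shaping described above, trading safety (keeping clear of the target ship) against economy (staying near the planned course); the weights, threshold, and function form are invented, and the paper's exact reward may differ.

```python
# Illustrative safety-vs-economy reward for collision avoidance (parameters invented).
import math

def reward(own_xy, target_xy, lateral_offset,
           safe_dist=1000.0, w_safety=1.0, w_economy=0.01):
    d = math.dist(own_xy, target_xy)               # range to the target ship (m)
    safety = -w_safety if d < safe_dist else 0.0   # penalize getting inside the safety range
    economy = -w_economy * abs(lateral_offset)     # penalize deviation from planned track (m)
    return safety + economy

# Example: 800 m from the target while 200 m off the planned track.
print(reward((0.0, 0.0), (800.0, 0.0), 200.0))     # -1.0 - 2.0 = -3.0
```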

Machine Maintenance Policy Using Partially Observable Markov Decision Process

  • Pak, Pyoung Ki; Kim, Dong Won; Jeong, Byung Ho
    • Journal of Korean Society for Quality Management, v.16 no.2, pp.1-9, 1988
  • This paper considers a machine maintenance problem in which the machine's condition is only partially known, by observing the machine's output products. The problem is formulated as an infinite-horizon partially observable Markov decision process to find an optimal maintenance policy. However, even though an optimal policy exists for this model, finding it is very time consuming. The intent of this study is therefore to find an ε-optimal stationary policy minimizing the expected discounted total cost of the system. The ε-optimal policy is found using a modified version of the well-known policy iteration algorithm. A numerical example is also given.
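The paper uses a modified policy iteration; as a closely related stand-in, the sketch below applies the classical value-iteration stopping rule that certifies an ε-optimal policy for a discounted-cost MDP: stop when the update is below ε(1-γ)/(2γ). The two-state machine model (good/worn) is invented for illustration.

```python
# eps-optimal policy for a tiny discounted-cost maintenance MDP (model invented).
import numpy as np

gamma, eps = 0.9, 1e-3
# states: 0 = good, 1 = worn; actions: 0 = run, 1 = repair
P = np.array([[[0.8, 0.2], [1.0, 0.0]],   # P[s, a, s']: deterioration / repair dynamics
              [[0.0, 1.0], [1.0, 0.0]]])
C = np.array([[0.0, 5.0],                 # C[s, a]: running a good machine is free,
              [4.0, 5.0]])                # running a worn one is costly, repair costs 5

V = np.zeros(2)
while True:
    Q = C + gamma * (P @ V)               # Bellman backup (cost minimization)
    V_new = Q.min(axis=1)
    if np.max(np.abs(V_new - V)) < eps * (1 - gamma) / (2 * gamma):
        break                             # greedy policy is now eps-optimal
    V = V_new

print("eps-optimal policy (0=run, 1=repair):", Q.argmin(axis=1))
```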


Hierarchical Power Management Architecture and Optimal Local Control Policy for Energy Efficient Networks

  • Wei, Yifei; Wang, Xiaojun; Fialho, Leonardo; Bruschi, Roberto; Ormond, Olga; Collier, Martin
    • Journal of Communications and Networks, v.18 no.4, pp.540-550, 2016
  • Since energy efficiency has become a significant concern for network infrastructure, next-generation network devices are expected to have embedded advanced power management capabilities. However, how to effectively exploit these green capabilities is still a big challenge, especially given the high heterogeneity of devices and their internal architectures. In this paper, we introduce a hierarchical power management architecture (HPMA) which represents physical components whose power can be monitored and controlled at various levels of a device as entities. We use the energy-aware state (EAS) as the power management setting mode of each device entity. The power policy controller can discover how many EASes of an entity are manageable inside a device and set a certain EAS configuration for the entity. We propose an optimal local control policy which aims to minimize router power consumption while meeting performance constraints. A first-order Markov chain is used to model the statistical features of the network traffic load. The dynamic EAS configuration problem is formulated as a Markov decision process and solved using a dynamic programming algorithm. In addition, we demonstrate a reference implementation of the HPMA and EAS concepts in a NetFPGA frequency-scaled router, which can toggle among five operating frequency options and/or turn off unused Ethernet ports.
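A minimal sketch of the dynamic EAS-configuration MDP: the traffic load follows a first-order Markov chain, the action selects a frequency setting, and the cost is power plus a penalty for unserved load. The power levels, capacities, and transition matrix are invented for illustration.

```python
# Value iteration for an EAS (frequency) configuration MDP (numbers invented).
import numpy as np

loads = np.array([0.2, 0.5, 0.9])          # normalized traffic levels (states)
T     = np.array([[0.7, 0.3, 0.0],         # first-order Markov traffic model
                  [0.2, 0.6, 0.2],
                  [0.0, 0.4, 0.6]])
freqs   = np.array([0.25, 0.5, 1.0])       # normalized capacity per EAS (actions)
power   = np.array([1.0, 2.0, 4.0])        # watts per setting (illustrative)
penalty = 50.0                             # cost per unit of unserved traffic
gamma   = 0.95

# Immediate cost C[load, EAS] = power + performance-violation penalty.
C = power[None, :] + penalty * np.maximum(loads[:, None] - freqs[None, :], 0.0)
V = np.zeros(len(loads))
for _ in range(2000):                      # traffic evolves independently of the action
    V = (C + gamma * (T @ V)[:, None]).min(axis=1)
Q = C + gamma * (T @ V)[:, None]

print("EAS index to select per traffic level:", Q.argmin(axis=1))
```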

Machine Diagnosis and Maintenance Policy Generation Using Adaptive Decision Tree and Shortest Path Problem (적응형 의사결정 트리와 최단 경로법을 이용한 기계 진단 및 보전 정책 수립)

  • 백준걸
    • Journal of the Korean Operations Research and Management Science Society, v.27 no.2, pp.33-49, 2002
  • CBM (condition-based maintenance) has drawn increasing attention in industry because of its many benefits. The CBM problem is characterized as a state-dependent scheduling model that demands simultaneous maintenance actions, one for each attribute that influences machine condition. This problem is very hard to solve within the conventional Markov decision process framework. In this paper, we present an intelligent machine maintenance scheduler, for which a new incremental decision tree learning method is developed as the evolutionary system identification model and a shortest path problem as the schedule generation model. Although our approach does not guarantee a mathematically optimal scheduling policy, we verified through simulation-based experiments that the intelligent scheduler provides good scheduling policies usable in practice.
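The schedule-generation step can be sketched as a shortest path over a staged graph whose nodes are (period, condition) pairs and whose edges are maintenance actions; the costs and deterioration below are invented, and in the paper the decision tree would supply the condition predictions.

```python
# Maintenance scheduling as a shortest path over a staged graph (model invented).
T = 4                                             # planning periods
conditions = ("good", "degraded")
ACTIONS = {                                       # action -> (cost, next condition)
    "good":     {"run": (1.0, "degraded"), "maintain": (3.0, "good")},
    "degraded": {"run": (6.0, "degraded"), "maintain": (3.0, "good")},
}

# Backward DP: shortest path from every (t, condition) node to the horizon.
best = {(T, c): (0.0, []) for c in conditions}
for t in range(T - 1, -1, -1):
    for c in conditions:
        options = []
        for a, (cost, nxt) in ACTIONS[c].items():
            tail_cost, tail_plan = best[(t + 1, nxt)]
            options.append((cost + tail_cost, [a] + tail_plan))
        best[(t, c)] = min(options)               # min-cost action sequence from here

cost, plan = best[(0, "good")]
print(cost, plan)                                 # e.g. alternate run / maintain
```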

On the Analysis of DS/CDMA Multi-hop Packet Radio Network with Auxiliary Markov Transient Matrix (보조 Markov 천이행렬을 이용한 DS/CDMA 다중도약 패킷무선망 분석)

  • 이정재
    • The Journal of Korean Institute of Communications and Information Sciences, v.19 no.5, pp.805-814, 1994
  • In this paper, we introduce a new method for analyzing the throughput of a packet radio network using an auxiliary Markov transient matrix with a failure state and a success state, and we consider the effect of symbol errors on the network state (X, R), which consists of the number of transmitting PRUs X and receiving PRUs R. We model the packet radio network as a continuous-time Markov chain and the radio channel as direct-sequence BPSK CDMA with hard-decision Viterbi decoding and a bit-by-bit changing spreading code. For the unslotted, distributed, multi-hop packet radio network, we assume that packet errors due to channel symbol errors follow a Poisson process and that the time between error occurrences is exponentially distributed. The throughput is obtained as a function of radio channel parameters, such as the received signal-to-noise ratio and the number of chips of spreading code per symbol, and of network parameters, such as the number of PRUs and the offered traffic rate; this composite analysis shows how the Markovian packet radio network model can be combined with a coded DS/BPSK CDMA radio channel.
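The auxiliary-matrix idea can be sketched as an absorbing Markov chain: augment the transient (X, R) states with absorbing success and failure states and solve B = (I - Q)^-1 R for the absorption probabilities. The matrices below are invented for illustration.

```python
# Absorption probabilities of an augmented (transient + success/failure) chain.
import numpy as np

# Two transient network states; columns of R are [success, failure].
Q = np.array([[0.5, 0.2],          # transient -> transient (per packet slot)
              [0.1, 0.4]])
R = np.array([[0.25, 0.05],        # transient -> {success, failure}
              [0.30, 0.20]])
assert np.allclose(Q.sum(1) + R.sum(1), 1.0)   # rows form a stochastic matrix

B = np.linalg.solve(np.eye(2) - Q, R)          # B = (I - Q)^-1 R
print("P(success | start state):", B[:, 0])    # feeds the throughput computation
```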


Optimal LNG Procurement Policy in a Spot Market Using Dynamic Programming (동적 계획법을 이용한 LNG 현물시장에서의 포트폴리오 구성방법)

  • Ryu, Jong-Hyun
    • Journal of Korean Institute of Industrial Engineers, v.41 no.3, pp.259-266, 2015
  • Among energy resources, natural gas has recently received remarkable attention, particularly from the electrical generation industry, in part due to increasing shale gas production, its role as an environment-friendly fossil fuel, and the high risk of nuclear power. Because South Korea, the world's second largest LNG importer after Japan, has no international natural gas pipelines and relies on imports in the form of LNG, natural gas has traditionally been procured through long-term LNG contracts at relatively high prices. There is thus a need to develop an Asian LNG trading hub where LNG can be traded at more competitive spot prices. In a natural gas spot market, the amount of natural gas to buy must be carefully determined in view of limited storage capacity and future price dynamics. In this work, the problem of finding the optimal amount of natural gas to buy in a spot market is formulated as a Markov decision process (MDP) in a risk-neutral environment, and an optimal basestock policy, which depends on the stage and price, is established. Taking price and demand uncertainties into account, the basestock target levels are approximated by dynamic programming. The simulation results show that the basestock policy can be an effective way to procure LNG in a spot market.
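A compact sketch of the basestock computation: backward induction over (stage, price state, inventory), with prices following a two-state Markov chain and a fixed per-stage demand. The horizon T, capacity CAP, demand, prices, and transition matrix are all invented for illustration.

```python
# Stage- and price-dependent order-up-to (basestock) levels via backward induction.
import numpy as np

T, CAP, DEMAND = 4, 3, 1                       # stages, storage capacity, demand per stage
prices  = np.array([6.0, 10.0])                # low / high spot price (illustrative)
P_price = np.array([[0.7, 0.3],                # price-state transition matrix
                    [0.4, 0.6]])

V = np.zeros((len(prices), CAP + 1))           # terminal value = 0
base_stock = np.zeros((T, len(prices)), dtype=int)
for t in range(T - 1, -1, -1):
    V_new = np.zeros_like(V)
    for p in range(len(prices)):
        for inv in range(CAP + 1):
            best_cost, best_level = np.inf, DEMAND
            for up_to in range(max(inv, DEMAND), CAP + 1):   # order up to a target level
                buy  = up_to - inv
                cost = prices[p] * buy + P_price[p] @ V[:, up_to - DEMAND]
                if cost < best_cost:
                    best_cost, best_level = cost, up_to
            V_new[p, inv] = best_cost
            if inv == 0:
                base_stock[t, p] = best_level  # record the basestock target
    V = V_new

print(base_stock)   # order-up-to level for each (stage, price state)
```

Buying up to capacity in the low-price state and only covering demand in the high-price state is the behavior this kind of policy typically exhibits.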