• Title/Summary/Keyword: Markov game

Non-Cooperative Game Joint Hidden Markov Model for Spectrum Allocation in Cognitive Radio Networks

  • Jiao, Yan
    • International journal of advanced smart convergence, v.7 no.1, pp.15-23, 2018
  • Spectrum allocation is a key operation in cognitive radio networks (CRNs), where secondary users (SUs) are usually selfish, each seeking to maximize its own utility. In view of this, much prior literature has proposed spectrum allocation based on non-cooperative game models. However, most of these models assume complete information about the CRN. In practice, complete information about primary users (PUs) is difficult to obtain in a dynamic wireless environment with noise uncertainty, shadowing, and fading. In this paper, we propose a non-cooperative game scheme joined with a hidden Markov model for spectrum allocation in CRNs. First, we propose a new hidden Markov model with which SUs predict the sensing results of their competitors. Then, we introduce the proposed hidden Markov model into the non-cooperative game: the sensing results of competitors are predicted before the game is played. The simulation results show that the proposed scheme improves the energy efficiency of the network and the utilization of SUs.
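
As a rough illustration of the prediction step described in this abstract, the sketch below runs the standard HMM forward algorithm to infer a competitor's hidden channel state from a sequence of observed sensing reports. All states, observations, and probabilities are illustrative assumptions, not the paper's model parameters.

```python
# Minimal HMM forward-algorithm sketch: predict a competitor's hidden
# channel state (idle/busy) from its observed sensing reports.
# All probabilities below are illustrative assumptions.

def hmm_forward(obs, pi, A, B):
    """Return P(hidden state | observations) after the last observation."""
    n = len(pi)
    alpha = [pi[s] * B[s][obs[0]] for s in range(n)]
    for o in obs[1:]:
        alpha = [sum(alpha[p] * A[p][s] for p in range(n)) * B[s][o]
                 for s in range(n)]
    total = sum(alpha)
    return [a / total for a in alpha]

# States: 0 = channel idle, 1 = channel busy; observations: 0 = "free", 1 = "occupied".
pi = [0.6, 0.4]                      # initial state distribution
A  = [[0.8, 0.2], [0.3, 0.7]]        # state transition probabilities
B  = [[0.9, 0.1], [0.2, 0.8]]        # emission (sensing-report) probabilities

posterior = hmm_forward([0, 0, 1], pi, A, B)
print(posterior)  # belief over the competitor's current channel state
```

An SU could feed such a posterior into its utility calculation before playing the non-cooperative game, as the abstract suggests.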

Network Security Situation Assessment Method Based on Markov Game Model

  • Li, Xi;Lu, Yu;Liu, Sen;Nie, Wei
    • KSII Transactions on Internet and Information Systems (TIIS), v.12 no.5, pp.2414-2428, 2018
  • To address the problem that current network security situation assessment methods focus only on attack behaviors, this paper proposes a network security situation assessment method based on the Markov Decision Process and game theory. The method takes a Markov game model as its core and uses four-level data fusion to evaluate the network security situation. In this process, the Nash equilibrium point of the game is used to determine the impact on network security. Experiments show that the results of this method are basically consistent with expert evaluation data. As the method takes full account of the interaction between attackers and defenders, it is closer to reality and can accurately assess the network security situation.
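
A Nash equilibrium of an attack-defense stage game of the kind this abstract relies on can be computed in closed form for the 2x2 zero-sum case. The payoff entries below are hypothetical impact values, not the paper's data.

```python
# Mixed-strategy equilibrium of a 2x2 zero-sum attack-defense stage game.
# Entries are the attacker's (row) gain = defender's loss; values are illustrative.

def solve_2x2(M):
    """Return (p_row, q_col, value) for a 2x2 zero-sum game (row maximizes)."""
    (a, b), (c, d) = M
    # Pure-strategy (saddle point) equilibrium check first.
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        return None, None, maximin   # pure equilibrium; value = maximin
    denom = a - b - c + d
    p = (d - c) / denom              # prob. attacker plays row 0
    q = (d - b) / denom              # prob. defender plays column 0
    v = (a * d - b * c) / denom      # value of the game
    return p, q, v

M = [[3, -1], [-2, 2]]               # hypothetical impact values
p, q, v = solve_2x2(M)
print(p, q, v)
```

The game value `v` is what a Markov-game assessment would feed into its data fusion as the per-state security impact.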

The Ramp-Rate Constraint Effects on the Generators' Equilibrium Strategy in Electricity Markets

  • Joung, Man-Ho;Kim, Jin-Ho
    • Journal of Electrical Engineering and Technology, v.3 no.4, pp.509-513, 2008
  • In this paper, we investigate how generators' ramp-rate constraints may influence their equilibrium strategy formulation. In the market model proposed in this study, the generators' ramp-rate constraints are explicitly represented. In order to fully characterize the inter-temporal nature of the ramp-rate constraints, a dynamic game model is presented. The subgame perfect Nash equilibrium is adopted as the solution of the game, and a backward induction procedure for computing it is designed. The inter-temporal nature of the ramp-rate constraints gives the game the Markov property, and we find that this Markov property significantly simplifies the characterization of the subgame perfect Nash equilibrium. Finally, a simple electricity market example is presented to illustrate the successful application of the proposed approach.
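
The backward-induction idea can be sketched on a toy two-period duopoly: the period-1 outputs are the Markov state, and each period-2 subgame is solved first, then folded back. The demand function, output grid, ramp limit, and start state below are all illustrative assumptions, not the paper's market model.

```python
# Backward-induction sketch for a two-period duopoly with ramp-rate constraints.
# The state carried between periods is just the previous output pair (Markov property).

LEVELS = range(4)                  # admissible output levels 0..3 (illustrative)
RAMP = 1                           # max change in output between periods

def profit(qi, qj):
    """Per-period profit of a generator: price * own output, price = 10 - total."""
    return (10 - qi - qj) * qi

def feasible(prev):
    return [q for q in LEVELS if abs(q - prev) <= RAMP]

def stage_equilibrium(f1, f2, cont=lambda a, b: (0, 0)):
    """First pure Nash equilibrium of the stage game plus continuation values."""
    def total(q1, q2):
        c1, c2 = cont(q1, q2)
        return profit(q1, q2) + c1, profit(q2, q1) + c2
    for q1 in f1:
        for q2 in f2:
            u1, u2 = total(q1, q2)
            if all(total(d, q2)[0] <= u1 for d in f1) and \
               all(total(q1, d)[1] <= u2 for d in f2):
                return (q1, q2), (u1, u2)
    raise ValueError("no pure-strategy equilibrium in this subgame")

def subgame_perfect_path(start1, start2):
    # Period 2: equilibrium of each subgame, indexed by period-1 outputs.
    def cont(a, b):
        _, vals = stage_equilibrium(feasible(a), feasible(b))
        return vals
    # Period 1: fold the period-2 equilibrium values back (backward induction).
    (q1, q2), _ = stage_equilibrium(feasible(start1), feasible(start2), cont)
    (r1, r2), _ = stage_equilibrium(feasible(q1), feasible(q2))
    return [(q1, q2), (r1, r2)]

print(subgame_perfect_path(2, 2))  # equilibrium output path from state (2, 2)
```

Because only the previous output pair matters, each subgame is solved once per state rather than once per history, which is the simplification the Markov property buys.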

STOPPING TIMES IN THE GAME ROCK-PAPER-SCISSORS

  • Jeong, Kyeonghoon;Yoo, Hyun Jae
    • Bulletin of the Korean Mathematical Society, v.56 no.6, pp.1497-1510, 2019
  • In this paper we compute the stopping times in the game Rock-Paper-Scissors. By exploiting a recurrence relation we compute the mean values of the stopping times. On the other hand, by constructing a transition matrix for a Markov chain associated with the game, we also obtain the distribution of the stopping times, from which we compute the mean stopping times again. Then we show that the mean stopping times increase exponentially fast as the number of participants increases.
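
The transition-matrix route can be sketched for the 3-player game with uniform random play: the chain tracks the number of remaining players, and the mean stopping time is the expected absorption time. For 3 uniform players, a round is a full tie (all same or all different) with probability 1/3, eliminates one player with probability 1/3, and ends the game with probability 1/3; with 2 players, a tie has probability 1/3.

```python
# Mean stopping times for multi-player Rock-Paper-Scissors via an absorbing
# Markov chain on the number of remaining players (3-player uniform case).

from fractions import Fraction as F

# Transient states: "3 players left", "2 players left".
Q = [[F(1, 3), F(1, 3)],
     [F(0),    F(1, 3)]]

# Expected steps to absorption: t = (I - Q)^{-1} 1, solved for the 2x2 system
# using the closed-form inverse [[a, b], [c, d]]^{-1} = [[d, -b], [-c, a]] / det.
a, b = 1 - Q[0][0], -Q[0][1]
c, d = -Q[1][0], 1 - Q[1][1]
det = a * d - b * c
t3 = (d - b) / det   # first row of (I - Q)^{-1} times the all-ones vector
t2 = (a - c) / det

print(t3, t2)  # mean rounds until a single winner remains: 9/4 and 3/2
```

Starting from 3 players the mean stopping time is 9/4 rounds; from 2 players it is the geometric mean 3/2, consistent with the recurrence approach.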

Optimal Network Defense Strategy Selection Based on Markov Bayesian Game

  • Wang, Zengguang;Lu, Yu;Li, Xi;Nie, Wei
    • KSII Transactions on Internet and Information Systems (TIIS), v.13 no.11, pp.5631-5652, 2019
  • Existing defense strategy selection methods based on game theory generally select the optimal defense strategy in the form of a mixed strategy. However, it is hard for network managers to understand and implement a defense strategy in this form. To address this problem, we construct an incomplete-information stochastic game model for dynamic analysis to predict the multi-stage attack-defense process, combining Bayesian game theory and the Markov decision-making method. In addition, the payoffs are quantified from the impact values of attack-defense actions. Based on this model, we design an optimal defense strategy selection method that uses defense effectiveness as its criterion. The proposed method is verified via a representative experiment. Compared to classical strategy selection methods based on game theory, the proposed method can select the optimal strategy of the multi-stage attack-defense process in the form of a pure strategy, which proves more operable than the compared methods.
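
One simple way to get a pure (rather than mixed) recommendation, in the spirit of this abstract, is a maximin rule: pick the single defense whose worst-case effectiveness over all attacks is best. The strategy names and effectiveness numbers below are hypothetical stand-ins for the paper's quantified impact values.

```python
# Pure-strategy defense selection sketch: choose the defense with the best
# worst-case effectiveness (maximin). All names and numbers are illustrative.

def best_pure_defense(effectiveness):
    """effectiveness[d][a] = effectiveness of defense d against attack a."""
    worst = {d: min(row) for d, row in effectiveness.items()}
    return max(worst, key=worst.get)

effectiveness = {
    "patch_and_monitor": [40, 25, 30],   # vs. attacks a1, a2, a3
    "isolate_subnet":    [35, 35, 20],
    "honeypot_redirect": [50, 10, 45],
}
print(best_pure_defense(effectiveness))
```

A network manager can read such an answer directly as "deploy this one defense", which is the operability advantage the abstract claims for pure strategies.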

Markov Decision Process for Curling Strategies (MDP에 의한 컬링 전략 선정)

  • Bae, Kiwook;Park, Dong Hyun;Kim, Dong Hyun;Shin, Hayong
    • Journal of Korean Institute of Industrial Engineers, v.42 no.1, pp.65-72, 2016
  • Curling is often compared to chess because of the variety and importance of its strategies. To win a curling game, selecting optimal strategies at decision-making points is important. However, there is a lack of research on optimal strategies for curling. 'Aggressive' and 'Conservative' strategies are common in curling; nevertheless, even these two strategies have not been studied before. In this study, a Markov Decision Process is applied to curling strategy analysis, with the two strategies defined as its actions. By solving the model, the optimal strategy can be found in any in-game state.
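
The kind of model this abstract describes can be sketched as a tiny MDP with the two common strategies as actions and game situations as states, solved by value iteration. The states, transition probabilities, and rewards below are illustrative assumptions, not the paper's fitted model.

```python
# Value-iteration sketch for a toy curling MDP with the two common strategies.
# All states, probabilities, and rewards are illustrative.

P = {  # P[state][action] = list of (next_state, probability)
    "behind": {"aggressive":   [("tied", 0.5), ("behind", 0.5)],
               "conservative": [("tied", 0.2), ("behind", 0.8)]},
    "tied":   {"aggressive":   [("ahead", 0.4), ("behind", 0.6)],
               "conservative": [("ahead", 0.2), ("tied", 0.8)]},
    "ahead":  {"aggressive":   [("ahead", 0.6), ("tied", 0.4)],
               "conservative": [("ahead", 0.8), ("tied", 0.2)]},
}
R = {"behind": -1.0, "tied": 0.0, "ahead": 1.0}  # reward of occupying a state
GAMMA = 0.9

V = {s: 0.0 for s in P}
for _ in range(500):  # value iteration to (near) convergence
    V = {s: R[s] + GAMMA * max(sum(p * V[s2] for s2, p in outcomes)
                               for outcomes in P[s].values())
         for s in P}
policy = {s: max(P[s], key=lambda a: sum(p * V[s2] for s2, p in P[s][a]))
          for s in P}
print(policy)
```

Under these made-up numbers the optimal policy is aggressive when behind and conservative otherwise, illustrating how the solved MDP yields a strategy for every in-game state.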

A Markov Game based QoS Control Scheme for the Next Generation Internet of Things (미래 사물인터넷을 위한 마르코프 게임 기반의 QoS 제어 기법)

  • Kim, Sungwook
    • Journal of KIISE, v.42 no.11, pp.1423-1429, 2015
  • The Internet of Things (IoT) is a new concept associated with the future Internet, and it has recently become a popular way to build a dynamic, global network infrastructure. However, the deployment of IoT creates difficulties in satisfying different Quality of Service (QoS) requirements and achieving rapid service composition and deployment. In this paper, we propose a new QoS control scheme for IoT systems. A Markov game model is applied in our proposed scheme to effectively allocate IoT resources while maximizing system performance. The results of our study are validated by running a simulation to prove that the proposed scheme can promptly evaluate current IoT situations and select the best action. Thus, our scheme approximates the optimum system performance.

Some Recent Results of Approximation Algorithms for Markov Games and their Applications

  • 장형수
    • Proceedings of the Korean Society of Computational and Applied Mathematics Conference, 2003.09a, pp.15-15, 2003
  • We provide some recent results of approximation algorithms for solving Markov games and discuss their applications to problems that arise in computer science. We consider a receding horizon approach as an approximate solution to two-person zero-sum Markov games with an infinite-horizon discounted cost criterion. We present error bounds from the optimal equilibrium value of the game when both players take “correlated” receding horizon policies that are based on exact or approximate solutions of receding finite-horizon subgames. Motivated by the worst-case optimal control of queueing systems by Altman, we then analyze error bounds when the minimizer plays the (approximate) receding horizon control and the maximizer plays the worst-case policy. We give two heuristic examples of the approximate receding horizon control. We extend “parallel rollout” and “hindsight optimization” into the Markov game setting within the framework of the approximate receding horizon approach and analyze their performance. In the parallel rollout approach, the minimizing player seeks to dynamically combine multiple heuristic policies in a set, so as to improve the performance of all of the heuristic policies simultaneously, under the guess that the maximizing player has chosen a fixed worst-case policy. Given $\varepsilon$ > 0, we give the value of the receding horizon which guarantees that the parallel rollout policy with that horizon, played by the minimizer, “dominates” any heuristic policy in the set by $\varepsilon$. In the hindsight optimization approach, the minimizing player makes a decision based on his expected optimal hindsight performance over a finite horizon. We finally discuss practical implementations of the receding horizon approaches via simulation and applications.

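The receding-horizon idea can be sketched on a toy two-state zero-sum Markov game: compute the H-step value by backward recursion and observe that the gap to the long-horizon value shrinks geometrically in H (factor gamma per step), which is the shape of the error bounds discussed above. For simplicity this sketch solves each stage with pure maximin rather than the mixed-strategy equilibrium value used in the paper's setting; the payoffs and transitions are illustrative.

```python
# Receding-horizon sketch for a two-state zero-sum Markov game.
# Each stage is solved with pure maximin (a simplification; the general
# theory uses mixed-strategy equilibrium values). All numbers are illustrative.

GAMMA = 0.8
R = [[[3, 1], [2, 0]],        # stage payoffs in state 0 (maximizer's reward)
     [[0, 2], [1, 4]]]        # stage payoffs in state 1
NEXT = [[[0, 1], [1, 0]],     # deterministic next state for each joint action
        [[1, 0], [0, 1]]]

def horizon_value(H):
    """V_H(s): H-step receding-horizon value from each state (V_0 = 0)."""
    V = [0.0, 0.0]
    for _ in range(H):
        V = [max(min(R[s][i][j] + GAMMA * V[NEXT[s][i][j]] for j in (0, 1))
                 for i in (0, 1))
             for s in (0, 1)]
    return V

gaps = [abs(horizon_value(H + 1)[0] - horizon_value(H)[0]) for H in range(12)]
print(horizon_value(12), gaps[-1])  # successive gaps decay roughly like GAMMA**H
```

Picking H so that the geometric tail is below a target $\varepsilon$ is exactly the kind of horizon-selection guarantee the abstract describes for the parallel rollout policy.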

A redistribution model for spatially dependent Parrondo games (공간의존 파론도 게임의 재분배 모형)

  • Lee, Jiyeon
    • Journal of the Korean Data and Information Science Society, v.27 no.1, pp.121-130, 2016
  • An ensemble of N players arranged in a circle plays a spatially dependent Parrondo game B. One player is randomly selected to play game B, which is based on the toss of a biased coin, with the amount of the bias depending on the states of the selected player's two nearest neighbors. The player wins one unit with heads and loses one unit with tails. In game A' the randomly chosen player transfers one unit of capital to another player chosen at random among the other N - 1 players. Game A' is fair with respect to the ensemble's total profit. The games are said to exhibit the Parrondo effect if game B is losing and the random mixture game C is winning, and the reverse-Parrondo effect if game B is winning and the random mixture game C is losing. We compute the exact mean profits for games B and C by applying a state-space reduction method with lumped Markov chains, and we sketch the Parrondo and reverse-Parrondo regions for $3{\leq}N{\leq}6$.
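
The same "losing game + fair mixing = winning game" effect can be computed exactly for the classic capital-dependent Parrondo games (used here for illustration in place of the paper's spatially dependent variant): solve the stationary distribution of the capital-mod-3 Markov chain and read off the mean profit per step.

```python
# Exact per-step expected profit for the classic capital-dependent Parrondo
# games. Game B's win probability depends on capital mod 3; game C plays
# A or B with equal probability each step. Parameters are the textbook ones.

EPS = 0.005
P_A = 0.5 - EPS                                # game A win probability
P_B = [0.1 - EPS, 0.75 - EPS, 0.75 - EPS]      # game B, indexed by capital mod 3

def profit_rate(p):
    """Long-run expected gain per step when the win prob. depends on capital mod 3."""
    pi = [1 / 3] * 3
    for _ in range(10000):  # power iteration for the stationary distribution
        pi = [pi[(s - 1) % 3] * p[(s - 1) % 3] +
              pi[(s + 1) % 3] * (1 - p[(s + 1) % 3])
              for s in range(3)]
    return sum(pi[s] * (2 * p[s] - 1) for s in range(3))

rate_B = profit_rate(P_B)
rate_C = profit_rate([(P_A + b) / 2 for b in P_B])  # random mixture of A and B
print(rate_B, rate_C)  # B alone loses, yet the random mixture C wins
```

The paper's lumped-chain computation is the same idea on a larger state space: reduce the ensemble's configuration chain to a tractable lumped chain, then take stationary expectations.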

Deep Q-Network based Game Agents (심층 큐 신경망을 이용한 게임 에이전트 구현)

  • Han, Dongki;Kim, Myeongseop;Kim, Jaeyoun;Kim, Jung-Su
    • The Journal of Korea Robotics Society, v.14 no.3, pp.157-162, 2019
  • The video game Tetris is one of the most popular games, and it is well known that its rules can be modelled as an MDP (Markov Decision Process). This paper presents a DQN (Deep Q-Network) based game agent for Tetris. To this end, the state is defined as the captured image of the Tetris game board, and the reward is designed as a function of the lines cleared by the game agent. The actions are defined as left, right, rotate, drop, and a finite number of their combinations. In addition, PER (Prioritized Experience Replay) is employed in order to enhance learning performance. To train the network, more than 500,000 episodes are used. The game agent employs the trained network to make decisions. The performance of the developed algorithm is validated not only via simulation but also via a real Tetris robot agent made of a camera, two Arduinos, four servo motors, and 3D-printed artificial fingers.
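
The PER component mentioned in this abstract can be sketched as a replay buffer that samples transitions in proportion to their TD error. The capacity, the alpha exponent, and the list-based storage are illustrative simplifications (production implementations use a sum-tree for efficiency), not the paper's implementation.

```python
# Minimal sketch of a prioritized experience replay (PER) buffer: transitions
# with larger TD error are sampled more often. Parameters are illustrative.

import random

class PrioritizedReplayBuffer:
    def __init__(self, capacity=10000, alpha=0.6, eps=1e-3):
        self.capacity, self.alpha, self.eps = capacity, alpha, eps
        self.data, self.priorities = [], []

    def push(self, transition, td_error):
        if len(self.data) >= self.capacity:   # drop the oldest transition
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append((abs(td_error) + self.eps) ** self.alpha)

    def sample(self, batch_size):
        # Sample proportionally to priority (a sum-tree replaces this in
        # practice; random.choices is enough for a sketch).
        return random.choices(self.data, weights=self.priorities, k=batch_size)

buf = PrioritizedReplayBuffer()
buf.push(("s0", "left", 0.0, "s1"), td_error=0.1)
buf.push(("s1", "drop", 1.0, "s2"), td_error=5.0)  # surprising transition
batch = buf.sample(100)
print(sum(t == ("s1", "drop", 1.0, "s2") for t in batch))  # sampled more often
```

During training, the DQN update would recompute each sampled transition's TD error and push the new priority back into the buffer.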