Zero-Sum Games

The experts below are selected from a list of 309 experts worldwide, ranked by the ideXlab platform.

Christopher Umans - One of the best experts on this subject based on the ideXlab platform.

  • On the Complexity of Succinct Zero-Sum Games
    computational complexity, 2008
    Co-Authors: Lance Fortnow, Russell Impagliazzo, Valentine Kabanets, Christopher Umans
    Abstract:

    We study the complexity of solving succinct zero-sum games, i.e., games whose payoff matrix M is given implicitly by a Boolean circuit C such that M(i, j) = C(i, j). We complement the known EXP-hardness of computing the exact value of a succinct zero-sum game by several results on approximating the value. (1) We prove that approximating the value of a succinct zero-sum game to within an additive error is complete for the class promise-$$S^{p}_{2}$$, the “promise” version of $$S^{p}_{2}$$. To the best of our knowledge, it is the first natural problem shown complete for this class. (2) We describe a ZPP^NP algorithm for constructing approximately optimal strategies, and hence for approximating the value, of a given succinct zero-sum game. As a corollary, we obtain, in a uniform fashion, several complexity-theoretic results, e.g., a ZPP^NP algorithm for learning circuits for SAT (Bshouty et al., JCSS, 1996) and a recent result by Cai (JCSS, 2007) that $$S^{p}_{2} \subseteq$$ ZPP^NP. (3) We observe that approximating the value of a succinct zero-sum game to within a multiplicative factor is in PSPACE, and that it cannot be in promise-$$S^{p}_{2}$$ unless the polynomial-time hierarchy collapses. Thus, under a reasonable complexity-theoretic assumption, multiplicative-factor approximation of succinct zero-sum games is strictly harder than additive-error approximation.
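
    To make the object of study concrete, here is an illustrative Python sketch of a succinctly represented zero-sum game (not from the paper; the tiny "circuit", its size, and the use of SciPy's LP solver are assumptions added here). The payoff matrix is given only through a predicate standing in for the Boolean circuit C, and the value is recovered by the standard linear program; note that materializing M explicitly, as done below, is exactly what succinctness makes infeasible when M is exponentially large.

        # Toy illustration (assumptions: a hand-picked "circuit" C and SciPy's
        # LP solver; nothing here is from the paper itself).
        import numpy as np
        from scipy.optimize import linprog

        n_bits = 3                 # 3-bit strategy indices: an 8 x 8 matrix
        N = 2 ** n_bits

        def C(i: int, j: int) -> int:
            """Stand-in for a Boolean circuit: M(i, j) = parity of i AND j."""
            return bin(i & j).count("1") % 2

        # Materializing M is what succinctness forbids in general.
        M = np.array([[C(i, j) for j in range(N)] for i in range(N)], dtype=float)

        # Row player (maximizer): maximize v subject to (M^T x)_j >= v, sum(x) = 1.
        # Variables are (x_0, ..., x_{N-1}, v); linprog minimizes, so minimize -v.
        c = np.zeros(N + 1); c[-1] = -1.0
        A_ub = np.hstack([-M.T, np.ones((N, 1))])   # v - (M^T x)_j <= 0 per column j
        b_ub = np.zeros(N)
        A_eq = np.hstack([np.ones((1, N)), np.zeros((1, 1))])
        b_eq = np.array([1.0])
        bounds = [(0, 1)] * N + [(None, None)]

        res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
        print("value:", round(res.x[-1], 4), "optimal row strategy:", res.x[:-1].round(3))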

  • On the Complexity of Succinct Zero-Sum Games
    Conference on Computational Complexity, 2005
    Co-Authors: Lance Fortnow, Russell Impagliazzo, Valentine Kabanets, Christopher Umans
    Abstract:

    We study the complexity of solving succinct zero-sum games, i.e., games whose payoff matrix M is given implicitly by a Boolean circuit C such that M(i, j) = C(i, j). We complement the known EXP-hardness of computing the exact value of a succinct zero-sum game by several results on approximating the value. (1) We prove that approximating the value of a succinct zero-sum game to within an additive factor is complete for the class promise-$$S^{p}_{2}$$, the "promise" version of $$S^{p}_{2}$$. To the best of our knowledge, it is the first natural problem shown complete for this class. (2) We describe a ZPP^NP algorithm for constructing approximately optimal strategies, and hence for approximating the value, of a given succinct zero-sum game. As a corollary, we obtain, in a uniform fashion, several complexity-theoretic results, e.g., a ZPP^NP algorithm for learning circuits for SAT (Bshouty et al., 1996) and a recent result by Cai (2001) that $$S^{p}_{2} \subseteq$$ ZPP^NP. (3) We observe that approximating the value of a succinct zero-sum game to within a multiplicative factor is in PSPACE, and that it cannot be in promise-$$S^{p}_{2}$$ unless the polynomial-time hierarchy collapses. Thus, under a reasonable complexity-theoretic assumption, multiplicative-factor approximation of succinct zero-sum games is strictly harder than additive-factor approximation.

  • On the Complexity of Succinct Zero-Sum Games (Preliminary Version)
    2004
    Co-Authors: Lance Fortnow, Russell Impagliazzo, Valentine Kabanets, Christopher Umans
    Abstract:

    We study the complexity of solving succinct zero-sum games, i.e., games whose payoff matrix M is given implicitly by a Boolean circuit C such that M(i, j) = C(i, j). We complement the known EXP-hardness of computing the exact value of a succinct zero-sum game by several results on approximating the value. (1) We prove that approximating the value of a succinct zero-sum game to within an additive factor is complete for the class promise-$$S^{p}_{2}$$, the “promise” …

Alexander J. Zaslavski - One of the best experts on this subject based on the ideXlab platform.

  • Dynamic Zero-Sum Games with Linear Constraints
    Turnpike Theory of Continuous-Time Linear Optimal Control Problems, 2015
    Co-Authors: Alexander J. Zaslavski
    Abstract:

    In this chapter we study the existence and turnpike properties of approximate solutions for a class of dynamic continuous-time two-player zero-sum games without using convexity-concavity assumptions. We describe the structure of approximate solutions, which is independent of the length of the interval for all sufficiently large intervals, and show that approximate solutions are determined mainly by the objective function and are essentially independent of the choice of interval and endpoint conditions.
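
    For orientation, turnpike statements of this kind are commonly formalized along the following lines (an illustrative paraphrase under simplifying assumptions, not the chapter's exact theorem): there exists a pair $$(x^{*}, y^{*})$$ such that for each $$\varepsilon > 0$$ there is a constant $$L > 0$$, independent of the interval length $$T$$, with $$\operatorname{mes}\{t \in [0, T] : \|(x(t), y(t)) - (x^{*}, y^{*})\| > \varepsilon\} \le L$$ for every approximate solution $$(x(\cdot), y(\cdot))$$ defined on $$[0, T]$$. That is, approximate solutions spend all but a bounded amount of time in an $$\varepsilon$$-neighborhood of the turnpike.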

  • Dynamic Discrete-Time Zero-Sum Games
    Turnpike Phenomenon and Infinite Horizon Optimal Control, 2014
    Co-Authors: Alexander J. Zaslavski
    Abstract:

    In this chapter we study the existence and structure of solutions for dynamic discrete-time two-player zero-sum games and establish a turnpike result. This result describes the structure of approximate solutions, which is independent of the length of the interval, for all sufficiently large intervals. We also show that for each initial state there exists a pair of overtaking equilibrium strategies over an infinite horizon.

  • Structure of approximate solutions of dynamic continuous time Zero-Sum Games
    Journal of Dynamics & Games, 2014
    Co-Authors: Alexander J. Zaslavski
    Abstract:

    In this paper we study a turnpike property of approximate solutions for a class of dynamic continuous-time two-player zero-sum games. This property describes the structure of approximate solutions, which is independent of the length of the interval, for all sufficiently large intervals.

  • Turnpike properties of approximate solutions of dynamic discrete time Zero-Sum Games
    Journal of Dynamics & Games, 2014
    Co-Authors: Alexander J. Zaslavski
    Abstract:

    We study the existence and turnpike properties of approximate solutions for a class of dynamic discrete-time two-player zero-sum games without using convexity-concavity assumptions. We describe the structure of approximate solutions, which is independent of the length of the interval for all sufficiently large intervals, and show that approximate solutions are determined mainly by the objective function and are essentially independent of the choice of interval and endpoint conditions.

  • The turnpike property for dynamic discrete time Zero-Sum Games
    Abstract and Applied Analysis, 1999
    Co-Authors: Alexander J. Zaslavski
    Abstract:

    We consider a class of dynamic discrete-time two-player zero-sum games. We show that for a generic cost function and each initial state, there exists a pair of overtaking equilibrium strategies over an infinite horizon. We also establish that for a generic cost function f, there exists a pair of stationary equilibrium strategies $$(x_f, y_f)$$ such that each pair of “approximate” equilibrium strategies spends almost all of its time in a small neighborhood of $$(x_f, y_f)$$.

Georgios Piliouras - One of the best experts on this subject based on the ideXlab platform.

  • Vortices Instead of Equilibria in MinMax Optimization: Chaos and Butterfly Effects of Online Learning in Zero-Sum Games
    arXiv: Computer Science and Game Theory, 2019
    Co-Authors: Yun Kuen Cheung, Georgios Piliouras
    Abstract:

    We establish that algorithmic experiments in zero-sum games "fail miserably" to confirm the unique, sharp prediction of maxmin equilibration. Contradicting nearly a century of economic thought that treats zero-sum games nearly axiomatically as the exemplar symbol of economic stability, we prove that no meaningful prediction can be made about the day-to-day behavior of online learning dynamics in zero-sum games. Concretely, Multiplicative Weights Updates (MWU) with constant step-size is Lyapunov chaotic in the dual (payoff) space. Simply put, assume that an observer asks the agents playing Matching Pennies whether they prefer Heads or Tails (and by how much in terms of aggregate payoff so far). The range of possible answers consistent with any arbitrarily small set of initial conditions blows up exponentially with time everywhere in the payoff space. This result is robust both algorithmically and game-theoretically: 1) Algorithmic robustness: chaos is robust to agents using any of a general sub-family of Follow-the-Regularized-Leader (FTRL) algorithms, the well-known regret-minimizing dynamics, even when agents mix and match dynamics or use different or slowly decreasing step-sizes. 2) Game-theoretic robustness: chaos is robust to all affine variants of zero-sum games (strictly competitive games), network variants with arbitrarily large numbers of agents, and even to competitive settings beyond these. Our result is in stark contrast with the time-average convergence of online learning to (approximate) Nash equilibrium, a result widely reported as "(weak) convergence to equilibrium".
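
    A minimal numerical sketch of the instability described above (an illustration, not the authors' code; the step size, horizon, and perturbation below are arbitrary choices): run MWU with a fixed step in Matching Pennies from two nearly identical initial conditions and compare the players' cumulative-payoff (dual) coordinates.

        # MWU with constant step size in Matching Pennies; we track the row
        # player's dual coordinate (cumulative payoff of Heads minus Tails)
        # and watch nearby initial conditions separate over time.
        import numpy as np

        A = np.array([[1.0, -1.0], [-1.0, 1.0]])     # row player's payoff matrix

        def mwu_dual(p0, q0, eta=0.1, steps=2000):
            x = np.array([p0, 1 - p0]); y = np.array([q0, 1 - q0])
            dual = 0.0
            for _ in range(steps):
                ux, uy = A @ y, -A.T @ x             # current payoff vectors
                dual += ux[0] - ux[1]
                x = x * np.exp(eta * ux); x /= x.sum()
                y = y * np.exp(eta * uy); y /= y.sum()
            return dual

        d1 = mwu_dual(0.5100, 0.5)
        d2 = mwu_dual(0.5101, 0.5)                   # tiny initial perturbation
        print("duals:", round(d1, 2), round(d2, 2), "gap:", round(abs(d1 - d2), 2))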

  • AAMAS - Multi-Agent Learning in Network Zero-Sum Games is a Hamiltonian System
    2019
    Co-Authors: James P. Bailey, Georgios Piliouras
    Abstract:

    Zero-sum games are natural, if informal, analogues of closed physical systems where no energy/utility can enter or exit. This analogy can be extended even further if we consider zero-sum network (polymatrix) games where multiple agents interact in a closed economy. Typically, (network) zero-sum games are studied from the perspective of Nash equilibria. Nevertheless, this stands in contrast with the way we typically think about closed physical systems, e.g., the Earth-Moon system, which moves perpetually along recurrent trajectories of constant energy. We establish a formal and robust connection between multi-agent systems and Hamiltonian dynamics -- the same dynamics that describe conservative systems in physics. Specifically, we show that no matter the size or network structure of such closed economies, even if agents use different online learning dynamics from the standard class of Follow-the-Regularized-Leader, they yield Hamiltonian dynamics. This approach generalizes the known connection to Hamiltonians for the special case of replicator dynamics in two-agent zero-sum games developed by Hofbauer. Moreover, our results extend beyond zero-sum settings and provide a type of Rosetta stone (see e.g. Table 1) that helps to translate results and techniques between online optimization, convex analysis, game theory, and physics.
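
    The conservation claim can be checked numerically in the simplest case (a sketch under assumptions added here: Matching Pennies, replicator dynamics as the continuous-time limit of entropically regularized FTRL, and SciPy's ODE solver): the "energy" given by the KL divergences from the interior equilibrium stays constant along trajectories.

        # Replicator dynamics in Matching Pennies conserves
        # KL(x*, x) + KL(y*, y), where (x*, y*) = ((1/2, 1/2), (1/2, 1/2))
        # is the interior Nash equilibrium -- the conserved "energy".
        import numpy as np
        from scipy.integrate import solve_ivp

        A = np.array([[1.0, -1.0], [-1.0, 1.0]])

        def replicator(t, z):
            x, y = z[:2], z[2:]
            ux, uy = A @ y, -A.T @ x
            return np.concatenate([x * (ux - x @ ux), y * (uy - y @ uy)])

        def energy(z):
            eq = np.array([0.5, 0.5])
            return (eq * np.log(eq / z[:2])).sum() + (eq * np.log(eq / z[2:])).sum()

        z0 = np.array([0.9, 0.1, 0.3, 0.7])          # interior initial condition
        sol = solve_ivp(replicator, (0.0, 50.0), z0, rtol=1e-10, atol=1e-12,
                        dense_output=True)
        for t in (0, 10, 25, 50):
            print(f"t = {t:>2}   energy = {energy(sol.sol(t)):.8f}")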

  • NeurIPS - Fast and Furious Learning in Zero-Sum Games: Vanishing Regret with Non-Vanishing Step Sizes
    2019
    Co-Authors: James P. Bailey, Georgios Piliouras
    Abstract:

    We show for the first time that it is possible to reconcile in online learning in zero-sum games two seemingly contradictory objectives: vanishing time-average regret and non-vanishing step sizes. This phenomenon, which we coin "fast and furious" learning in games, sets a new benchmark about what is possible both in max-min optimization as well as in multi-agent systems. Our analysis does not depend on introducing a carefully tailored dynamic. Instead we focus on the most well studied online dynamic, gradient descent. Similarly, we focus on the simplest textbook class of games, two-agent two-strategy zero-sum games, such as Matching Pennies. Even for this simplest of benchmarks the best known bound for total regret, prior to our work, was the trivial one of $O(T)$, which is immediately applicable even to a non-learning agent. Based on a tight understanding of the geometry of the non-equilibrating trajectories in the dual space we prove a regret bound of $\Theta(\sqrt{T})$, matching the well-known optimal bound for adaptive step sizes in the online setting. This guarantee holds for all fixed step sizes, without having to know the time horizon in advance and adapt the fixed step size accordingly. As a corollary, we establish that even with fixed learning rates the time-averages of mixed strategies and utilities converge to their exact Nash equilibrium values. We also provide experimental evidence suggesting the stronger regret bound holds for all zero-sum games.
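
    The corollary about fixed learning rates admits a compact illustration (a sketch with assumptions added here: the parametrization, projection, and step size are not the paper's exact setup): projected gradient descent/ascent with a fixed step in Matching Pennies has non-convergent iterates whose time-averages nevertheless approach the equilibrium (1/2, 1/2).

        # Fixed-step projected gradient ascent (row) / descent (column) on the
        # Matching Pennies payoff u(p, q) = (2p - 1)(2q - 1), where p and q are
        # the probabilities of playing Heads.
        import numpy as np

        def clip01(v):                               # projection onto [0, 1]
            return min(1.0, max(0.0, v))

        p, q, eta = 0.9, 0.2, 0.05
        ps, qs = [], []
        for _ in range(20000):
            p, q = (clip01(p + eta * 2 * (2 * q - 1)),   # ascend in p
                    clip01(q - eta * 2 * (2 * p - 1)))   # descend in q
            ps.append(p); qs.append(q)

        print("last iterate:", (round(p, 3), round(q, 3)))
        print("time average:", (round(float(np.mean(ps)), 3), round(float(np.mean(qs)), 3)))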

  • EC - Multiplicative Weights Update in Zero-Sum Games
    Proceedings of the 2018 ACM Conference on Economics and Computation, 2018
    Co-Authors: James P. Bailey, Georgios Piliouras
    Abstract:

    We study the classic setting where two agents compete against each other in a zero-sum game by applying the Multiplicative Weights Update (MWU) algorithm. In a twist of the standard approach of [Freund and Schapire 1999], we focus on the K-L divergence from the equilibrium, but instead of providing an upper bound on its rate of increase we provide a nonnegative lower bound for games with interior equilibria. This implies movement away from equilibria and towards the boundary. In the case of zero-sum games without interior equilibria, convergence to the boundary (and in fact to the minimal product of subsimplexes that contains all Nash equilibria) follows via an orthogonal argument; in that subspace, divergence from the set of Nash equilibria applies for all nonequilibrium initial conditions via the first argument. We argue the robustness of this non-equilibrating behavior by considering the following generalizations: (1) step size: agents may be using different and even decreasing step sizes; (2) dynamics: agents may be using Follow-the-Regularized-Leader algorithms and possibly apply different regularizers (e.g., MWU versus Gradient Descent), and we also consider a linearized version of MWU; (3) more than two agents: multiple agents can interact via arbitrary networks of zero-sum polymatrix games and their affine variants. Our results come in stark contrast with the standard interpretation of the behavior of MWU (and more generally regret-minimizing dynamics) in zero-sum games, which is typically referred to as "converging to equilibrium". If equilibria are indeed predictive even for the benchmark class of zero-sum games, agents in practice must deviate robustly from the axiomatic perspective of optimization-driven dynamics as captured by MWU and variants, and apply carefully tailored equilibrium-seeking behavioral dynamics.

Sylvain Sorin - One of the best experts on this subject based on the ideXlab platform.

  • Zero-Sum Games: The Finite Case
    Universitext, 2019
    Co-Authors: Rida Laraki, Jérôme Renault, Sylvain Sorin
    Abstract:

    This chapter studies zero-sum games where the sets of strategies are finite. We prove the minmax theorem of von Neumann, which states that if a game is played with mixed strategies (probability distributions over strategies) then the value exists, as well as an optimal mixed strategy for each player. We then consider extensions such as Loomis’ theorem and Ville’s theorem. Finally, we introduce and study the convergence of the learning process “Fictitious Play”, where the initial game is repeated and each player plays at every period a best response to the average of the strategies played by his opponent in the past.
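
    As a concrete companion to the “Fictitious Play” process described above, here is a short sketch (illustrative code with an arbitrary game and tie-breaking, not from the book): each player best-responds to the empirical average of the opponent's past actions, and in zero-sum games the empirical frequencies converge to optimal mixed strategies.

        # Fictitious Play in Matching Pennies: empirical frequencies of play
        # converge to the optimal mixed strategies (1/2, 1/2).
        import numpy as np

        A = np.array([[1.0, -1.0], [-1.0, 1.0]])     # row maximizes, column minimizes

        nx = np.array([1.0, 0.0]); ny = np.array([1.0, 0.0])   # arbitrary first moves
        for _ in range(100000):
            i = int(np.argmax(A @ (ny / ny.sum())))  # row's best reply to y-average
            j = int(np.argmin((nx / nx.sum()) @ A))  # column's best reply to x-average
            nx[i] += 1; ny[j] += 1

        print("empirical row strategy:   ", (nx / nx.sum()).round(3))
        print("empirical column strategy:", (ny / ny.sum()).round(3))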

  • Zero-Sum Games: The General Case
    Universitext, 2019
    Co-Authors: Rida Laraki, Jérôme Renault, Sylvain Sorin
    Abstract:

    This chapter considers general zero-sum games and proves various minmax theorems. We start with Sion’s theorem with geometrical and topological assumptions on the game: convex compact strategy sets and payoff functions quasi-concave upper semi-continuous in the first variable and quasi-convex lower semi-continuous in the second variable. Then we prove the standard minmax theorem in mixed strategies for measurable bounded payoff functions, extending von Neumann’s theorem, under topological assumptions: compact Hausdorff strategy sets and payoff functions upper semi-continuous in the first variable and lower semi-continuous in the second variable. We conclude the chapter with the introduction of the value operator and the notion of a derived game, which play an important role in the operator approach to repeated games.
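
    For reference, the minmax identity at the heart of Sion’s theorem, under the hypotheses listed above (convex compact strategy sets $$X$$ and $$Y$$, payoff $$f$$ quasi-concave and upper semi-continuous in the first variable, quasi-convex and lower semi-continuous in the second), reads $$\max_{x \in X} \min_{y \in Y} f(x, y) = \min_{y \in Y} \max_{x \in X} f(x, y),$$ and this common quantity is the value of the game. (This is a standard paraphrase for orientation, not the chapter’s exact wording.)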

  • Limit value of dynamic Zero-Sum Games with vanishing stage duration
    Mathematics of Operations Research, 2018
    Co-Authors: Sylvain Sorin
    Abstract:

    We consider two-person zero-sum games where the players control, at discrete times {t_n} induced by a partition Π of ℝ+, a continuous-time Markov process. We prove that the limit of the values v_Π exists as the mesh of Π goes to 0. The analysis covers the cases of (1) stochastic games (where both players know the state), and (2) games with unknown state and symmetric signals. The proof is by reduction to deterministic differential games.

  • Limit value of dynamic Zero-Sum Games with vanishing stage duration
    arXiv: Optimization and Control, 2016
    Co-Authors: Sylvain Sorin
    Abstract:

    We consider two-person zero-sum games where the players control, at discrete times {t_n} induced by a partition Π of ℝ+, a continuous-time Markov state process. We prove that the limit of the values v_Π exists as the mesh of Π goes to 0. The analysis covers the cases of (1) stochastic games (where both players know the state) and (2) the symmetric no-information case. The proof is by reduction to a deterministic differential game.

  • Best Response Dynamics for Continuous Zero-Sum Games
    Discrete & Continuous Dynamical Systems - B, 2006
    Co-Authors: Josef Hofbauer, Sylvain Sorin
    Abstract:

    We study best response dynamics in continuous time for continuous concave-convex zero-sum games and prove convergence of their trajectories to the set of saddle points, thus providing a dynamical proof of the minmax theorem. Consequences for the corresponding discrete-time process with small or diminishing step sizes are established, including convergence of the fictitious play procedure.

Manuela Veloso - One of the best experts on this subject based on the ideXlab platform.

  • AAAI - Thresholded rewards: acting optimally in timed, Zero-Sum Games
    2007
    Co-Authors: Colin Mcmillen, Manuela Veloso
    Abstract:

    In timed, zero-sum games, the goal is to maximize the probability of winning, which is not necessarily the same as maximizing our expected reward. We consider cumulative intermediate reward to be the difference between our score and our opponent's score; the "true" reward of a win, loss, or tie is determined at the end of a game by applying a threshold function to the cumulative intermediate reward. We introduce thresholded-rewards problems to capture this dependency of the final reward outcome on the cumulative intermediate reward. Thresholded-rewards problems reflect different real-world stochastic planning domains, especially zero-sum games, in which time and score need to be considered. We investigate the application of thresholded rewards to finite-horizon Markov Decision Processes (MDPs). In general, the optimal policy for a thresholded-rewards MDP will be non-stationary, depending on the number of time steps remaining and the cumulative intermediate reward. We introduce an efficient value iteration algorithm that solves thresholded-rewards MDPs exactly, but with running time quadratic in the number of states in the MDP and the length of the time horizon. We investigate a number of heuristic-based techniques that efficiently find approximate solutions for MDPs with large state spaces or long time horizons.
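
    The state augmentation behind the exact algorithm can be sketched in a few lines (a toy MDP invented here; the states, transitions, and threshold are hypothetical, not the paper's domain): augment each state with the cumulative intermediate reward, run backward induction over the time remaining, and apply the threshold function only at the horizon. The resulting policy is non-stationary, as the abstract notes, since it depends on (time left, state, score).

        # Backward induction for a thresholded-rewards MDP: V(s, r) with t steps
        # left is the optimal expected terminal (thresholded) reward from state
        # s with cumulative intermediate reward ("score difference") r.
        import itertools

        STATES, ACTIONS, H = ("s0", "s1"), ("a", "b"), 10
        SCORES = range(-H, H + 1)

        # P[(s, a)] = list of (probability, next_state, intermediate_reward)
        P = {
            ("s0", "a"): [(0.6, "s0", +1), (0.4, "s1", -1)],
            ("s0", "b"): [(0.9, "s0", 0), (0.1, "s1", +1)],
            ("s1", "a"): [(0.5, "s0", -1), (0.5, "s1", +1)],
            ("s1", "b"): [(1.0, "s1", 0)],
        }

        def threshold(r):            # "true" reward, applied only at the end
            return 1 if r > 0 else (-1 if r < 0 else 0)

        def q_value(V, s, r, a):     # expected value of action a from (s, r)
            return sum(p * V[(s2, max(-H, min(H, r + dr)))] for p, s2, dr in P[(s, a)])

        V = {(s, r): threshold(r) for s, r in itertools.product(STATES, SCORES)}
        policy = {}
        for t in range(1, H + 1):    # t = number of time steps remaining
            V_new = {}
            for s, r in itertools.product(STATES, SCORES):
                best = max(ACTIONS, key=lambda a: q_value(V, s, r, a))
                policy[(t, s, r)] = best
                V_new[(s, r)] = q_value(V, s, r, best)
            V = V_new

        print("expected terminal reward from s0:", round(V[("s0", 0)], 4))
        print("optimal first action:", policy[(H, "s0", 0)])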

  • Plenary: Learning to select team strategies in finite-timed Zero-Sum Games
    2007
    Co-Authors: Manuela Veloso
    Abstract:

    Games, by definition, offer the challenge of the presence of an opponent, to which a playing strategy should respond. In finite-timed zero-sum games, the strategy should enable the team to win the game within a limited playing time. Motivated by robot soccer, in this talk we will present several approaches towards learning to select team strategies in such finite-timed zero-sum games. We will introduce an adaptive playbook approach with implicit opponent modeling, in which multiple team strategies are represented as variably weighted plays. We will discuss different plays as a function of different game situations and opponents. In conclusion, we will present an MDP-based learning algorithm to reason in particular about the current score and the game time left. Through extensive simulated empirical studies, we will demonstrate the effectiveness of the learning approach. In addition, the talk will include illustrative examples from robot soccer. The major part of this work was done in conjunction with my PhD student Colin McMillen.

  • AI*IA - Learning to Select Team Strategies in Finite-Timed Zero-Sum Games
    Lecture Notes in Computer Science, 1
    Co-Authors: Manuela Veloso
    Abstract:

    Games, by definition, offer the challenge of the presence of an opponent, to which a playing strategy should respond. In finite-timed zero-sum games, the strategy should enable the team to win the game within a limited playing time. Motivated by robot soccer, in this talk we will present several approaches towards learning to select team strategies in such finite-timed zero-sum games. We will introduce an adaptive playbook approach with implicit opponent modeling, in which multiple team strategies are represented as variably weighted plays. We will discuss different plays as a function of different game situations and opponents. In conclusion, we will present an MDP-based learning algorithm to reason in particular about the current score and the game time left. Through extensive simulated empirical studies, we will demonstrate the effectiveness of the learning approach. In addition, the talk will include illustrative examples from robot soccer. The major part of this work was done in conjunction with my PhD student Colin McMillen.