Extensive Form Game

The Experts below are selected from a list of 4752 Experts worldwide ranked by ideXlab platform

Tuomas Sandholm - One of the best experts on this subject based on the ideXlab platform.

  • Bandit Linear Optimization for Sequential Decision Making and Extensive-Form Games
    arXiv: Computer Science and Game Theory, 2021
    Co-Authors: Gabriele Farina, Robin Schmucker, Tuomas Sandholm
    Abstract:

    Tree-form sequential decision making (TFSDM) extends classical one-shot decision making by modeling tree-form interactions between an agent and a potentially adversarial environment. It captures the online decision-making problems that each player faces in an extensive-form game, as well as Markov decision processes and partially-observable Markov decision processes where the agent conditions on observed history. Over the past decade, there has been considerable effort into designing online optimization methods for TFSDM. Virtually all of that work has been in the full-feedback setting, where the agent has access to counterfactuals, that is, information on what would have happened had the agent chosen a different action at any decision node. Little is known about the bandit setting, where that assumption is reversed (no counterfactual information is available), despite this latter setting being well understood for almost 20 years in one-shot decision making. In this paper, we give the first algorithm for the bandit linear optimization problem for TFSDM that offers both (i) linear-time iterations (in the size of the decision tree) and (ii) $O(\sqrt{T})$ cumulative regret in expectation compared to any fixed strategy, at all times $T$. This is made possible by new results that we derive, which may have independent uses as well: 1) geometry of the dilated entropy regularizer, 2) autocorrelation matrix of the natural sampling scheme for sequence-form strategies, 3) construction of an unbiased estimator for linear losses for sequence-form strategies, and 4) a refined regret analysis for mirror descent when using the dilated entropy regularizer.

  • Polynomial-Time Computation of Optimal Correlated Equilibria in Two-Player Extensive-Form Games with Public Chance Moves and Beyond
    arXiv: Computer Science and Game Theory, 2020
    Co-Authors: Gabriele Farina, Tuomas Sandholm
    Abstract:

    Unlike normal-form games, where correlated equilibria have been studied for more than 45 years, extensive-form correlation is still generally not well understood. Part of the reason for this gap is that the sequential nature of extensive-form games allows for a richness of behaviors and incentives that are not possible in normal-form settings. This richness translates to a significantly different complexity landscape surrounding extensive-form correlated equilibria. As of today, it is known that finding an optimal extensive-form correlated equilibrium (EFCE), extensive-form coarse correlated equilibrium (EFCCE), or normal-form coarse correlated equilibrium (NFCCE) in a two-player extensive-form game is computationally tractable when the game does not include chance moves, and intractable when the game involves chance moves. In this paper we significantly refine this complexity threshold by showing that, in two-player games, an optimal correlated equilibrium can be computed in polynomial time, provided that a certain condition is satisfied. We show that the condition holds, for example, when all chance moves are public, that is, both players observe all chance moves. This implies that an optimal EFCE, EFCCE and NFCCE can be computed in polynomial time in the game size in two-player games with public chance moves, providing the biggest positive complexity result surrounding extensive-form correlation in more than a decade.

  • Faster Algorithms for Extensive-Form Game Solving via Improved Smoothing Functions
    Mathematical Programming, 2020
    Co-Authors: Christian Kroer, Kevin Waugh, Fatma Kılınç-Karzan, Tuomas Sandholm
    Abstract:

    Sparse iterative methods, in particular first-order methods, are known to be among the most effective in solving large-scale two-player zero-sum extensive-form games. The convergence rates of these methods depend heavily on the properties of the distance-generating function that they are based on. We investigate both the theoretical and practical performance improvement of first-order methods (FOMs) for solving extensive-form games through better design of the dilated entropy function, a class of distance-generating functions related to the domains associated with extensive-form games. By introducing a new weighting scheme for the dilated entropy function, we develop the first distance-generating function for the strategy spaces of sequential games that has only a logarithmic dependence on the branching factor of the player. This result improves the overall convergence rate of several FOMs working with the dilated entropy function by a factor of $\Omega(b^d d)$, where $b$ is the branching factor of the player and $d$ is the depth of the game tree. Thus far, counterfactual regret minimization methods have been faster in practice, and more popular, than FOMs despite their theoretically inferior convergence rates. Using our new weighting scheme and a practical parameter tuning procedure we show that, for the first time, the excessive gap technique, a classical FOM, can be made faster than the counterfactual regret minimization algorithm in practice for large games, and that the aggressive stepsize scheme of CFR+ is the only reason that the algorithm is faster in practice.

  • A Unified Framework for Extensive-Form Game Abstraction with Bounds
    Neural Information Processing Systems, 2018
    Co-Authors: Christian Kroer, Tuomas Sandholm
    Abstract:

    Abstraction has long been a key component in the practical solving of large-scale extensive-form games. Despite this, abstraction remains poorly understood. There have been some recent theoretical results, but they have been confined to specific assumptions on abstraction structure and are specific to various disjoint types of abstraction and specific solution concepts, for example, exact Nash equilibria or strategies with bounded immediate regret. In this paper we present a unified framework for analyzing abstractions that can express all types of abstractions and solution concepts used in prior papers with performance guarantees, while maintaining comparable bounds on abstraction quality. Moreover, our framework gives an exact decomposition of abstraction error in a much broader class of games, albeit only in an ex-post sense, as our results depend on the specific strategy chosen. Nonetheless, we use this ex-post decomposition along with slightly weaker assumptions than prior work to derive generalizations of prior bounds on abstraction quality. We also show, via counterexample, that such assumptions are necessary for some games. Finally, we prove the first bounds for how $\epsilon$-Nash equilibria computed in abstractions perform in the original game. This is important because often one cannot afford to compute an exact Nash equilibrium in the abstraction. All our results apply to general-sum n-player games.

  • Smoothing Method for Approximate Extensive-Form Perfect Equilibrium
    arXiv: Computer Science and Game Theory, 2017
    Co-Authors: Christian Kroer, Tuomas Sandholm
    Abstract:

    Nash equilibrium is a popular solution concept for solving imperfect-information games in practice. However, it has a major drawback: it does not preclude suboptimal play in branches of the game tree that are not reached in equilibrium. Equilibrium refinements can mend this issue, but have experienced little practical adoption. This is largely due to a lack of scalable algorithms. Sparse iterative methods, in particular first-order methods, are known to be among the most effective algorithms for computing Nash equilibria in large-scale two-player zero-sum extensive-form games. In this paper, we provide, to our knowledge, the first extension of these methods to equilibrium refinements. We develop a smoothing approach for behavioral perturbations of the convex polytope that encompasses the strategy spaces of players in an extensive-form game. This enables one to compute an approximate variant of extensive-form perfect equilibria. Experiments show that our smoothing approach leads to solutions with dramatically stronger strategies at information sets that are reached with low probability in approximate Nash equilibria, while retaining the overall convergence rate associated with fast algorithms for Nash equilibrium. This has benefits both in approximate equilibrium finding (such approximation is necessary in practice in large games) where some probabilities are low while possibly heading toward zero in the limit, and exact equilibrium computation where the low probabilities are actually zero.
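
The bandit entry at the top of this list mentions constructing an unbiased estimator for linear losses when only the loss of the chosen action is observed. As a rough illustration of that idea in the much simpler one-shot setting (a single decision over a finite action set, not the sequence-form construction the paper derives), the classical importance-weighted estimator can be sketched as follows. The function names, the mixed strategy, and the loss vector are hypothetical, and every action is assumed to have positive probability.

```python
from fractions import Fraction

def importance_weighted_estimate(probs, action, observed_loss):
    """Loss-vector estimate that is nonzero only at the action actually played.

    Assumes probs[action] > 0; dividing by it compensates for how rarely
    the action is sampled.
    """
    estimate = [Fraction(0)] * len(probs)
    estimate[action] = Fraction(observed_loss) / probs[action]
    return estimate

def expected_estimate(probs, losses):
    """Exact expectation of the estimate over the agent's own randomization."""
    n = len(probs)
    total = [Fraction(0)] * n
    for action in range(n):
        # Under bandit feedback only losses[action] would be revealed.
        est = importance_weighted_estimate(probs, action, losses[action])
        for i in range(n):
            total[i] += probs[action] * est[i]
    return total

# Hypothetical mixed strategy and hidden loss vector, for illustration only.
probs = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
losses = [Fraction(3), Fraction(-1), Fraction(2)]

# Averaging over the sampled action recovers the full loss vector.
print(expected_estimate(probs, losses) == losses)
```

Exact rational arithmetic is used here so that unbiasedness can be checked by enumeration rather than by sampling.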

David K Levine - One of the best experts on this subject based on the ideXlab platform.

  • Steady State Learning and Nash Equilibrium
    Econometrica, 1993
    Co-Authors: Drew Fudenberg, David K Levine
    Abstract:

    We study the steady states of a system in which players learn about the strategies their opponents are playing by updating their Bayesian priors in light of their observations. Players are matched at random to play a fixed extensive-form game, and each player observes the realized actions in his own matches, but not the intended off-path play of his opponents or the realized actions in other matches. Because players are assumed to live finite lives, there are steady states in which learning continually takes place. If lifetimes are long and players are very patient, the steady state distribution of actions approximates that of a Nash equilibrium.

  • Steady State Learning and Nash Equilibrium
    1993
    Co-Authors: Drew Fudenberg, David K Levine
    Abstract:

    The authors study the steady states of a system in which players learn about the strategies their opponents are playing by updating their Bayesian priors in light of their observations. Players are matched at random to play a fixed extensive-form game, and each player observes the realized actions in his own matches but not the intended off-path play of his opponents or the realized actions in other matches. Because players are assumed to live finite lives, there are steady states in which learning continually takes place. If lifetimes are long and players are very patient, the steady state distribution of actions approximates that of a Nash equilibrium. Copyright 1993 by The Econometric Society. (This abstract was borrowed from another version of this item.)
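
The abstracts above describe players who update Bayesian priors about opponents' strategies from the actions they observe in their own matches. A minimal sketch of one such update at a single opponent decision point is given below. It assumes a Dirichlet prior represented by pseudo-counts, which is an illustrative choice; the abstracts do not state which prior family is used, and the function names and action labels are hypothetical.

```python
def update_counts(prior_counts, observed_actions):
    """Add one pseudo-count per observed action (Dirichlet-multinomial update)."""
    posterior = dict(prior_counts)
    for action in observed_actions:
        posterior[action] = posterior.get(action, 0) + 1
    return posterior

def posterior_mean(counts):
    """Believed probability of each action under the current pseudo-counts."""
    total = sum(counts.values())
    return {action: count / total for action, count in counts.items()}

# Hypothetical uniform prior over two opponent actions, then four observed matches.
prior = {"L": 1, "R": 1}
posterior = update_counts(prior, ["L", "L", "R", "L"])
beliefs = posterior_mean(posterior)
print(posterior, beliefs)
```

Only realized actions enter the update, which mirrors the abstracts' point that off-path play is never observed and so beliefs about it are not corrected by experience.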

Pedro A Ortega - One of the best experts on this subject based on the ideXlab platform.

  • Subjectivity, Bayesianism, and Causality
    Pattern Recognition Letters, 2015
    Co-Authors: Pedro A Ortega
    Abstract:

    "Subjectivity" is studied by comparing Lacanian and Bayesian probability theory. Causality is explained as arising from a two-player game with imperfect information. A measure-theoretic model of the interactive subject is introduced. Bayesian probability theory is one of the most successful frameworks to model reasoning under uncertainty. Its defining property is the interpretation of probabilities as degrees of belief in propositions about the state of the world relative to an inquiring subject. This essay examines the notion of subjectivity by drawing parallels between Lacanian theory and Bayesian probability theory, and concludes that the latter must be enriched with causal interventions to model agency. The central contribution of this work is an abstract model of the subject that accommodates causal interventions in a measure-theoretic formalisation. This formalisation is obtained through a game-theoretic Ansatz based on modelling the inside and outside of the subject as an extensive-form game with imperfect information between two players. Finally, I illustrate the expressiveness of this model with an example of causal induction.

  • Subjectivity, Bayesianism, and Causality
    arXiv: Artificial Intelligence, 2014
    Co-Authors: Pedro A Ortega
    Abstract:

    Bayesian probability theory is one of the most successful frameworks to model reasoning under uncertainty. Its defining property is the interpretation of probabilities as degrees of belief in propositions about the state of the world relative to an inquiring subject. This essay examines the notion of subjectivity by drawing parallels between Lacanian theory and Bayesian probability theory, and concludes that the latter must be enriched with causal interventions to model agency. The central contribution of this work is an abstract model of the subject that accommodates causal interventions in a measure-theoretic formalisation. This formalisation is obtained through a game-theoretic Ansatz based on modelling the inside and outside of the subject as an extensive-form game with imperfect information between two players. Finally, I illustrate the expressiveness of this model with an example of causal induction.

Christian Kroer - One of the best experts on this subject based on the ideXlab platform.

  • Faster Algorithms for Extensive-Form Game Solving via Improved Smoothing Functions
    Mathematical Programming, 2020
    Co-Authors: Christian Kroer, Kevin Waugh, Fatma Kılınç-Karzan, Tuomas Sandholm
    Abstract:

    Sparse iterative methods, in particular first-order methods, are known to be among the most effective in solving large-scale two-player zero-sum extensive-form games. The convergence rates of these methods depend heavily on the properties of the distance-generating function that they are based on. We investigate both the theoretical and practical performance improvement of first-order methods (FOMs) for solving extensive-form games through better design of the dilated entropy function, a class of distance-generating functions related to the domains associated with extensive-form games. By introducing a new weighting scheme for the dilated entropy function, we develop the first distance-generating function for the strategy spaces of sequential games that has only a logarithmic dependence on the branching factor of the player. This result improves the overall convergence rate of several FOMs working with the dilated entropy function by a factor of $\Omega(b^d d)$, where $b$ is the branching factor of the player and $d$ is the depth of the game tree. Thus far, counterfactual regret minimization methods have been faster in practice, and more popular, than FOMs despite their theoretically inferior convergence rates. Using our new weighting scheme and a practical parameter tuning procedure we show that, for the first time, the excessive gap technique, a classical FOM, can be made faster than the counterfactual regret minimization algorithm in practice for large games, and that the aggressive stepsize scheme of CFR+ is the only reason that the algorithm is faster in practice.

  • A Unified Framework for Extensive-Form Game Abstraction with Bounds
    Neural Information Processing Systems, 2018
    Co-Authors: Christian Kroer, Tuomas Sandholm
    Abstract:

    Abstraction has long been a key component in the practical solving of large-scale extensive-form games. Despite this, abstraction remains poorly understood. There have been some recent theoretical results, but they have been confined to specific assumptions on abstraction structure and are specific to various disjoint types of abstraction and specific solution concepts, for example, exact Nash equilibria or strategies with bounded immediate regret. In this paper we present a unified framework for analyzing abstractions that can express all types of abstractions and solution concepts used in prior papers with performance guarantees, while maintaining comparable bounds on abstraction quality. Moreover, our framework gives an exact decomposition of abstraction error in a much broader class of games, albeit only in an ex-post sense, as our results depend on the specific strategy chosen. Nonetheless, we use this ex-post decomposition along with slightly weaker assumptions than prior work to derive generalizations of prior bounds on abstraction quality. We also show, via counterexample, that such assumptions are necessary for some games. Finally, we prove the first bounds for how $\epsilon$-Nash equilibria computed in abstractions perform in the original game. This is important because often one cannot afford to compute an exact Nash equilibrium in the abstraction. All our results apply to general-sum n-player games.

  • Smoothing Method for Approximate Extensive-Form Perfect Equilibrium
    arXiv: Computer Science and Game Theory, 2017
    Co-Authors: Christian Kroer, Tuomas Sandholm
    Abstract:

    Nash equilibrium is a popular solution concept for solving imperfect-information games in practice. However, it has a major drawback: it does not preclude suboptimal play in branches of the game tree that are not reached in equilibrium. Equilibrium refinements can mend this issue, but have experienced little practical adoption. This is largely due to a lack of scalable algorithms. Sparse iterative methods, in particular first-order methods, are known to be among the most effective algorithms for computing Nash equilibria in large-scale two-player zero-sum extensive-form games. In this paper, we provide, to our knowledge, the first extension of these methods to equilibrium refinements. We develop a smoothing approach for behavioral perturbations of the convex polytope that encompasses the strategy spaces of players in an extensive-form game. This enables one to compute an approximate variant of extensive-form perfect equilibria. Experiments show that our smoothing approach leads to solutions with dramatically stronger strategies at information sets that are reached with low probability in approximate Nash equilibria, while retaining the overall convergence rate associated with fast algorithms for Nash equilibrium. This has benefits both in approximate equilibrium finding (such approximation is necessary in practice in large games) where some probabilities are low while possibly heading toward zero in the limit, and exact equilibrium computation where the low probabilities are actually zero.

  • Faster First-Order Methods for Extensive-Form Game Solving
    Economics and Computation, 2015
    Co-Authors: Christian Kroer, Kevin Waugh, Fatma Kılınç-Karzan, Tuomas Sandholm
    Abstract:

    We study the problem of computing a Nash equilibrium in large-scale two-player zero-sum extensive-form games. While this problem can be solved in polynomial time, first-order or regret-based methods are usually preferred for large games. Regret-based methods have largely been favored in practice, in spite of their theoretically inferior convergence rates. In this paper we investigate the acceleration of first-order methods both theoretically and experimentally. An important component of many first-order methods is a distance-generating function. Motivated by this, we investigate a specific distance-generating function, namely the dilated entropy function, over treeplexes, which are convex polytopes that encompass the strategy spaces of perfect-recall extensive-form games. We develop significantly stronger bounds on the associated strong convexity parameter. In terms of extensive-form game solving, this improves the convergence rate of several first-order methods by a factor of $O\big((\#\text{information sets} \cdot \text{depth} \cdot M) / 2^{\text{depth}}\big)$, where $M$ is the maximum value of the $\ell_1$ norm over the treeplex encoding the strategy spaces. Experimentally, we investigate the performance of three first-order methods (the excessive gap technique, mirror prox, and stochastic mirror prox) and compare their performance to the regret-based algorithms. In order to instantiate stochastic mirror prox, we develop a class of gradient sampling schemes for game trees. Equipped with our distance-generating function and sampling scheme, we find that mirror prox and the excessive gap technique outperform the prior regret-based methods for finding medium accuracy solutions.

  • Extensive-Form Game Imperfect-Recall Abstractions with Bounds
    arXiv: Computer Science and Game Theory, 2014
    Co-Authors: Christian Kroer, Tuomas Sandholm
    Abstract:

    Imperfect-recall abstraction has emerged as the leading paradigm for practical large-scale equilibrium computation in incomplete-information games. However, imperfect-recall abstractions are poorly understood, and only weak algorithm-specific guarantees on solution quality are known. In this paper, we show the first general, algorithm-agnostic, solution quality guarantees for Nash equilibria and approximate self-trembling equilibria computed in imperfect-recall abstractions, when implemented in the original (perfect-recall) game. Our results are for a class of games that generalizes the only previously known class of imperfect-recall abstractions where any results had been obtained. Further, our analysis is tighter in two ways, each of which can lead to an exponential reduction in the solution quality error bound. We then show that for extensive-form games that satisfy certain properties, the problem of computing a bound-minimizing abstraction for a single level of the game reduces to a clustering problem, where the increase in our bound is the distance function. This reduction leads to the first imperfect-recall abstraction algorithm with solution quality bounds. We proceed to show a divide in the class of abstraction problems. If payoffs are at the same scale at all information sets considered for abstraction, the input forms a metric space. Conversely, if this condition is not satisfied, we show that the input does not form a metric space. Finally, we use these results to experimentally investigate the quality of our bound for single-level abstraction.
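
Several entries in this list refer to the dilated entropy function over a treeplex. As a hedged sketch of what that function computes, the code below evaluates a weighted sum, over decision points, of the negative entropy of the local behavior scaled by the probability of reaching that decision point, written in sequence-form coordinates. The two-level decision tree, the names, and the unit weights are hypothetical; in particular the weights are placeholders and are not the weighting schemes proposed in the papers.

```python
import math

# Hypothetical decision tree for one player: root decision "A" with actions
# a1 and a2; after a1 the player faces decision "B" with actions b1 and b2.
# "parent" is the sequence leading to the decision (None = empty sequence).
DECISIONS = {
    "A": {"parent": None, "actions": ["a1", "a2"]},
    "B": {"parent": ("A", "a1"), "actions": ["b1", "b2"]},
}

def dilated_entropy(x, weights):
    """Sum over decisions j and actions a of w_j * x[j, a] * log(x[j, a] / x[parent of j]).

    x is a sequence-form strategy: x[(j, a)] is the product of the player's
    action probabilities on the path to (j, a).
    """
    total = 0.0
    for j, info in DECISIONS.items():
        parent_mass = 1.0 if info["parent"] is None else x[info["parent"]]
        for a in info["actions"]:
            mass = x[(j, a)]
            if mass > 0.0:  # convention: 0 * log(0) = 0
                total += weights[j] * mass * math.log(mass / parent_mass)
    return total

# Uniform behavior at both decisions, expressed in sequence form.
x = {("A", "a1"): 0.5, ("A", "a2"): 0.5, ("B", "b1"): 0.25, ("B", "b2"): 0.25}
weights = {"A": 1.0, "B": 1.0}  # placeholder weights, an assumption

value = dilated_entropy(x, weights)
print(value)
```

With uniform play the root contributes log(1/2) and the second decision, reached with probability 1/2, contributes half of that, which shows how each local entropy term is scaled ("dilated") by its parent sequence.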

Drew Fudenberg - One of the best experts on this subject based on the ideXlab platform.

  • Steady State Learning and Nash Equilibrium
    Econometrica, 1993
    Co-Authors: Drew Fudenberg, David K Levine
    Abstract:

    We study the steady states of a system in which players learn about the strategies their opponents are playing by updating their Bayesian priors in light of their observations. Players are matched at random to play a fixed extensive-form game, and each player observes the realized actions in his own matches, but not the intended off-path play of his opponents or the realized actions in other matches. Because players are assumed to live finite lives, there are steady states in which learning continually takes place. If lifetimes are long and players are very patient, the steady state distribution of actions approximates that of a Nash equilibrium.

  • Steady State Learning and Nash Equilibrium
    1993
    Co-Authors: Drew Fudenberg, David K Levine
    Abstract:

    The authors study the steady states of a system in which players learn about the strategies their opponents are playing by updating their Bayesian priors in light of their observations. Players are matched at random to play a fixed extensive-form game, and each player observes the realized actions in his own matches but not the intended off-path play of his opponents or the realized actions in other matches. Because players are assumed to live finite lives, there are steady states in which learning continually takes place. If lifetimes are long and players are very patient, the steady state distribution of actions approximates that of a Nash equilibrium. Copyright 1993 by The Econometric Society. (This abstract was borrowed from another version of this item.)