Algorithmic Game Theory

game theory
- we want to mathematically model the interactions between rational agents
normal-form game … $(P,A,u)$ $(P, A, u)$
- $P$ … finite set of $n$ players
- $A=A_1\times\dots\times A_n$ $A = A_{1} \times \dots \times A_{n}$ … set of action profiles
  - $A$ … set of all possible states of the game
  - $A_i$ … set of actions available to player $i$
- $u=(u_1,\dots,u_n)$ $u = (u_{1}, \dots, u_{n})$
  - $u_i:A\to\mathbb R$ … utility function for player $i$
  - assigns happiness of each player to every possible state of the game
  - sometimes is used cost function instead, $c_i=-u_i$
- knowing the utility function, every player $i$ choose an action $a_i$ from $A_i$ (all players at the same time)
strategies
- prescription how player $i$ chooses his actions from $A_i$
- pure strategy $s_i$ $s_{i}$ of player $i$ $i$ is an action from $A_i$ $A_{i}$
  - "select a single action and play it"
  - pure-strategy profile … $(s_1,\dots,s_n)$ where $s_i\in A_i$ for each player $i$
- mixed strategy $s_i$ $s_{i}$ of player $i$ $i$ is a probability distribution over $A_i$ $A_{i}$
  - $\forall a_i\in A_i:s_i(a_i)\in[0,1]$ so that $\sum_{a_i\in A_i} s_i(a_i)=1$
  - $s_i(a_i)$ … probability that $i$ chooses $a_i$ as their action
  - $S_i$ … set of all mixed strategies of player $i$
  - mixed-strategy profile … $(s_1,\dots,s_n)$ where $s_i\in S_i$ for each player $i$
- every pure strategy is a special case of a mixed strategy
expected payoff
- in $G=(P,A,u)$ , the expected payoff for player $i$ of the mixed-strategy profile $s=(s_1,\dots,s_n)$ is $u_i(s)=\sum_{a=(a_1,\dots,a_n)\in A}u_i(a)\cdot \prod_{j=1}^n s_j(a_j)$
linearity of the expected payoff
- $u_i(s)=\sum_{a_i\in A_i}s_i(a_i)\cdot u_i(a_i;s_{-i})$
- notation
  - $s_{-i}=(s_1,\dots,s_{i-1},s_{i+1},\dots,s_n)$
  - $(s'_i;s_{-i})=(s_1,\dots,s_{i-1},s'_i,s_{i+1},\dots,s_n)$
normal form games – examples
- games for two players
- bimatrix games – we can use two matrices to express utilities
- prisoner's dilemma – testify × remain silent
  - paradoxically, the only stable solution is when both testify
- matching pennies
  - each player chooses heads or tails
  - if pennies match, player 1 wins, otherwise player 2 wins
  - zero-sum game (as well as rock-paper-scissors)
- battle of sexes (manželský spor)
  - (husband, wife)
  - football × opera
  - if they disagree → (0,0)
    - they attend the events separately
  - if they agree
    - (football, football) → (2,1)
    - (opera, opera) → (1,2)
  - there's not a best outcome/solution
- game of chicken (hra na zbabělce)
  - two drives drive towards each other on a collision course
  - turn × go straight
  - (turn, turn) → (0,0)
  - (turn, go straight) → (1,-1)
  - (go straight, turn) → (1,-1)
  - (go straight, go straight) → (-10,-10)
    - both die
best response of player $i$ $i$ to a strategy profile $s_{-i}$ $s_{- i}$
- mixed strategy $s^*_{i}$ such that $u_i(s_i^*;s_{-i})\geq u_i(s'_i;s_{-i})$ for each $s_i'\in S_i$
- if player $i$ knew the strategies of other players, the player would choose this one
- it maximizes his expected payoff if others play $s_{-i}$
Nash equilibrium
- a solution concept
- for a normal-form game $G=(P,A,u)$ of $n$ players, a Nash equilibrium (NE) in $G$ is a strategy profile (…)
remarks
- neither best responses nor Nash equilibria are determined uniquely
- rock-paper-scissors – unique mixed Nash equilibrium (both players play everything with probability 1/3)
- battle of sexes – three Nash equilibria, two pure and one mixed
- in every game, there is a Nash equilibrium
preparations for the proof of Nash's theorem
- the proof is essentially topological
- we need Brouwer fixed-point theorem
- df: for $d\in\mathbb N$ , a subset $X$ of $\mathbb R^d$ is compact $\equiv$ $X$ is closed and bounded
- df: $Y\subseteq\mathbb R^d$ $Y \subseteq R^{d}$ is convex $\equiv$ $\equiv$ every line segment containing two points from $Y$ $Y$ is fully contained in $Y$ $Y$
  - formally: $(\forall x,y\in Y)(\forall t\in[0,1]):tx+(1-t)y\in Y$
- df: for $n$ $n$ affinely independent point s $x_1,\dots,x_n\in\mathbb R^d$ $x_{1}, \dots, x_{n} \in R^{d}$ , an $(n-1)$ $(n - 1)$ -simplex $\Delta_n$ $Δ_{n}$ on $x_1,\dots,x_n$ $x_{1}, \dots, x_{n}$ is the set of convex combinations of the points $x_1,\dots,x_n$ $x_{1}, \dots, x_{n}$
  - each simplex is a compact convex set in $\mathbb R^d$
- lemma
  - (…)
  - “you cannot mess up compactness by using cartesian product”
- Brouwer's Fixed Point Theorem
  - $\forall d\in\mathbb N$ , let $K$ be a non-empty compact convex set in $\mathbb R^d$ and $f:K\to K$ be a continuous mapping
  - then, there exists a fixed point $x_0\in K$ for $f$ , that is, $f(x_0)=x_0$
  - https://www.youtube.com/watch?v=csInNn6pfT4&t=268s
proof of Nash's theorem
- let $G=(P,A,u)$ be a normal-form game of $n$ players
- $S_i$ … set of mixed strategies of player $i$
- we want to apply Brouwser's theorem, thus we need to find a suitable compact convex body $K$ and a continuous mapping $f:K\to K$ whose fixed points are NE in $G$
- we start with $K$ $K$ , let $K=S_1\times\dots\times S_n$ $K = S_{1} \times \dots \times S_{n}$ be the set of all mixed strategies
  - we verify that $K$ is compact and convex
  - by definition, each $S_i$ is a simplex which is compact and convex
  - by lemma, the set $K$ is compact
  - for any strategy profiles $s\in K$ and $s'\in K$ and a number $t\in [0,1]$
  - …
- …
Nash's theorem requires finite numbers of players and actions
- consider 2-player game “who guesses larger number wins”
the proof is non-constructive
- how to find NE efficiently?
- proofs of Brouwer's fixed point theorem are non-constructive as well
Pareto optimality
- another solution concept
- we want to capture “the best” state of a game, it might be difficult (as in battle of sexes)
- a strategy profile $s$ $s$ in $G$ $G$ Pareto dominates $s'$ $s^{'}$ , written $s'\prec s$ $s^{'} ≺ s$ if, for every player $i$ $i$ , $u_i(s)\geq u_i(s')$ $u_{i} (s) \geq u_{i} (s^{'})$ and there exists a player $j$ $j$ such that $u_j(s)\gt u_j(s')$ $u_{j} (s) > u_{j} (s^{'})$
  - …
- a strategy profile $s\in S$ $s \in S$ is Pareto optimal if there does not exist another strategy profile $s'\in S$ $s^{'} \in S$ that Pareto dominates $s$ $s$
  - in zero-sum games, all strategy profiles are Pareto-optimal
  - not all NE are Pareto-optimal (the NE in Prisoner's dilemma)
- Vilfredo Pareto
  - Pareto principle: for many outcomes, roughly 80% of consequences come from 20% of the causes
finding Nash equilibria
- let's start with two-player zero-sum games
  - rock paper scissors, chess, table tennis
- let $G=(P,A=A_1\times A_2,u)$ $G = (P, A = A_{1} \times A_{2}, u)$ be a zero-sum game
  - that is $u_1(a)+u_2(a)=0$ for every $a\in A$
  - we can use one payoff matrix $M$ where $M_{ij}=u_1(i,j)=-u_2(i,j)$
  - for a strategy profile $(s_1,s_2)$ , we write $x_i=s_1(i)$ and $y_j=s_2(j)$ representing $(s_1,s_2)$ with mixed strategy vectors $x=(x_1,\dots,x_m)$ and $y=(y_1,\dots,y_n)$ that satisfy $\sum x_i=1$ and $\sum y_j=1$
  - the expected payoff of player 1 then equals $u_1(s)=\sum_{a=(i,j)\in A}u_1(a)s_1(i)s_2(j)=\sum_{i}\sum_j M_{ij} x_iy_j=x^TMy=-u_2(s)$
- player's 2 best response to a strategy $x$ $x$ of 1 is a vector $y\in S_2$ $y \in S_{2}$ that minimizes $x^TMy$ $x^{T} M y$
  - $\beta(x)=\min_y x^TMy$ … best expected payoff of 2
- player's 1 best response to a strategy $y$ $y$ of 2 is a vector $x\in S_1$ $x \in S_{1}$ that maximizes $x^TMy$ $x^{T} M y$
  - $\alpha(y)=\max_x x^TMy$
- …
- the minimax theorem
  - for every zero-sum game, worst-case optimal strategies for both players exist and can be efficiently computed
  - there is a number $v$ such that, for any worst-case optimal strategies $x^*$ and $y^*$ , the strategy profile $(x^*,y^*)$ is a Nash equilibrium and $\beta(x^*)=(x^*)^TMy^*=\alpha(y^*)=v$
- remarks
  - it was proved by John von Neuman
  - it tells us everything about zero-sum games
  - there are no secrets in zero-sum games – each player can always choose the worst-case optimal strategy
  - original proof uses Brouwer's theorem, we will use linear programming
- proof of the minimax theorem
  - geometry preliminaries
    - hyperplane
    - halfspace
    - polyhedron – intersection of finitely many halfspaces
    - polytope – bounded polyhedron
    - $d$ -dimensional polytope is simple … all its vertices are adjacent to exactly $d$ edges
  - linear programming preliminaries
    - constraints, objective function
    - Simplex method, Ellipsoid method
    - duality
  - the proof
    - we'll use the duality
    - first, we find the best response, actually the worst-case best response
    - …
- bimatrix games
  - example: Prisoner's dilemma, collaborative projects
  - brute force algorithm (to find NE)
    - best response condition
  - best response polyhedra
    - labels
  - analysis of the NASH problem (finding nash equilibria in bimatrix games)
    - as a decision problem it is trivial – there always exists a Nash equilibrium
      - so it is not NP complete
    - FNP complexity class (“functional NP”)
    - clearly, NASH belongs to FNP
      - checking whether a strategy profile is NE can be done using the best response condition
    - theorem: if NASH is FNP-complete, then NP = coNP
      - proof is in the lecutre notes (it is not required)
    - therefore, it is unlikely that NASH is FNP-complete
    - new complexity class
    - END-OF-THE-LINE problem
      - the graph can be exponentially large with respect to the input size
    - class PPAD
other solution concepts
- $\varepsilon$ -NE
- correlated equilibrium
  - linear program
    - any feasible solution will be a correlated equilibrium
    - we can construct the objective function to maximize the general wellbeing
regret minimization of an agent in a hostile environment
- we have an agent $A$ in an adversary environment
- there are $N$ available actions for $A$ in the set $X=\set{1,\dots,N}$
- at each step $t=1,\dots,T$ $t = 1, \dots, T$
  - our agent selects a probability distribution corresponding to the probability of choosing each action $a\in A$
  - adversary chooses a loss vector, loss $\in [-1,1]$
- to be able to tell how well is our agent doing, we compare his loss to the loss of the best agent for some comparison class $\mathcal A$ $A$
  - we will mostly consider the class $\mathcal A_X=\set{A_i:i\in X}$ , where an agent $A_i$ always chooses action $i$
- external regret … $R^T_A=L^T_A-\min\set{L_B^T:B\in\mathcal A_X}=L_A^T-\min\set{L^T_i:i\in X}$
- for now, we will consider loss $\in\set{0,1}$
- it might seem that the class $\mathcal A_X$ $A_{X}$ contains too simple agents
  - however, we show that large comparison classes lead to a very large regret
  - let $\mathcal A_\text{all}$ be the set of agents that assign probability 1 to an arbitrary action from $X$ in every step
  - …
- algorithms
  - greedy
  - randomized greedy
  - polynomial weights
- no-regret dynamics
- modern proof of the minimax theorem
- coarse correlated equilibrium
  - CE is always CCE
  - but CCE does not imply CE
- hierarchy of Nash equilibria
- converging to CCE
- regrets
  - external regret
    - we compare to agents playing only one action
  - internal regret
    - we compare to agents that play the same actions as my agent (with same probabilities), only replacing one action by some other one
  - swap regret
    - the same as internal but we allow to replace multiple actions (while keeping the probabilities)
- we have polynomial weights algorithm for external regret
  - how to construct algorithm for swap regret?
  - we will use black-box reduction
games in extensive form
- chess … extensive form = tree of the game
- in perfect-information game, all players know the node they are in
- in imperfect-information game, they don't
  - we partition decision nodes into information sets where all nodes belong to the same player and have the same moves
- we can convert any normal-form game of two players to an imperfect-information game in extensive form
- pure, mixed and behavioral strategies
  - mixed strategy = probability distribution over vectors
  - behavioral strategy = vector of probability distributions
- in general, the expressive power of behavioral strategies and mixed strategies are incomparable – but there is a class of extensive games for which the two definitions coincide
  - games of perfect recall
- finding NE in extensive games
- sequence form

Mechanism design

single item auctions
- seller
- $n$ $n$ bidders
  - each has valuation $v_i$
  - each privately communicates a bid $b_i$ to the seller
- seller decides
  - who receives the item (if anyone)
  - the selling price $p$
- bidders' utility
  - $u_i=0$ … if the bidder lost the auction
  - $u_i=v_i-p$ $u_{i} = v_{i} - p$ … if the bidder won
    - we will eventually assume that $p\leq v_i$
- design
  - selling the item for free to the bidder with the highest bid
    - not a good idea
    - “who can name the highest number”
  - selling the item to the bidder with the highest bid for the price equal to the bid
    - does not work that well
    - what should be the bid? the bidder does not know what to do
  - dominant strategy for bidder $i$ … strategy that maximizes $u_i$ no matter what the other bidders do
  - social surplus … $\sum_i v_ix_i$ $\sum_{i} v_{i} x_{i}$
    - $x_i$ is the indicator of the bidder winning the auction
    - to maximize the social surplus for single item auction, we need the bidder with the highest valuation to win
  - an auction is awesome if it satisifies…
    - strong incentive guarantees
      - DSIC
    - strong performance guarantees
    - computational efficiency
  - there exists an awesome auction!
    - Vickrey's auction
    - the winner pays the second highest bid
single parameter environments
- $v_i$ … valuation “per unit of the goods”
Myerson's lemma
maximizing expected revenue
virtual valuation
- $\varphi_i(x)=x-\frac{1-F}{f_i(x)}$
- where $\frac{1-F}{f_i(x)}$ can be regarded as the loss of not knowing the exact valuation
Bulow-Klemperer theorem
back to social surplus: knapsack auctions
- $n$ bidders
- each bidder has
  - public size $w_i\geq 0$
  - private valuation $v_i\geq 0$
- seller has a capacity $W\geq 0$
- feasible set $X$ … 0/1-vectors such that $\sum_ix_iw_i\leq W$
- we cannot be awesome :(
multi-parameter mechanism design
- $n$ bidders
- $\Omega$ … finite set of outcomes
- $\forall \omega\in\Omega:$ each bidder has a private valuation $v_i(\omega)\geq 0$
- $\forall\omega\in\Omega:$ each bidder submits his bids $b_i(\omega)\geq 0$
- our goal is to design a mechanism that selects an outcome $\omega\in\Omega$ so that it maximizes the social surplus $\sum_iv_i(\omega)$
- bidder A winning and bidder B winning are two separate outcomes → bidders can have opinions about each other bidder winning the item as well
- example: single-item auction
  - $\Omega=\set{\omega_1,\dots,\omega_n,\omega_\emptyset}$
- VCG mechanism
  - states that we can do DSIC social surplus maximization
- note: allocation rule can assume that the bidders bid truthfully if we consider DSIC – but the payment formula has to provide DSIC
proof of Myerson's lemma
- payment difference sandwich
  - $y,z$ are not functions but only coefficients (bids)
  - $x_i(y;b_{-i})-x_i(z;b_{-i})$ $x_{i} (y; b_{- i}) - x_{i} (z; b_{- i})$ has to be non-positive
    - because $z\cdot()\leq y\cdot()$ and $y\lt z$
revelation principle
- is DSIC too restrictive?
- if there is a dominant strategy, we can assume that it is truthtelling