# Algorithmic Game Theory

- game theory
  - we want to mathematically model the interactions between rational agents
- normal-form game … $(P,A,u)$
  - $P$ … finite set of $n$ players
  - $A=A_1\times\dots\times A_n$ … set of action profiles
    - $A$ … set of all possible states of the game
    - $A_i$ … set of actions available to player $i$
  - $u=(u_1,\dots,u_n)$
    - $u_i:A\to\mathbb R$ … utility function of player $i$
      - assigns each player's happiness to every possible state of the game
      - sometimes a cost function $c_i=-u_i$ is used instead
  - knowing the utility functions, every player $i$ chooses an action $a_i$ from $A_i$ (all players at the same time)
- strategies
  - a prescription for how player $i$ chooses his actions from $A_i$
  - a pure strategy $s_i$ of player $i$ is an action from $A_i$
    - "select a single action and play it"
    - pure-strategy profile … $(s_1,\dots,s_n)$ where $s_i\in A_i$ for each player $i$
  - a mixed strategy $s_i$ of player $i$ is a probability distribution over $A_i$
    - $\forall a_i\in A_i:s_i(a_i)\in[0,1]$ such that $\sum_{a_i\in A_i} s_i(a_i)=1$
    - $s_i(a_i)$ … probability that $i$ chooses $a_i$ as their action
    - $S_i$ … set of all mixed strategies of player $i$
    - mixed-strategy profile … $(s_1,\dots,s_n)$ where $s_i\in S_i$ for each player $i$
  - every pure strategy is a special case of a mixed strategy
- expected payoff
  - in $G=(P,A,u)$, the expected payoff of player $i$ under the mixed-strategy profile $s=(s_1,\dots,s_n)$ is $$u_i(s)=\sum_{a=(a_1,\dots,a_n)\in A}u_i(a)\cdot \prod_{j=1}^n s_j(a_j)$$
  - linearity of the expected payoff
    - $u_i(s)=\sum_{a_i\in A_i}s_i(a_i)\cdot u_i(a_i;s_{-i})$
  - notation
    - $s_{-i}=(s_1,\dots,s_{i-1},s_{i+1},\dots,s_n)$
    - $(s'_i;s_{-i})=(s_1,\dots,s_{i-1},s'_i,s_{i+1},\dots,s_n)$
- normal-form games – examples
  - games for two players
    - bimatrix games – we can use two matrices to express the utilities
  - prisoner's dilemma – testify × remain silent
    - paradoxically, the only stable solution is when both testify
  - matching pennies
    - each player chooses heads or tails
    - if the pennies match, player 1 wins, otherwise player 2 wins
    - zero-sum game (as is rock-paper-scissors)
  - battle of the sexes
    - (husband, wife) … football × opera
    - if they disagree → (0,0) – they attend the events separately
    - if they agree
      - (football, football) → (2,1)
      - (opera, opera) → (1,2)
    - there is no single best outcome/solution
  - game of chicken
    - two drivers drive towards each other on a collision course: turn × go straight
    - (turn, turn) → (0,0)
    - (turn, go straight) → (-1,1)
    - (go straight, turn) → (1,-1)
    - (go straight, go straight) → (-10,-10) – both die
- best response of player $i$ to a strategy profile $s_{-i}$
  - a mixed strategy $s^*_{i}$ such that $u_i(s_i^*;s_{-i})\geq u_i(s'_i;s_{-i})$ for each $s_i'\in S_i$
  - if player $i$ knew the strategies of the other players, the player would choose this one – it maximizes his expected payoff if the others play $s_{-i}$
- Nash equilibrium
  - a solution concept
  - for a normal-form game $G=(P,A,u)$ of $n$ players, a Nash equilibrium (NE) in $G$ is a strategy profile $s=(s_1,\dots,s_n)$ in which $s_i$ is a best response to $s_{-i}$ for every player $i$
  - remarks
    - neither best responses nor Nash equilibria are determined uniquely
    - rock-paper-scissors – unique mixed Nash equilibrium (both players play every action with probability 1/3)
    - battle of the sexes – three Nash equilibria, two pure and one mixed
  - in every game, there is a Nash equilibrium (Nash's theorem)
- preparations for the proof of Nash's theorem
  - the proof is essentially topological – we need Brouwer's fixed-point theorem
  - df: for $d\in\mathbb N$, a subset $X$ of $\mathbb R^d$ is compact $\equiv$ $X$ is closed and bounded
  - df: $Y\subseteq\mathbb R^d$ is convex $\equiv$ every line segment connecting two points of $Y$ is fully contained in $Y$
    - formally: $(\forall x,y\in Y)(\forall t\in[0,1]):tx+(1-t)y\in Y$
  - df: for $n$ affinely independent points $x_1,\dots,x_n\in\mathbb R^d$, an $(n-1)$-simplex $\Delta_n$ on $x_1,\dots,x_n$ is the set of convex combinations of the points $x_1,\dots,x_n$
    - each simplex is a compact convex set in $\mathbb R^d$
  - lemma
    - the cartesian product of finitely many compact sets is compact
    - "you cannot mess up compactness by using the cartesian product"
  - Brouwer's fixed-point theorem
    - for every $d\in\mathbb N$, let $K$ be a non-empty compact convex set in $\mathbb R^d$ and $f:K\to K$ a continuous mapping
    - then there exists a fixed point $x_0\in K$ of $f$, that is, $f(x_0)=x_0$
    - https://www.youtube.com/watch?v=csInNn6pfT4&t=268s
- proof of Nash's theorem
  - let $G=(P,A,u)$ be a normal-form game of $n$ players
  - $S_i$ … set of mixed strategies of player $i$
  - we want to apply Brouwer's theorem, thus we need to find a suitable compact convex body $K$ and a continuous mapping $f:K\to K$ whose fixed points are NE in $G$
  - we start with $K$: let $K=S_1\times\dots\times S_n$ be the set of all mixed-strategy profiles
  - we verify that $K$ is compact and convex
    - by definition, each $S_i$ is a simplex, which is compact and convex
    - by the lemma, the set $K$ is compact
    - for any strategy profiles $s\in K$ and $s'\in K$ and a number $t\in [0,1]$
      - …
  - …
  - Nash's theorem requires finite numbers of players and actions
    - consider the 2-player game "whoever guesses the larger number wins"
  - the proof is non-constructive
    - how to find NE efficiently?
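As a concrete check of the definitions above, the following sketch (my own code; the names `expected_payoff` and `pure_payoffs` are not from the notes) evaluates the expected-payoff formula for rock-paper-scissors and uses the best-response property to confirm that the uniform mixed strategy is a Nash equilibrium:

```python
# Verifying that the uniform mixed strategy is a Nash equilibrium of
# rock-paper-scissors via the expected-payoff formula u_i(s).
# (Illustrative sketch; names are mine, not from the lecture notes.)

# Player 1's payoff matrix M: M[i][j] = u_1(i, j); the game is zero-sum,
# so u_2(i, j) = -M[i][j]. Rows/columns: rock, paper, scissors.
M = [[0, -1, 1],
     [1, 0, -1],
     [-1, 1, 0]]

def expected_payoff(x, y, M):
    """u_1(s) = sum over action profiles a of u_1(a) * s_1(a_1) * s_2(a_2)."""
    return sum(M[i][j] * x[i] * y[j] for i in range(3) for j in range(3))

uniform = [1/3, 1/3, 1/3]

# Best-response condition: against the opponent's uniform strategy, every
# pure action yields the same expected payoff (0), so any mixture --
# including uniform itself -- is a best response; hence (uniform, uniform)
# is a Nash equilibrium.
pure_payoffs = [expected_payoff([1.0 if k == i else 0.0 for k in range(3)],
                                uniform, M)
                for i in range(3)]
print(expected_payoff(uniform, uniform, M), pure_payoffs)
```

The same brute-force evaluation over all pure deviations is exactly how one checks the best-response condition in small bimatrix games.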
- proofs of Brouwer's fixed-point theorem are non-constructive as well
- Pareto optimality
  - another solution concept
  - we want to capture "the best" state of a game, which might be difficult (as in the battle of the sexes)
  - a strategy profile $s$ in $G$ Pareto dominates $s'$, written $s'\prec s$, if $u_i(s)\geq u_i(s')$ for every player $i$ and there exists a player $j$ such that $u_j(s)\gt u_j(s')$
  - …
  - a strategy profile $s\in S$ is Pareto optimal if there does not exist another strategy profile $s'\in S$ that Pareto dominates $s$
  - in zero-sum games, all strategy profiles are Pareto optimal
  - not all NE are Pareto optimal (e.g., the NE in the prisoner's dilemma)
  - Vilfredo Pareto
    - Pareto principle: for many outcomes, roughly 80% of consequences come from 20% of the causes
- finding Nash equilibria
  - let's start with two-player zero-sum games
    - rock-paper-scissors, chess, table tennis
  - let $G=(P,A=A_1\times A_2,u)$ be a zero-sum game, that is, $u_1(a)+u_2(a)=0$ for every $a\in A$
  - we can use a single payoff matrix $M$ where $M_{ij}=u_1(i,j)=-u_2(i,j)$
  - for a strategy profile $(s_1,s_2)$, we write $x_i=s_1(i)$ and $y_j=s_2(j)$, representing $(s_1,s_2)$ by mixed-strategy vectors $x=(x_1,\dots,x_m)$ and $y=(y_1,\dots,y_n)$ that satisfy $\sum x_i=1$ and $\sum y_j=1$
  - the expected payoff of player 1 then equals $u_1(s)=\sum_{a=(i,j)\in A}u_1(a)s_1(i)s_2(j)=\sum_{i}\sum_j M_{ij} x_iy_j=x^TMy=-u_2(s)$
  - player 2's best response to a strategy $x$ of player 1 is a vector $y\in S_2$ that minimizes $x^TMy$
    - $\beta(x)=\min_y x^TMy$ … the best player 2 can achieve against $x$ (minimizing player 1's payoff $x^TMy$ maximizes player 2's payoff $-x^TMy$)
  - player 1's best response to a strategy $y$ of player 2 is a vector $x\in S_1$ that maximizes $x^TMy$
    - $\alpha(y)=\max_x x^TMy$
  - …
- the minimax theorem
  - for every zero-sum game, worst-case optimal strategies for both players exist and can be efficiently computed
  - there is a number $v$ such that, for any worst-case optimal strategies $x^*$ and $y^*$, the strategy profile $(x^*,y^*)$ is a Nash equilibrium and $\beta(x^*)=(x^*)^TMy^*=\alpha(y^*)=v$
  - remarks
    - it was proved by John von Neumann
    - it tells us everything about zero-sum games
    - there are no secrets in zero-sum games – each player can always choose the worst-case optimal strategy
    - the original proof uses Brouwer's theorem; we will use linear programming
- proof of the minimax theorem
  - geometry preliminaries
    - hyperplane, halfspace
    - polyhedron – intersection of finitely many halfspaces
    - polytope – bounded polyhedron
    - a $d$-dimensional polytope is simple $\equiv$ all its vertices are adjacent to exactly $d$ edges
  - linear programming preliminaries
    - constraints, objective function
    - simplex method, ellipsoid method
    - duality
  - the proof
    - we'll use duality
    - first, we find the best response – actually, the worst-case best response
    - …
- bimatrix games
  - examples: prisoner's dilemma, collaborative projects
  - brute-force algorithm (to find NE)
  - best response condition
  - best response polyhedra
  - labels
- analysis of the NASH problem (finding Nash equilibria in bimatrix games)
  - as a decision problem it is trivial – there always exists a Nash equilibrium, so it is not NP-complete
  - FNP complexity class ("functional NP")
    - clearly, NASH belongs to FNP – checking whether a strategy profile is a NE can be done using the best response condition
  - theorem: if NASH is FNP-complete, then NP = coNP
    - the proof is in the lecture notes (it is not required)
    - therefore, it is unlikely that NASH is FNP-complete
  - new complexity class
    - END-OF-THE-LINE problem
      - the graph can be exponentially large with respect to the input size
    - class PPAD
- other solution concepts
  - $\varepsilon$-NE
  - correlated equilibrium
    - linear program – any feasible solution will be a correlated equilibrium
    - we can construct the objective function to maximize the general wellbeing
- regret minimization of an agent in a hostile environment
  - we have an agent $A$ in an adversarial environment
  - there are $N$ available actions for $A$ in the set $X=\set{1,\dots,N}$
  - at each step $t=1,\dots,T$
    - our agent selects a probability distribution over the actions in $X$
    - the adversary chooses a loss vector with entries in $[-1,1]$
  - to tell how well our agent is doing, we compare its loss to the loss of the best agent from some comparison class $\mathcal A$
    - we will mostly consider the class $\mathcal A_X=\set{A_i:i\in X}$, where the agent $A_i$ always chooses action $i$
  - external regret … $R^T_A=L^T_A-\min\set{L_B^T:B\in\mathcal A_X}=L_A^T-\min\set{L^T_i:i\in X}$
  - for now, we will consider losses in $\set{0,1}$
  - it might seem that the class $\mathcal A_X$ contains agents that are too simple
    - however, we show that large comparison classes lead to a very large regret
    - let $\mathcal A_\text{all}$ be the set of agents that assign probability 1 to an arbitrary action from $X$ in every step
    - …
  - algorithms
    - greedy
    - randomized greedy
    - polynomial weights
  - no-regret dynamics
    - modern proof of the minimax theorem
  - coarse correlated equilibrium
    - a CE is always a CCE, but a CCE does not have to be a CE
    - hierarchy of Nash equilibria
    - converging to CCE
  - regrets
    - external regret – we compare to agents playing only one action
    - internal regret – we compare to agents that play the same actions as our agent (with the same probabilities), only replacing one action by some other one
    - swap regret – the same as internal, but we allow replacing multiple actions (while keeping the probabilities)
  - we have the polynomial weights algorithm for external regret – how to construct an algorithm for swap regret?
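The polynomial weights algorithm mentioned above can be sketched as follows (my own code; the toy random adversary and the name `polynomial_weights` are not from the notes, and the notes' setting of losses in $\{0,1\}$ is assumed):

```python
# Minimal sketch of the polynomial weights algorithm for external regret.
# Assumptions (mine): losses in {0, 1}, a fixed random loss sequence as the
# "adversary", and eta tuned as sqrt(ln N / T).
import math
import random

def polynomial_weights(loss_vectors, eta):
    """Keep one weight per action; play proportionally to the weights, then
    multiply the weight of action i by (1 - eta * loss_i) after each step.
    Returns the algorithm's total expected loss."""
    n = len(loss_vectors[0])
    weights = [1.0] * n
    total_loss = 0.0
    for losses in loss_vectors:
        w_sum = sum(weights)
        probs = [w / w_sum for w in weights]  # mixed strategy at this step
        total_loss += sum(p * l for p, l in zip(probs, losses))
        weights = [w * (1 - eta * l) for w, l in zip(weights, losses)]
    return total_loss

# Toy adversary: action 0 incurs loss 1 with prob. 0.4, action 1 with prob. 0.6.
random.seed(0)
T, N = 1000, 2
seq = [[int(random.random() < 0.4), int(random.random() < 0.6)] for _ in range(T)]
best_fixed = min(sum(step[i] for step in seq) for i in range(N))
eta = math.sqrt(math.log(N) / T)  # usual tuning, giving regret O(sqrt(T log N))
regret = polynomial_weights(seq, eta) - best_fixed
print(regret)  # sublinear in T: the guarantee here is roughly 2*sqrt(T ln N)
```

Running the same dynamics for all players of a game is exactly the no-regret dynamics whose time-averaged play converges to a coarse correlated equilibrium.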
  - we will use a black-box reduction
- games in extensive form
  - chess … extensive form = tree of the game
  - in a perfect-information game, all players know the node they are in; in an imperfect-information game, they don't
    - we partition decision nodes into information sets where all nodes belong to the same player and have the same moves
  - we can convert any normal-form game of two players into an imperfect-information game in extensive form
  - pure, mixed, and behavioral strategies
    - mixed strategy = probability distribution over vectors
    - behavioral strategy = vector of probability distributions
    - in general, the expressive powers of behavioral strategies and mixed strategies are incomparable – but there is a class of extensive games for which the two definitions coincide: games of perfect recall
  - finding NE in extensive games
    - sequence form

## Mechanism design

- single-item auctions
  - a seller and $n$ bidders
    - each bidder has a valuation $v_i$
    - each privately communicates a bid $b_i$ to the seller
  - the seller decides
    - who receives the item (if anyone)
    - the selling price $p$
  - bidders' utilities
    - $u_i=0$ … if the bidder lost the auction
    - $u_i=v_i-p$ … if the bidder won
    - we will eventually assume that $p\leq v_i$
  - design
    - selling the item for free to the bidder with the highest bid
      - not a good idea – "who can name the highest number"
    - selling the item to the bidder with the highest bid for a price equal to the bid
      - does not work that well – what should the bid be? the bidder does not know what to do
  - dominant strategy of bidder $i$ … a strategy that maximizes $u_i$ no matter what the other bidders do
  - social surplus … $\sum_i v_ix_i$, where $x_i$ is the indicator of bidder $i$ winning the auction
    - to maximize the social surplus in a single-item auction, we need the bidder with the highest valuation to win
  - an auction is awesome if it satisfies
    - strong incentive guarantees – DSIC
    - strong performance guarantees
    - computational efficiency
  - there exists an awesome auction!
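The incentive guarantee can be made concrete with a small sketch (my own code; the function name and the numbers are illustrative): in an auction where the winner pays the highest *competing* bid, no deviation from truthful bidding ever helps, which is the DSIC property:

```python
# DSIC illustration: with the second-price payment rule, bidding one's true
# valuation is a weakly dominant strategy. (Sketch; names and values are mine.)

def second_price_utility(my_bid, my_value, other_bids):
    """Quasilinear utility: v_i - p if the bidder wins (ties lose here),
    and 0 if the bidder loses; the price p is the highest competing bid."""
    top_other = max(other_bids)
    if my_bid > top_other:
        return my_value - top_other  # winner pays the second-highest bid
    return 0.0

my_value = 7.0
others = [3.0, 5.0, 6.0]
truthful = second_price_utility(my_value, my_value, others)
# Sweep over deviating bids 0.0, 0.5, ..., 20.0: none beats truthful bidding.
deviations = [second_price_utility(b / 2, my_value, others) for b in range(41)]
print(truthful, max(deviations))  # 1.0 1.0
```

Overbidding only risks winning at a price above the valuation (negative utility), and underbidding only risks losing an item worth more than its price; the payment never depends on the winner's own bid.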
- Vickrey's auction
  - the winner pays the second-highest bid
- single-parameter environments
  - $v_i$ … valuation "per unit of the goods"
- Myerson's lemma
- maximizing expected revenue
  - virtual valuation
    - $\varphi_i(x)=x-\frac{1-F_i(x)}{f_i(x)}$
    - where $\frac{1-F_i(x)}{f_i(x)}$ can be regarded as the loss from not knowing the exact valuation
  - Bulow–Klemperer theorem
- back to social surplus: knapsack auctions
  - $n$ bidders, each with
    - a public size $w_i\geq 0$
    - a private valuation $v_i\geq 0$
  - the seller has a capacity $W\geq 0$
  - feasible set $X$ … 0/1-vectors such that $\sum_ix_iw_i\leq W$
  - we cannot be awesome :(
- multi-parameter mechanism design
  - $n$ bidders
  - $\Omega$ … finite set of outcomes
  - $\forall \omega\in\Omega:$ each bidder has a private valuation $v_i(\omega)\geq 0$
  - $\forall\omega\in\Omega:$ each bidder submits a bid $b_i(\omega)\geq 0$
  - our goal is to design a mechanism that selects an outcome $\omega\in\Omega$ maximizing the social surplus $\sum_iv_i(\omega)$
  - bidder A winning and bidder B winning are two separate outcomes → bidders can also have opinions about each other bidder winning the item
  - example: single-item auction – $\Omega=\set{\omega_1,\dots,\omega_n,\omega_\emptyset}$
- VCG mechanism
  - states that we can do DSIC social-surplus maximization
  - note: the allocation rule can assume that the bidders bid truthfully if we consider DSIC – but the payment formula has to provide DSIC
- proof of Myerson's lemma
  - payment difference sandwich
    - $y,z$ are not functions but only coefficients (bids)
    - $x_i(y;b_{-i})-x_i(z;b_{-i})$ has to be non-positive, because $z\cdot()\leq y\cdot()$ and $y\lt z$
- revelation principle
  - is DSIC too restrictive?
  - if there is a dominant strategy, we can assume that it is truthtelling
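The virtual valuation formula above can be made concrete with a worked example (my own, not from the notes): for valuations drawn uniformly from $[0,1]$ we have $F(x)=x$ and $f(x)=1$, so $\varphi(x)=x-\frac{1-x}{1}=2x-1$.

```python
# Worked example: virtual valuation phi_i(x) = x - (1 - F_i(x)) / f_i(x)
# for the uniform distribution on [0, 1], where F(x) = x and f(x) = 1,
# giving phi(x) = 2x - 1. (Sketch; the defaults below encode Uniform[0,1].)

def virtual_valuation(x, F=lambda t: t, f=lambda t: 1.0):
    """phi_i(x) = x - (1 - F_i(x)) / f_i(x)."""
    return x - (1 - F(x)) / f(x)

# phi is strictly increasing here (the "regular" case), so the
# revenue-maximizing single-item auction is a second-price auction with
# reserve price phi^{-1}(0) = 1/2.
print(virtual_valuation(0.5), virtual_valuation(0.75))  # 0.0 0.5
```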