\documentclass[12pt]{article} \usepackage[T1]{fontenc} \usepackage[utf8]{inputenc} \usepackage[hmargin=1.2in, vmargin=1.2in]{geometry} \usepackage{amsfonts} \title{Review of \emph{Complexity Theory, Game Theory and Economics} by Tim Roughgarden} %\author{Thibaut Horel} \begin{document} \vspace{-5em} \maketitle \section*{General impression} Very nice selection of topics, striking a nice balance between covering classical results and recent papers. The level of exposition feels right for a general theory audience. My feedback is fairly local and mostly consists in suggestions for improving the exposition and typo fixes. \section*{Solar lecture 1} \begin{itemize} \item Section 1.1: for organisational purposes, it might make sens to move the plan to the foreword, or a separated ``Introduction'' or ``Preface'' chapter. \item Section 1.2.3: I suggest also introducing the concept of \emph{pure strategy} when defining a \emph{mixed strategy} below the payoff matrix. Once this term is defined, I think it should be used instead of \emph{deterministic strategy} in later sections (or even lectures?) \item Proof of Theorem 1.1: I would add variables $v, x$ below the $\max$ in the first equation to make it clear what the maximization is over. Similarly, add $w, y$ below the $\min$ in the second equation. \item Section 1.2.6: I think ``competitive'' is more standard than ``oppositional'' in the contex of games. Similarly, ``collaborative'' instead of ``cooperative'' in footnote 8. Furthermore, I think the content of this section should be moved to the beginning of Section 1.4 to motivate the transition to bimatrix games. At the moment, it feels that it is breaking the flow. \item Section 1.3.1: the first paragraph is slightly confusing, because it is said that players only know their own payoffs (and not those of the other player) but the focus of Section 1.3 at this point seems to be zero-sum games. I would clarify that the notion of uncoupled dynamics as it is defined here is for arbitrary bimatrix games. \item Sections 1.3.1 and 1.3.2: the two framed boxes are slightly ill-defined for $t=1$ since there is no history then. I would maybe make $t=1$ an explicit ``initialization'' step and then say ``At time step $t\geq 2$''. \item Example 1.7: maybe add that $\eta=+\infty$ corresponds to fictitious play \emph{with uniformly random tie-breaking}. \item Example 1.7: in the framed box, behavior at $t=1$ is ill-defined, initialize with uniformly random strategy? \item Theorem 1.8: say in the theorem statement that this is for a choice of $\eta_t = \Theta(\sqrt{t})$. \item Theorem 1.8: I am curious about the implication of this theorem for the computation of a Nash equilibrium: how does it compare to generic solvers for linear programs (e.g. the ellipsoid method)? \item Theorem 1.10: say that $n=|\Lambda|$. \item Section 1.3.7: the first sentence is unnecessary, the proof seems complete to me. \item Example 1.13: If I am not mistaken, this game is commonly known as the ``Battle of the sexes'' game. Even though the name is questionable, it might be worth mentioning so that people can recognize it in other contexts. \end{itemize} \paragraph{Typos.} \begin{itemize} \item footnote 4 p. 8: welfare $\rightarrow$ warfare \item definition 1.2: right-hind $\rightarrow$ right-hand \item first paragraph of 1.3.1: a la $\rightarrow$ à la \end{itemize} \section*{Solar lecture 2} \begin{itemize} \item Section 2.3.2 Equation (2.2), I would simply write the first term as $x_1'^TAx_2$: this is easier to read and more consistent with the previous lecture. \item Lemma 2.5: I would significantly expand the proof of Sperner's lemma because at the moment it is really hard to even get an intuition of why it is true. After looking it up it seems that a complete proof can be given in a small amount of space: 1. first highlight that there has to be an odd number of bichromatic edges through which you enter the triangle (this is the dimension 1 case) 2. explain how you orient the edges and walk the graph 3. explain why you are guaranteed to end up at a sink node (sometimes you will exit the triangle instead of finding one, but this removes two bichromatic edges so you can iterate by induction). \item Section 2.5: this is a minor thing, but I found the roadmap hard to read and refer to later. I suggest maybe a bullet list with each item starting with ``Step $i$'' ($i\in[4]$) clearly identified in bold. Maybe using notations like $\epsilon$-BFP$\leq \epsilon$-NE although not fully accurate could help quickly follow the chain of reductions. Maybe a picture as well would help visualizing the chain of reductions? \item Section 2.6: I found the oracle model confusing: why is it useful to consider instances where the $pred$ and $succ$ functions might not be consistent with each other (i.e. where $succ(v) = w$ but $pred(w) \neq v$)? It seems that when this problem is used in the next lecture, it is only used for the promise version of the problem where $pred$ and $succ$ are guaranteed to be consistent. If it is not needed, then one could get rid of cases (iii) and (iv) in the framed box. If it is indeed needed, then I think some explanation of why would be nice. On the same point, the formulation is confusing because it says that (for example) $succ(v)$ is NULL if there is no successor, but later on in the same paragraph, we understand that it is allowed for $succ(v)=w$, even when $v$ has no successor as long as $pred(w)\neq v$. I think it is important to be very clear and distinguish between the ``true'' successor (or absence thereof) according to the real graph $G$ and the successor from the perspective of the $succ$ function which is allowed to be inconsistent with the graph. \end{itemize} \paragraph{Typos.} \begin{itemize} \item last paragraph of Section 2.1: Theorem 1.8 $\rightarrow$ Theorem 2.1 \item Theorem 2.2: space missing before $\mathbb{R}^d$ \end{itemize} \section*{Solar lecture 3} \begin{itemize} \item Section 3.1.2 and Section 3.1.4. An intuition which was useful for me when trying to understand these sections and is already implicitly there but might be worth making more explicit: \begin{itemize} \item $G$ is used to define a ``gradient field'' $g$ on the entire hypercube by extrapolation. The extrapolation is designed such that the sink nodes of $G$ and source nodes (different from $0^n$) correspond to stationary points of the gradient field. \item the function $f$ is defined such that finding a fixed point of $f$ is equivalent to finding a stationary point of the gradient field (this might be a very personal bias but I always think of $f$ as expressing a ``gradient ascent'' update: $x_{k+1} = x_k + g(x_k)$, for which fixed points correspond to stationary points). \item by factoring out the parameter $\delta$ out of the definition of $g$ and defining $f(x) = x + \delta \cdot g(x)$ we can now interpret $\delta$ as the ``step size'' of gradient ascent (in any case, factoring $\delta$ out would make things a bit easier to read). \item it also follows from this gradient ascent perspective: that one can indeed solve the EOL problem by ``following'' the gradient field. The reduction basically shows that one cannot beat gradient ascent to find stationary points\footnote{This also made me wonder whether lower bounds on the query complexity of finding stationary points and the fact that gradient descent is optimal under reasonable assumptions could have somehow been leveraged somewhere in these reductions}. \end{itemize} \item Section 3.1.3: I know it is out of scope of this section to give all the details, but I wish a little bit more could have been said about the ``soft'' classification of points to one of the three categories: how is the ``intermediate regime'' defined and why is it useful? Given that it is referred in Section 3.1.4, I feel knowing a little bit more would help. \item Section 3.1.4: I think reordering the cases and using a bit more math instead of plain English could make the description of $g$ easier to read: \begin{itemize} \item for directed edge $e = (u, v)\in G$ define $\overrightarrow{e} = \frac{\sigma(v)-\sigma(u)}{\|\sigma(v)-\sigma(u)\|}$ also denote by $\overrightarrow{d}$ a default direction. \item if $x$ is close to $\sigma(e)$ with $e\in G$, then $g(x)=\overrightarrow{e}$ \item if $x$ is close to $v$ with incoming edge $(u,v)$ and outgoing edge $(v, w)$, $g(x)$ interpolates between $\overrightarrow{uv}$ and $\overrightarrow{vw}$. \item if $x$ is close to a source or sink node $v$ with adjacent edge $e$, $g(x)$ interpolates between $0^n$ and $\overrightarrow{e}$. \item in all other cases $g(x)=\overrightarrow{d}$. \end{itemize} \item Section 3.2.2: small typo $H_\epsilon$ is a set of $(1/\epsilon)^d$ points (and not $(1/\epsilon)^n$). \end{itemize} \paragraph{Typos.}\begin{itemize} \item Section 3.1.3 footnote 3: details,, $\rightarrow$ details, \end{itemize} \end{document}