\documentclass[10pt]{article}
\usepackage{fullpage}
\usepackage{amsmath,amsfonts,amsthm}
\usepackage[english]{babel}
\usepackage[capitalize, noabbrev]{cleveref}
\usepackage{paralist} 

% these are compressed lists to help fit into a 1 page limit
\newenvironment{enumerate*}%
  {\vspace{-2ex} \begin{enumerate} %
     \setlength{\itemsep}{-1ex} \setlength{\parsep}{0pt}}%
  {\end{enumerate}}
 
\newenvironment{itemize*}%
  {\vspace{-2ex} \begin{itemize} %
     \setlength{\itemsep}{-1ex} \setlength{\parsep}{0pt}}%
  {\end{itemize}}
 
\newenvironment{description*}%
  {\vspace{-2ex} \begin{description} %
     \setlength{\itemsep}{-1ex} \setlength{\parsep}{0pt}}%
  {\end{description}}

\DeclareMathOperator{\E}{\mathbb{E}}
\let\Pr\relax
\DeclareMathOperator{\Pr}{\mathbb{P}}
\newcommand{\ex}[2][]{\E_{#1}\left[#2\right]}
\newcommand{\prob}[2][]{\Pr_{#1}\left[#2\right]}

\newcommand{\inprod}[1]{\left\langle #1 \right\rangle}
\newcommand{\R}{\mathbb{R}}
\newcommand{\eqdef}{\mathbin{\stackrel{\rm def}{=}}}
\newcommand{\llbracket}{[\![}

\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
\newtheorem*{claim}{Claim}

\author{Thibaut Horel \and Paul Tylkin}
\title{Economics 2099 Problem Set 2 -- Solutions}

\begin{document}

\maketitle
\section*{Exercise 4.13}

In the proof of the prophet inequality, a lower bound on APX is obtained by
conditioning on the event $\mathcal{E}_i$ that all agents different from $i$
have a value below the threshold $\hat{v}$.

If this threshold is the monopoly price of the maximum value distribution, then
it differs from the monopoly prices $\hat{v}_i$ of the individual agents.
However, we believe it is possible to lower bound the probability of this
event:
\begin{displaymath}
    \Pr[\mathcal{E}_i] \geq \frac{1}{2} \prod_{j} \Pr[v_j\geq\hat{v}_j]
\end{displaymath}
That is, we lose a factor of $2$ by going from discriminatory posted prices to an
anonymous posted price. This lower bound should follow from the regularity
assumption, which should imply that the distribution of the maximum
value is not too ``far'' from each individual distribution. However, we have not
been able to prove such a lower bound.

Given such a lower bound, we can finish the proof of the prophet inequality as
in the lecture notes.
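As a numerical sanity check of the conjectured lower bound (not a proof, and for a specific instance of our choosing: $n = 3$ i.i.d.\ $\mathrm{Exp}(1)$ values, which are regular with individual monopoly price $1$), both sides can be computed in closed form:

```python
import math

n = 3  # number of agents; our choice for this check

# Assumption: i.i.d. Exp(1) values, which are regular.  The monopoly
# price of Exp(1) maximizes p * e^{-p}, i.e. it equals 1, so
# Pr[v_j >= v_hat_j] = e^{-1} for every agent j.

# Monopoly price v_hat of the maximum-value distribution, by grid search:
# maximize p * Pr[max_i v_i >= p] = p * (1 - (1 - e^{-p})^n).
v_hat = max((p / 1000 for p in range(1, 5001)),
            key=lambda p: p * (1 - (1 - math.exp(-p)) ** n))

# Pr[E_i]: all n-1 agents other than i fall below v_hat.
lhs = (1 - math.exp(-v_hat)) ** (n - 1)

# Conjectured lower bound: (1/2) * prod_j Pr[v_j >= v_hat_j].
rhs = 0.5 * math.exp(-1) ** n

print(f"v_hat = {v_hat:.3f}, Pr[E_i] = {lhs:.3f} >= {rhs:.4f}")
assert lhs >= rhs
```

Of course this checks a single instance, so it is only evidence for the conjectured bound, not a proof.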

\section*{Exercise 4.14*}

\begin{claim} For any constant $\beta$, there is a matroid environment and an i.i.d.\ non-regular distribution such that the ratio of the optimal mechanism's revenue to the revenue of surplus maximization with an anonymous reserve is at least $\beta$.
\end{claim}

\begin{proof} 
Since we know that the claim is not true for a $k$-uniform matroid (where any
subset of at most $k$ agents is feasible), we need to look at a more general
matroid. We consider a transversal matroid with $n+1$ agents and $n$ items.

Let us denote by $\{0,\ldots,n\}$ the set of agents. By choosing the edges of
the bipartite graph appropriately, we can ensure that no set containing both
agent $0$ and another agent in $\{1,\ldots,n\}$ is feasible.

Now we can set the probability distribution so that the value $v_i$ of agent
$i$ is:
\begin{displaymath}
    v_i = \begin{cases}
        1& \text{w.p. } 1-\frac{1}{n}\\
        H& \text{w.p. } \frac{1}{n}
    \end{cases}
\end{displaymath}
We believe that by choosing $H$ appropriately (probably $H = \Omega(n)$), we can
prove that the multiplicative gap between the optimal mechanism and the surplus
maximization mechanism with an anonymous reserve price goes to infinity as $n$
goes to infinity. Unfortunately, it seems that a fine balance between the
$\frac{1}{n}$ probability and the growth rate of $H$ is required, and we have
not been able to find it.
\end{proof}
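As a small numeric illustration of the balance at play (our addition, not part of the argument above): with $n$ agents independently taking the value $H$ with probability $\frac{1}{n}$, the expected number of high-valued agents is exactly $1$, while the probability that at least one appears tends to $1-\frac{1}{e}$, so the revenue contribution of the high agents hinges on how fast $H$ grows with $n$:

```python
import math

# With n agents, each of value H with probability 1/n independently,
# the expected number of "high" agents is exactly n * (1/n) = 1, while
# the probability that at least one appears tends to 1 - 1/e.
for n in (10, 100, 1000):
    p_high = 1 - (1 - 1 / n) ** n  # Pr[at least one agent has value H]
    print(f"n = {n:4d}: Pr[some high agent] = {p_high:.4f}")

assert abs(p_high - (1 - math.exp(-1))) < 1e-3  # limit is 1 - 1/e
```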

\section*{Exercise 4.19}
\begin{claim} In matroid environments with regular value distributions, the revenue of the surplus maximization mechanism with monopoly reserves is a $2$-approximation to the revenue of the optimal mechanism.

\begin{proof} Let REF denote the optimal mechanism as well as its expected revenue, and let APX denote the surplus maximization mechanism with monopoly reserves as well as its expected revenue. As in the single-item case, we know that REF $\geq$ APX, and we want to show that REF $\leq$ 2APX. Let $I$ denote the set of agents allocated by the optimal mechanism and let $J$ denote the set of agents allocated by the surplus maximization mechanism with monopoly reserves. Then, $$REF = \E\left[\sum_{i \in I}\phi_i(v_i)\right] \quad\text{and}\quad APX = \E\left[\sum_{j \in J}\phi_j(v_j)\right].$$ By linearity of expectation, we can write this as $$REF = \sum_{i \in I}\E\left[\phi_i(v_i)\right] \quad\text{and}\quad APX = \sum_{j \in J}\E\left[\phi_j(v_j)\right].$$

Now, let $\psi: I \rightarrow J$ be a bijection with the following property: for every $i \in I$, the set $(I \setminus \{i\}) \cup \{\psi(i)\}$ is feasible. The existence of such a map is guaranteed by the \textit{matroid basis exchange property}; it says, basically, that $\psi(i) \in J$ is a possible replacement for $i \in I$.

Now, we can write the expected revenue of the optimal mechanism by conditioning, for each $i \in I$, on whether $i = \psi(i)$ (i.e., whether the agent $\psi(i)$ allocated by the surplus maximization mechanism is agent $i$ itself). So we can write $$REF = \underbrace{\sum_{\substack{i \in I}}\E[\phi_i(v_i) | i = \psi(i)] \Pr[i=\psi(i)]}_{REF_=} + \underbrace{\sum_{\substack{i \in I}}\E[\phi_i(v_i) | i \neq\psi(i)] \Pr[i\neq \psi(i)]}_{REF_{\neq}}.$$ We will prove the desired result by showing that both $REF_=$ and $REF_{\neq}$ are bounded from above by APX.

\begin{align}
REF_= &= \sum_{\substack{i \in I}}\E[\phi_i(v_i) | i = {\psi(i)}] \Pr[i={\psi(i)}] \\
&= \sum_{\substack{i \in I}}\E[\phi_{\psi(i)}(v_{\psi(i)}) | i = {\psi(i)}] \Pr[i={\psi(i)}] \\
&\leq  \sum_{\substack{i \in I}}\E[\phi_{\psi(i)}(v_{\psi(i)}) | i = {\psi(i)}] \Pr[i={\psi(i)}] +  \sum_{\substack{i \in I}}\E[\phi_{\psi(i)}(v_{\psi(i)}) | i \neq {\psi(i)}] \Pr[i\neq {\psi(i)}] \label{42}\\
&= \sum_{i\in I}\E\left[\phi_{\psi(i)}(v_{\psi(i)})\right] \\
&= \sum_{j\in J}\E\left[\phi_{j}(v_{j})\right] \label{bijection1}\\
&= APX.
\end{align}

Note that Equation $\ref{42}$ follows from Lemma 4.2 (i.e., from the monotonicity of the virtual value functions: agents allocated by APX meet their monopoly reserves, so the added term is nonnegative). Equation $\ref{bijection1}$ follows because $\psi$ is a bijection.

Next, we consider $REF_{\neq}$.

\begin{align}
REF_{\neq} &= \sum_{\substack{i \in I}}\E[\phi_i(v_i) | i \neq {\psi(i)}] \Pr[i\neq {\psi(i)}] \\
&\leq \sum_{\substack{i \in I}}\E[v_i | i \neq {\psi(i)}] \Pr[i\neq {\psi(i)}] \\
&\leq \sum_{\substack{i \in I}}\E[p_{\psi(i)}({\mathbf v}) | i \neq {\psi(i)}] \Pr[i\neq {\psi(i)}] \label{428} \\
&\leq \sum_{\substack{i \in I}}\E[p_{\psi(i)}({\mathbf v}) | i = {\psi(i)}] \Pr[i={\psi(i)}] +  \sum_{\substack{i \in I}}\E[p_{\psi(i)}({\mathbf v}) | i \neq {\psi(i)}] \Pr[i\neq {\psi(i)}] \label{429} \\
&= \sum_{i \in I} \E[p_{\psi(i)}({\mathbf v})] \\
&= \sum_{j \in J}\E[p_j({\mathbf v})] \label{bijection2}\\
&= APX.
\end{align}

Note that Equation $\ref{428}$ follows from Lemma 4.28, and Equation $\ref{429}$ follows from Theorem 4.29. Equation $\ref{bijection2}$ follows because $\psi$ is a bijection.

\end{proof}

\end{claim}
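As a Monte Carlo sanity check of the claim on the simplest matroid (a single item, i.e.\ a $1$-uniform matroid) with an instance of our choosing, $v_1 \sim U[0,1]$ and $v_2 \sim U[0,2]$, both regular with virtual values $\phi_1(v) = 2v-1$, $\phi_2(v) = 2v-2$ and monopoly reserves $r_1 = \frac{1}{2}$, $r_2 = 1$:

```python
import random

random.seed(1)

N = 200_000
ref = apx = 0.0
for _ in range(N):
    v1, v2 = random.random(), 2 * random.random()
    phi1, phi2 = 2 * v1 - 1, 2 * v2 - 2

    # REF: Myerson allocates to the highest nonnegative virtual value;
    # the winner pays the smallest value at which it would still win.
    if max(phi1, phi2) >= 0:
        if phi1 >= phi2:
            ref += (max(0.0, phi2) + 1) / 2  # phi_1^{-1}(max(0, phi_2))
        else:
            ref += (max(0.0, phi1) + 2) / 2  # phi_2^{-1}(max(0, phi_1))

    # APX: remove agents below their monopoly reserve, then run the
    # surplus maximization (second price) mechanism on the rest.
    eligible = [(v, r) for v, r in ((v1, 0.5), (v2, 1.0)) if v >= r]
    if eligible:
        eligible.sort(reverse=True)
        v_win, r_win = eligible[0]
        runner_up = eligible[1][0] if len(eligible) > 1 else 0.0
        apx += max(r_win, runner_up)

ref /= N
apx /= N
print(f"REF ~ {ref:.3f}, APX ~ {apx:.3f}")
assert apx <= ref + 0.01          # REF is optimal
assert ref <= 2 * apx + 0.01      # the claimed 2-approximation
```

On this instance the two mechanisms are in fact much closer than the factor of $2$ allowed by the claim; the bound is not tight here.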

\section*{Additional Problem}

We prove that $\ex[\mathbf{F}]{\max_i
v_i}\leq\frac{e}{e-1}\ex[\mathbf{F}^I]{\max_i v_i}$. We first prove it for
finite support distributions by reduction to Lemma 4.13 in the lecture notes.
We then extend the result to general distributions using a density argument
(general distributions can be approximated arbitrarily well by finite support
distributions).

\paragraph{Finite support distributions}
Let us first assume that $\mathbf{F}$ has finite support. We write $A_i$ for the
set of atoms of $\mathbf{F}_i$ (the $i$-th marginal of $\mathbf{F}$), so that
both $\mathbf{F}$ and $\mathbf{F}^I$ define a probability distribution over
$A_1\times\ldots\times A_n$. We now define $A$ to be the disjoint union of
$A_1$ to $A_n$:
\begin{displaymath}
    A = \coprod_{i=1}^n A_i
    = \big\{ (a, i)\, |\, a\in A_i,\; 1\leq i \leq n\big\}
\end{displaymath}

Note that there is a canonical injection from $A_1\times\ldots\times A_n$ to
$\mathcal{P}(A)$:
\begin{displaymath}
    (a_1,\ldots, a_n)\mapsto \big\{ (a_1, 1),\ldots, (a_n, n)\big\}
\end{displaymath}
so that $\mathbf{F}$ and $\mathbf{F}^I$ naturally define a probability
distribution over subsets of $A$:
\begin{displaymath}
    \Pr^A_{\mathbf{F}}\big( \{(a_1, i_1),\ldots,(a_k, i_k)\}\big) = 
    \begin{cases}
        \Pr_{\mathbf{F}} (a_1,\ldots, a_k)&\text{if } \{i_1,\ldots,i_k\}
    = \{1,\ldots, n\}\\
    0&\text{otherwise}
    \end{cases}
\end{displaymath}
and similarly for $\Pr^A_{\mathbf{F^I}}$. That is, the only subsets of $A$ which
have a non-zero probability are the ones which ``correspond'' to an $n$-tuple in
$A_1\times\ldots\times A_n$ (they are in the image of the canonical injection
given above).

Similarly, the function $\max_i v_i$ defined over $A_1\times\ldots\times A_n$
can be extended to subsets of $A$ by:
\begin{displaymath}
    g(\{(a_1, i_1),\ldots, (a_k, i_k)\}) \eqdef \max\{a_1,\ldots, a_k\}
\end{displaymath}

The consequence of this construction\footnote{This construction might appear
    a bit technical. What we would naturally do is consider the set of all
    atoms across all coordinates, but this would not work in cases where the
    same atom can appear at different coordinates. This is why we have to
    consider the disjoint union, with ``independent'' copies of the atoms.} is
    the following key property:
\begin{displaymath}
    \E_{S\sim\Pr_{\mathbf{F}}^A}[g(S)] = \E_{\mathbf{F}}[\max_i v_i]
\end{displaymath}
and similarly for $\mathbf{F}^I$. Now, we note that $g$ is
a maximum-weight-element set function, where the weight of $(a, i)\in A$ is
simply $a$. Hence we can apply Lemma 4.13 to $g$ for subsets of $A$ and obtain:
\begin{displaymath}
    \E_{\mathbf{F}}[\max_i v_i] = 
    \E_{S\sim\Pr_{\mathbf{F}}^A}[g(S)]
    \leq \frac{e}{e-1}
    \E_{S\sim\Pr_{\mathbf{F^I}}^A}[g(S)]=
    \frac{e}{e-1}\,\E_{\mathbf{F^I}}[\max_i v_i]
\end{displaymath}
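The constant $\frac{e}{e-1}$ is tight: on the classic example where one coordinate, chosen uniformly at random, equals $1$ and all others equal $0$ (so that every marginal is Bernoulli$(\frac{1}{n})$, $\E_{\mathbf{F}}[\max_i v_i] = 1$, and $\E_{\mathbf{F}^I}[\max_i v_i] = 1-(1-\frac{1}{n})^n$), the ratio approaches $\frac{e}{e-1}$. This can be checked exactly (our illustration, not part of the solution):

```python
import math

# F picks a coordinate uniformly at random and sets it to 1 (all others
# 0), so E_F[max_i v_i] = 1 exactly.  Under the independent product F^I,
# the max is 1 unless every coordinate is 0, so
# E_{F^I}[max_i v_i] = 1 - (1 - 1/n)^n -> 1 - 1/e.
bound = math.e / (math.e - 1)
for n in (2, 10, 1000):
    e_f = 1.0
    e_fi = 1 - (1 - 1 / n) ** n
    ratio = e_f / e_fi
    print(f"n = {n:4d}: E_F / E_FI = {ratio:.4f} (bound {bound:.4f})")
    assert ratio <= bound + 1e-9
```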

\paragraph{General distributions} We first look at a continuous distribution
$\mathbf{F}$ with bounded support. On the support of $\mathbf{F}$, it is
possible to approximate its CDF arbitrarily well in the $L^\infty$ norm (hence
in the $L^2$ norm, since we are on a bounded set) by staircase functions (convex
combinations of indicator functions). But an indicator function is simply the
CDF of a Dirac distribution. Hence, we can approximate $\mathbf{F}$ arbitrarily
well in the $L^2$ norm by finite support distributions. Moreover, convergence in
the $L^2$ norm implies weak convergence by the Cauchy--Schwarz inequality. That
is, we have a sequence $F_n$ of finite support distributions such that:
\begin{displaymath}
    \E_{F_n}[\max_i v_i] \xrightarrow[n\rightarrow \infty]{} \E_F[\max_i v_i]
\end{displaymath}
We can then apply the previous result to each of the $F_n$ and take the limit
to obtain:
\begin{displaymath}
    \E_{\mathbf{F}}[\max_i v_i] \leq \frac{e}{e-1}\E_{\mathbf{F^I}}[\max_i v_i]
\end{displaymath}

When $\mathbf{F}$ does not have bounded support, we can truncate $\mathbf{F}$
by considering its restriction to an increasing sequence of bounded rectangles
$K_m = \prod_{i=1}^n [-m, m]$. We can apply the previous result to each of the
truncated distributions and again take the limit as $m\rightarrow\infty$. The
validity of this last limit will depend on the exact technical assumptions we
make on $\mathbf{F}$ (basically, we need $\mathbf{F}$ to decay quickly outside
of bounded sets).

More generally, the previous approach applies as long as we are considering
a distribution $\mathbf{F}$ in a space of distributions in which finite
support distributions are dense for the weak topology. There are various
assumptions, with various levels of technicality, which imply such density; we
can choose one or another based on the assumptions we have on
$\mathbf{F}$.

\end{document}