 results.tex | 51 ++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 48 insertions(+), 3 deletions(-)
diff --git a/results.tex b/results.tex
index 31801fc..0453d95 100644
--- a/results.tex
+++ b/results.tex
@@ -562,7 +562,11 @@ will be correct with high probability if $S$ is drawn from the same
 distribution as above.
 
 \paragraph{Claim 2:} $g$ is not optimizable under budget $\sqrt{n}$ because you
-never learn anything about $X$.
+never learn anything about $X$. The set you output will necessarily be random
+with respect to $X$: $\sqrt{n}|X \cap S| \approx
+n^{\frac{1}{3}}(\frac{\sqrt{n}}{n})\sqrt{n}$ and therefore $g(S) \approx
+\sqrt{n}$ with high probability. However, if we had been able to output $X$, we
+would have $g(X) = n^{\frac{5}{6}}$.
 
 \paragraph{Open Question:} Cook up a similar example where $g$ is submodular
 and where you are observing sets of the same size as your budget.
@@ -573,6 +577,12 @@ guarantees are sufficient for optimization.
 First look at learning weights in a cover function. Maybe facility location?
 Sums of concave over modular are probably too hard because of the connection to
 neural networks.
+\paragraph{Notes}
+
+Are you sure the example shouldn't have $n^{1/3}$ rather than $\sqrt{n}$?
+Otherwise a random set $S$ will be such that $\sqrt{n}|X \cap S| \approx
+\sqrt{n}\frac{\sqrt{n}}{2} = \frac{n}{2}$, which is roughly equal to $|S|$.
+
 \section{Which learning guarantees imply optimization?}
 
 Here, we consider the following question: which learning guarantees imply
@@ -597,8 +607,8 @@ c(x)|]$ where $\forall x, c(x) \equiv \sum_i w_i x_i$ and $h(x) \equiv \sum_i
 As it turns out, yes. Let $\forall i, w_i - \hat w_i \equiv \alpha_i$ and let
 $v(x) \equiv |h(x) - c(x)| = |\sum_{i} \alpha_i x_i |$. Consider the collection
-of good sets $\cal G \equiv \{S : v(S) < 4\epsilon\}$. We claim that $|{\cal G}|
-\geq \frac{3}{4}c\dot2^n$. Suppose the contrary, there is at least a quarter of
+of good sets ${\cal G} \equiv \{S : v(S) < 4\epsilon\}$. We claim that $|{\cal G}|
+\geq \frac{3}{4}\cdot2^n$. Suppose the contrary, there is at least a quarter of
 the sets which have value $v(S) > 4\epsilon$, so that $\mathbb{E}_{x \sim {\cal
 U}}[|v(x)|] \geq \frac{1}{2^n}\sum_{S \in {\cal G}^c} |v(S)| >
 \frac{1}{4}\cdot4\epsilon = \epsilon$, which is a contradiction. Consider element
@@ -610,6 +620,41 @@ S}_i^c$, $v(S\cup\{j\}) \geq \alpha_i - 4\epsilon$. It follows that
 $\mathbb{E}_{x \sim {\cal U}}[|v(x)|] \geq \frac{1}{4}(\alpha_j - 4\epsilon)$
 and therefore we have $\|w - \hat w\|_{\infty} \leq 8 \epsilon$.
 
+\section{Tricky coverage function example}
+
+Let $U = \{x_1, x_2\}$ be a universe of size $2$ and $w : U \to \mathbb{R}^+$ a
+weight function on $U$. Consider a family ${\cal A}$ of $n$ sets
+$A_1, A_2, \dots, A_n$ such that $A_1 \equiv \{x_1\}$, $A_2 \equiv \{x_2\}$, and
+$\forall i \neq 1, 2$, $A_i \equiv \{x_1, x_2\}$. Let $k \in \mathbb{R}^+$ be any
+positive constant and consider the family of coverage functions defined on $U$
+and the family of sets ${\cal A}$:
+$${\cal C} \equiv \{c : 2^{\cal A} \to \mathbb{R}^+ : \forall S \subseteq {\cal
+A}, c(S) = w(\cup_{A_i \in S} A_i) \text{ and } w(U) = k \}$$
+
+Note that any coverage function in this family is constant almost everywhere:
+$\forall S \neq \{A_1\}, \{A_2\}$, $c(S) = w(x_1) + w(x_2) = k$; the only
+exceptions are the two singleton subfamilies, for which $c(\{A_1\}) = w(x_1)$
+and $c(\{A_2\}) = w(x_2)$.
+
+It can be shown that this class of functions is PAC-learnable and
+PMAC-learnable. However, given any distribution ${\cal D}$ on $2^{\cal A}$ such
+that $\mathbb{P}(\{A_1\}) + \mathbb{P}(\{A_2\}) = o(n^{-c})$ for every constant
+$c$, for $m = \Omega(n^{c'})$ samples with $c' \in \mathbb{R}^+$ a constant,
+and for $\hat S$ returned by any one-shot optimization algorithm ${\cal B}$:
+$$\mathbb{P}_{S_1, \dots S_m \sim {\cal D}}\left(\mathbb{P}_{\hat S}
+\left(c(\hat S) \geq \frac{OPT}{n}\right) \geq 1 - \epsilon\right) \leq \delta$$
+
+This example is not fully satisfying for two reasons:
+\begin{itemize}
+  \item the coverage function is not monotone.
+  \item the sets we observe most do not satisfy the constraint.
+\end{itemize}
+Note that even if we observe sets which \emph{almost} satisfy the cardinality
+constraint, such as all sets of size 2, we would still have no hope of
+optimizing the function to better than a $\frac{1}{n}$ approximation factor.
+
+This example can be modified to address the second issue.
+
 \bibliography{optimize}
 \bibliographystyle{apalike}
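The Notes paragraph added above questions whether the Claim 2 estimate should be $n^{1/3}$ rather than $\sqrt{n}$. The definition of $g$ is not visible in this diff, so the sketch below checks only the displayed product $n^{\frac{1}{3}}(\frac{\sqrt{n}}{n})\sqrt{n}$, under the assumptions the hunk suggests ($|X| = n^{1/3}$, $|S| = \sqrt{n}$, $S$ uniform, so $\mathbb{E}|X \cap S| = |X||S|/n$); it says nothing about whether $g$ has additional terms.

```python
import math

def expected_overlap(n, x_size, s_size):
    """E|X ∩ S| for a uniformly random size-s_size subset S of an n-element
    ground set (hypergeometric mean)."""
    return x_size * s_size / n

def displayed_quantity(n):
    """sqrt(n) * E|X ∩ S| with |X| = n^(1/3) and |S| = sqrt(n), i.e. the
    product n^(1/3) * (sqrt(n)/n) * sqrt(n) displayed in the diff."""
    x_size = n ** (1 / 3)
    s_size = math.sqrt(n)
    return math.sqrt(n) * expected_overlap(n, x_size, s_size)

# If the quantity scales like n^alpha, then log(q)/log(n) -> alpha.
for n in [10**6, 10**9, 10**12]:
    q = displayed_quantity(n)
    print(n, q, math.log(q) / math.log(n))  # last column stays at ~1/3
```

Under these assumptions the displayed product itself scales as $n^{1/3}$, not $\sqrt{n}$, which is consistent with the question raised in the Notes; resolving it for $g$ requires the definition elided from this hunk.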
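The counting step in the learnability hunk ("at least a quarter of the sets ... which is a contradiction") is Markov's inequality: under the uniform distribution, at most a quarter of the $2^n$ sets can have $v$ exceed four times its mean. A toy check with a small hypothetical error vector $\alpha$ (the values below are illustrative, not from the source):

```python
from itertools import product

# v(x) = |sum_i alpha_i x_i| over the uniform distribution on {0,1}^n.
alphas = [0.5, -0.2, 0.9, 0.1, -0.4]  # hypothetical w - w_hat
n = len(alphas)

points = list(product([0, 1], repeat=n))
values = [abs(sum(a * xi for a, xi in zip(alphas, x))) for x in points]
eps = sum(values) / len(values)  # E_{x ~ U}[|v(x)|]

# Markov: P(v > 4*eps) <= E[v] / (4*eps) = 1/4, so the "good" sets
# {S : v(S) < 4*eps} number at least (3/4) * 2^n.
bad = sum(1 for v in values if v > 4 * eps)
assert bad <= len(values) / 4
```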
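The tricky coverage example is concrete enough to check by brute force. The sketch below (helper names are mine, not from the source) enumerates every subfamily of ${\cal A}$ for a small $n$ and confirms that $c$ equals $k$ everywhere except on the two singletons $\{A_1\}$ and $\{A_2\}$:

```python
from itertools import combinations

def make_coverage(n, w1, w2):
    """Coverage function from the example: universe U = {x1, x2},
    A_1 = {x1}, A_2 = {x2}, and A_i = {x1, x2} for every i >= 3."""
    family = {1: {"x1"}, 2: {"x2"}}
    for i in range(3, n + 1):
        family[i] = {"x1", "x2"}
    weight = {"x1": w1, "x2": w2}

    def c(S):
        # c(S) = w(union of the sets indexed by S)
        covered = set().union(*(family[i] for i in S)) if S else set()
        return sum(weight[x] for x in covered)

    return c

# Integer weights keep the check exact; here k = w(U) = w1 + w2 = 10.
n, w1, w2 = 6, 3, 7
c = make_coverage(n, w1, w2)

for size in range(1, n + 1):
    for S in combinations(range(1, n + 1), size):
        if S == (1,):
            assert c(S) == w1       # c({A_1}) = w(x_1)
        elif S == (2,):
            assert c(S) == w2       # c({A_2}) = w(x_2)
        else:
            assert c(S) == w1 + w2  # every other subfamily covers all of U
```

So the function is constant on all but two of the $2^n - 1$ nonempty subfamilies, which is why a distribution that rarely shows the singletons reveals nothing about $w(x_1)$ and $w(x_2)$ individually.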
