path: root/paper/sections/negative.tex
author    Thibaut Horel <thibaut.horel@gmail.com>    2015-04-14 18:40:10 -0400
committer Thibaut Horel <thibaut.horel@gmail.com>    2015-04-14 18:40:10 -0400
commit    22e254972aaf4cbb08d4925695d344c2835d6aae (patch)
tree      787028782dce3c63b467cf23193bad5bfa36d0a9 /paper/sections/negative.tex
parent    070cd2840a2df8856f484cef6004d0240e1246d0 (diff)
download  learn-optimize-22e254972aaf4cbb08d4925695d344c2835d6aae.tar.gz
Model: definitions of optimizable and learnable
Diffstat (limited to 'paper/sections/negative.tex')
-rw-r--r--  paper/sections/negative.tex  73
1 file changed, 49 insertions, 24 deletions
diff --git a/paper/sections/negative.tex b/paper/sections/negative.tex
index 555aa1d..51ea1a3 100644
--- a/paper/sections/negative.tex
+++ b/paper/sections/negative.tex
@@ -1,32 +1,57 @@
\subsection{Model}
-Let $\Omega$ be the universe of elements and $f$ a function defined on subsets
-of $\Omega$: $f : S \in 2^{[\Omega]} \mapsto f(S) \in \mathbb{R}$. Let $K$ be a
-collection of sets of $2^{[\Omega]}$, which we call \emph{constraints}. Let
-$S^*_K$ be any solution to $\max_{S \in K} f(S)$, which we will also denote by
-$S^*$ when there is no ambiguity. Let $L$ be the problem size, which is often
-(but not always) equal to $|\Omega|$.
+We consider a submodular function $f$ defined over a ground set $N$ of size
+$n$. We assume that $f$ is non-negative and normalized:
+\begin{displaymath}
+ f:2^N\to \reals^+,\quad f(\emptyset) = 0
+\end{displaymath}
+
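+Recall that submodularity is the diminishing-returns property: for all
+$S\subseteq T\subseteq N$ and all $x\in N\setminus T$,
+\begin{displaymath}
+  f(S\cup\{x\}) - f(S) \geq f(T\cup\{x\}) - f(T).
+\end{displaymath}
+A canonical example is a coverage function: if each element $i\in N$ is
+associated with a set $A_i$, then $f(S) = \lvert\bigcup_{i\in S} A_i\rvert$
+is non-negative, normalized and submodular.
+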
+Given a subset $K$ of $2^N$ representing our constraints (e.g.\ $K$ is an
+independence system), we wish to approximate a solution $S^*$ of the
+following optimization problem:
+\begin{displaymath}
+ S^*\in\argmax_{S\in K} f(S)
+\end{displaymath}
+
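+A simple example is the cardinality constraint $K = \{S\subseteq N :
+\lvert S\rvert \leq k\}$ for some $k\leq n$; matroid and knapsack constraints
+are other standard choices, and all of these are independence systems
+(non-empty downward-closed families of sets).
+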
+Unlike the standard submodular optimization framework, we do not assume access
+to a value query oracle for $f$. Denoting by $U_K$ the uniform distribution
+over $K$, we instead assume that we observe polynomially many pairs
+$(S, f(S))$ where the sets $S$ are drawn independently from $U_K$; these
+observations are all that we see of $f$. The goal is to understand whether it
+is possible to optimize $f$ from these observations alone. This leads to the
+following definition.
-In general, we say we can efficiently optimize a function $f$ under constraint
-$K$ when we have a polynomial-time algorithm making adaptive value queries to
-$f$,which returns a set $S$ such that $S \in K$ and $f(S) \geq \alpha f(S^*)$
-with high probability and $\alpha$ an absolute constant.
+\begin{definition}
+ Let $\mathcal{F}$ be a family of functions $2^N\to\reals^+$. For
+ $\alpha:\reals^+\to\reals^+_*$ and constraints $K$, we say that
+ $\mathcal{F}$ is $(\alpha, K)$-optimizable iff for any $f\in\mathcal{F}$
+ and $\eps>0$, there exists $\ell = \mathrm{poly}(n,\frac{1}{\eps})$ and an
+ algorithm $A$ running in time $\mathrm{poly}(\ell)$ such that given random
+  inputs $\mathcal{S} = \{(S_i, f(S_i))\}_{1\leq i\leq \ell}$, where the
+  variables $S_1,\dots, S_\ell$ are drawn independently from $U_K$, $A$
+  outputs a set $S\in K$ such that:
+ \begin{displaymath}
+ \P_\mathcal{S}\big[f(S)\geq \alpha(\eps) f(S^*)\big]\geq 1-\eps
+ \end{displaymath}
+\end{definition}
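+
+For instance, when $\alpha$ is a constant function $\alpha(\eps) = c$, the
+definition asks for a polynomial-time algorithm which, given polynomially
+many feasible sets drawn from $U_K$ together with their values, outputs with
+probability $1-\eps$ a feasible set whose value is within a factor $c$ of the
+optimum. Note that $\ell$ is polynomial in $n$ whereas $K$ may be
+exponentially large (e.g.\ $\lvert K\rvert = \sum_{j\leq k}\binom{n}{j}$ for
+a cardinality constraint), so the samples need not contain any near-optimal
+set.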
-Here, we consider the scenario where we cannot make adaptive value queries, and
-in fact, where we cannot make queries at all! Instead, we suppose that we
-observe a polynomial number of set-value pairs $(S, f(S))$ where $S$ is taken
-from a known distribution $D$. We say we can efficiently \emph{passively
-optimize} $f$ under distribution $D$ or $D-$optimize $f$ under constraints $K$
-when, after observing ${\cal O}(L^c)$ set-value pairs from $D$ where $c > 0$ is
-an absolute constant, we can return a set $S$ such that $S \in K$ and $f(S)
-\geq \alpha f(S^*)$ with high probability and $\alpha$ an absolute constant.
+One of the goals of the present work is to compare this notion of $(\alpha,
+K)$-optimizability to the following notion of learnability.
-In the case of \emph{passive} observations of set-value pairs under a
-distribution $D$ for a function $f$, recent research has focused on whether we
-can efficiently and approximately \emph{learn} $f$. This was formalized in the
-PMAC model from \cite{balcan2011learning}. When thinking about passive
-optimization, it is necessary to understand the link between being able to
- $D-PMAC$ learn $f$ and being able to $D-$optimize $f$.
+\begin{definition}
+ \label{def:learnable}
+ Let $\mathcal{F}$ be a family of functions $2^N\to\reals^+$. For
+ $\alpha:\reals^+\to\reals^+_*$ and constraints $K$, we say that
+ $\mathcal{F}$ is $(\alpha, K)$-learnable iff for any $f\in\mathcal{F}$,
+ $\eps>0$ and $\delta>0$, there exists $\ell
+ = \mathrm{poly}(n,\frac{1}{\eps},\frac{1}{\delta})$ and an
+ algorithm $A$ such that given random inputs $\mathcal{S} = \{(S_i,
+  f(S_i))\}_{1\leq i\leq \ell}$, where the variables $S_1,\dots, S_\ell$ are
+ drawn independently from $U_K$, $A$ outputs a function $\tilde{f}$ such that:
+ \begin{displaymath}
+ \P_\mathcal{S}\left[\P_{S\sim U_K} [f(S)\geq \tilde{f}(S)\geq
+ \alpha(\eps) f(S)]\geq 1-\eps\right]\geq 1-\delta
+ \end{displaymath}
+\end{definition}
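+
+This definition mirrors the PMAC learning model of \cite{balcan2011learning}:
+with probability $1-\delta$ over the samples, the algorithm must output
+a surrogate $\tilde{f}$ satisfying $\alpha(\eps) f(S)\leq\tilde{f}(S)\leq
+f(S)$ on all but an $\eps$-fraction of the sets drawn from $U_K$; here the
+distribution over sets is tied to the constraints $K$ rather than arbitrary.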
\subsection{Learnable $\nRightarrow$ Optimizable}