Let $\Omega$ be the universe of elements and $f$ a function defined on subsets
of $\Omega$: $f : S \in 2^{[\Omega]} \mapsto f(S) \in \mathbb{R}$. Let
$K \subseteq 2^{[\Omega]}$ be a collection of sets, which we call \emph{constraints}. Let
$S^*_K$ be any solution to $\max_{S \in K} f(S)$, which we will also denote by
$S^*$ when there is no ambiguity. Let $L$ be the problem size, which is often
(but not always) equal to $|\Omega|$.

In general, we say that we can efficiently optimize a function $f$ under
constraints $K$ when there is a polynomial-time algorithm that makes adaptive
value queries to $f$ and returns, with high probability, a set $S \in K$ such
that $f(S) \geq \alpha f(S^*)$, where $\alpha$ is an absolute constant.

Here, we consider the scenario where we cannot make adaptive value queries, and
in fact, where we cannot make queries at all! Instead, we suppose that we
observe a polynomial number of set-value pairs $(S, f(S))$, where each $S$ is
drawn from a known distribution $D$. We say that we can efficiently
\emph{passively optimize} $f$ under distribution $D$, or \emph{$D$-optimize}
$f$, under constraints $K$ when, after observing $\mathcal{O}(L^c)$ set-value
pairs drawn from $D$, where $c > 0$ is an absolute constant, we can return,
with high probability, a set $S \in K$ such that $f(S) \geq \alpha f(S^*)$,
where $\alpha$ is an absolute constant.
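
To make the guarantee concrete, the passive optimization requirement can be
written out as follows. This is only a sketch: the sample size $m$, the
algorithm $\mathcal{A}$, the output $\hat{S}$, and the failure probability
$\delta$ are notation introduced here for illustration, and we assume that the
samples are drawn independently from $D$.
\[
  S_1, \dots, S_m \sim D, \quad m = \mathcal{O}(L^c), \qquad
  \hat{S} = \mathcal{A}\bigl((S_1, f(S_1)), \dots, (S_m, f(S_m))\bigr),
\]
\[
  \Pr\bigl[\hat{S} \in K \ \mbox{and}\ f(\hat{S}) \geq \alpha f(S^*)\bigr]
  \geq 1 - \delta .
\]
Here the probability is taken over the draw of the samples and any internal
randomness of $\mathcal{A}$. Unlike in the adaptive-query setting,
$\mathcal{A}$ sees $f$ only through the $m$ observed values and cannot choose
which sets are evaluated.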

Given \emph{passive} observations of set-value pairs drawn from a distribution
$D$ for a function $f$, recent research has focused on whether we can
efficiently and approximately \emph{learn} $f$. This question was formalized in
the PMAC model of \cite{balcan2011learning}. To reason about passive
optimization, we need to understand the link between being able to
$D$-PMAC learn $f$ and being able to $D$-optimize $f$.
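
For reference, the PMAC guarantee can be sketched as follows, as we recall it
from \cite{balcan2011learning}. The symbols $\tilde{f}$, $\beta$, $\epsilon$,
and $\delta$ are notation introduced here and may differ from the original
statement. A learner that observes polynomially many pairs $(S, f(S))$ with
$S \sim D$ outputs a function $\tilde{f}$ such that, with probability at least
$1 - \delta$ over the samples,
\[
  \Pr_{S \sim D}\bigl[\tilde{f}(S) \leq f(S) \leq \beta \tilde{f}(S)\bigr]
  \geq 1 - \epsilon ,
\]
where $\beta \geq 1$ is the approximation factor. This guarantee controls the
error of $\tilde{f}$ only on most sets drawn from $D$, whereas passive
optimization asks for a single set in $K$ whose value is close to $f(S^*)$.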