From 34e122cab94de7e727f5c9dd00d3c6f246cde30c Mon Sep 17 00:00:00 2001
From: Stratis Ioannidis
Date: Sat, 3 Nov 2012 17:04:43 -0700
Subject: prelims

---
 problem.tex | 126 ++++++++++++++++++++++++++++++------------------------------
 1 file changed, 63 insertions(+), 63 deletions(-)

diff --git a/problem.tex b/problem.tex
index a8b89f4..237894e 100644
--- a/problem.tex
+++ b/problem.tex
@@ -1,4 +1,4 @@
-\subsection{Optimal Experimental Design}
+\subsection{Experimental Design}
 The theory of experimental design \cite{pukelsheim2006optimal,atkinson2007optimum} studies how an experimenter should select the parameters of a set of experiments she is about to conduct. In general, the optimality of a particular design depends on the purpose of the experiment, \emph{i.e.}, the quantity the experimenter is trying to learn or the hypothesis she is trying to validate. Due to their ubiquity in statistical analysis, a large literature on the subject focuses on learning \emph{linear models}, whereby the experimenter wishes to fit a linear map to the data she has collected.
@@ -8,7 +8,7 @@ More precisely, putting cost considerations aside, suppose that an experimenter
 \end{align}
 where $\beta$ is a vector in $\reals^d$, commonly referred to as the \emph{model}, and $\varepsilon_i$ (the \emph{measurement noise}) are independent, normally distributed random variables with zero mean and variance $\sigma^2$.
-The purpose of these experiments is to allow the experimenter to estimate the model $\beta$. In particular, assuming Gaussian noise, the maximum likelihood estimator of $\beta$ is the \emph{least squares} estimator: for $X_S=[x_i]_{i\in S}\in \reals^{|S|\times d}$ the matrix of experiment features and
+The purpose of these experiments is to allow the experimenter to estimate the model $\beta$. In particular, under \eqref{model}, the maximum likelihood estimator of $\beta$ is the \emph{least squares} estimator: for $X_S=[x_i]_{i\in S}\in \reals^{|S|\times d}$ the matrix of experiment features and
 $y_S=[y_i]_{i\in S}\in\reals^{|S|}$ the observed measurements,
 \begin{align}
 \hat{\beta} &=\argmax_{\beta\in\reals^d}\prob(y_S;\beta) =\argmin_{\beta\in\reals^d} \sum_{i\in S}(\T{\beta}x_i-y_i)^2 \nonumber\\
 & = (\T{X_S}X_S)^{-1}\T{X_S}y_S\label{leastsquares}
 \end{align}
@@ -20,9 +20,9 @@ the noise terms $\varepsilon_i$). In particular, $\hat{\beta}$ has mean $\beta$
 (\emph{i.e.}, it is an \emph{unbiased estimator}) and covariance $(\T{X_S}X_S)^{-1}$.
-Let $V:2^\mathcal{N}\to\reals$ be a value function, quantifying how informative a set of experiments $S$ is in estimating $\beta$. The standard optimal experimental design problem amounts to finding a set $S$ that maximizes $V(S)$ subject to the constraint $|S|\leq k$.
+Let $V:2^\mathcal{N}\to\reals$ be a \emph{value function}, quantifying how informative a set of experiments $S$ is in estimating $\beta$. The standard optimal experimental design problem amounts to finding a set $S$ that maximizes $V(S)$ subject to the constraint $|S|\leq k$.
-A variety of different value functions are used in experimental design\cite{pukelsheim2006optimal}; almost all make use of the covariance $(\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. A value functioned preferred because of its relationship to entropy is the \emph{$D$-optimality criterion}: %which yields the following optimization problem
+A variety of different value functions are used in experimental design~\cite{pukelsheim2006optimal}; almost all make use of the covariance $(\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. A value function preferred because of its relationship to entropy is the \emph{$D$-optimality criterion}: %which yields the following optimization problem
 \begin{align}
 V(S) &= \frac{1}{2}\log\det \T{X_S}X_S \label{dcrit} %\\
 \end{align}
@@ -41,94 +41,94 @@ the uncertainty on $\beta$, as captured by the entropy of its estimator.
 Value function \eqref{dcrit} has several appealing properties. To begin with, it is a submodular set function (see Lemma~\ref{...} and Thm.~\ref{...}). In addition, the maximization of convex relaxations of this function is a well-studied problem \cite{boyd}. Note that \eqref{dcrit} is undefined when $\mathrm{rank}(\T{X_S}X_S)
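As a quick numerical illustration of the least-squares estimator \eqref{leastsquares} and its covariance, the following sketch (assuming numpy; the toy design, dimensions, and variable names are illustrative and not taken from problem.tex) fits the linear model of \eqref{model} on randomly generated experiments:

# Minimal sketch of the least-squares / maximum-likelihood estimator, assuming numpy.
# The design X_S, the model beta, and sigma = 1 are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 20                              # model dimension, number of experiments in S
beta = rng.normal(size=d)                 # "true" model
X_S = rng.normal(size=(n, d))             # experiment features x_i, stacked as rows
y_S = X_S @ beta + rng.normal(size=n)     # measurements y_i = beta^T x_i + eps_i, sigma = 1

# beta_hat = (X_S^T X_S)^{-1} X_S^T y_S, computed via a linear solve for stability
beta_hat = np.linalg.solve(X_S.T @ X_S, X_S.T @ y_S)

# With sigma = 1, the estimator is unbiased with covariance (X_S^T X_S)^{-1}
cov_beta_hat = np.linalg.inv(X_S.T @ X_S)

Averaging beta_hat over repeated draws of the noise recovers beta, matching the unbiasedness claim above.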
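Similarly, a small sketch of the $D$-optimality criterion \eqref{dcrit} (again assuming numpy; the candidate set, the seed, and the index choices are illustrative) also checks the diminishing-returns inequality behind the submodularity claim on one nested pair of sets:

# Sketch of the D-optimality value V(S) = 0.5 * log det(X_S^T X_S), assuming numpy.
# Returns -inf when X_S^T X_S is rank deficient, where the criterion is undefined.
import numpy as np

def V(X, S):
    X_S = X[sorted(S), :]
    sign, logdet = np.linalg.slogdet(X_S.T @ X_S)
    return 0.5 * logdet if sign > 0 else -np.inf

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))        # six candidate experiments, d = 2

S, T, i = {0, 1}, {0, 1, 2}, 3     # S is a subset of T; i is a further experiment
# Diminishing returns: adding i helps the smaller set at least as much as the larger one
assert V(X, S | {i}) - V(X, S) >= V(X, T | {i}) - V(X, T) - 1e-9

The same inequality holds for every nested pair of sets with full-rank Gram matrices, which is the content of the submodularity claim.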