section 3

author: Stratis Ioannidis <stratis@stratis-Latitude-E6320.(none)> 2013-02-10 23:48:32 -0800
committer: Stratis Ioannidis <stratis@stratis-Latitude-E6320.(none)> 2013-02-10 23:48:32 -0800
commit: e788f78d909b7ecfaf11cd14eee15f1ae5ba1ff4 (patch)
tree: be94c1d13099a317472dc99005dc20bb7b3f6e5e
parent: c2c1e8eda0d8fe8f87a5b6b49a5eebdc614c4d2b (diff)
download: recommendation-e788f78d909b7ecfaf11cd14eee15f1ae5ba1ff4.tar.gz
1 files changed, 78 insertions, 42 deletions
diff --git a/problem.tex b/problem.tex
index 6027379..cec7b47 100644
--- a/problem.tex
+++ b/problem.tex
@@ -8,15 +8,22 @@ Putting cost considerations aside, suppose that an experimenter \E\ wishes to co
 \end{align}
 where $\beta$ is a vector in $\reals^d$, commonly referred to as the \emph{model}, and $\varepsilon_i$ (the \emph{measurement noise}) are independent, normally distributed random  variables with mean 0 and variance $\sigma^2$. 
 
+For example, each $i$ may correspond to a human subject; the
+feature vector $x_i$ may correspond to a normalized vector of her age, weight,
+gender, income, \emph{etc.}, and the measurement $y_i$ may capture some
+biometric information (\emph{e.g.}, her red cell blood count, a genetic marker,
+etc.). The magnitude of the coefficient $\beta_i$ captures the effect that feature $i$ has on the measured variable, and its sign captures whether the corellation is positive or negative.
+
+
 The purpose of these experiments is to allow \E\  to estimate the model $\beta$. In particular,
  assume that the experimenter  \E\ has a {\em prior}
 distribution on $\beta$, \emph{i.e.},  $\beta$ has a multivariate normal prior
 with zero mean  and covariance $\sigma^2R^{-1}\in \reals^{d^2}$ (where $\sigma^2$ is the noise variance). 
 Then, \E\ estimates $\beta$ through \emph{maximum a posteriori estimation}: \emph{i.e.}, finding the parameter which maximizes the posterior distribution of $\beta$ given the observations $y_S$. Under the linearity assumption \eqref{model} and the Gaussian prior on $\beta$, maximum a posteriori estimation leads to the following maximization \cite{hastie}: 
-\begin{displaymath}
-    \hat{\beta} = \argmin_{\beta\in\reals^d} \big(\sum_{i\in S} (y_i - \T{\beta}x_i)^2
+\begin{align}
+    \hat{\beta} = \argmax_{\beta\in\reals^d} \prob(\beta\mid y_S) =\argmin_{\beta\in\reals^d} \big(\sum_{i\in S} (y_i - \T{\beta}x_i)^2
     + \T{\beta}R\beta\big) = (R+\T{X_S}X_S)^{-1}X_S^Ty_S \label{ridge}
-\end{displaymath}
+\end{align}
 where $X_S=[x_i]_{i\in S}\in \reals^{|S|\times d}$ is the matrix of experiment features and
 $y_S=[y_i]_{i\in S}\in\reals^{|S|}$ the observed measurements. 
 This optimization, commonly known as \emph{ridge regression}, includes an additional quadratic penalty term compared to the standard least squares estimation.
@@ -31,13 +38,12 @@ the noise terms $\varepsilon_i$). In particular, $\hat{\beta}$ has %mean $\beta$
  covariance
 $(R+\T{X_S}X_S)^{-1}$.
 
-Let $V:2^\mathcal{N}\to\reals$ be a \emph{value function}, quantifying how informative a set of experiments $S$ is in estimating $\beta$. The standard optimal experimental design problem amounts to finding a set $S$ that maximizes $V(S)$ subject to the constraint $|S|\leq k$. 
-
-A variety of different value functions are used in experimental design~\cite{pukelsheim2006optimal}; almost all make use of the covariance $(R+\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. A value function that has natural advantages is the \emph{information gain}:  %\emph{$D$-optimality criterion}: %which yields the following optimization problem
+Let $V:2^\mathcal{N}\to\reals$ be a \emph{value function}, quantifying how informative a set of experiments $S$ is in estimating $\beta$. The classical experimental design problem amounts to finding a set $S$ that maximizes $V(S)$ subject to the constraint $|S|\leq k$. 
+A variety of different value functions are used in literature~\cite{pukelsheim2006optimal}; almost all make use of the covariance $(R+\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. A value function that has natural advantages is the \emph{information gain}:  %\emph{$D$-optimality criterion}: %which yields the following optimization problem
 \begin{align}
    V(S)= I(\beta;y_S) = \entropy(\beta)-\entropy(\beta\mid y_S). \label{informationgain}
 \end{align}
-which is the entropy reduction on $\beta$ after the revelation of $y_S$ (also known as the mutual information between $Y_S$ and $\beta$). 
+which is the entropy reduction on $\beta$ after the revelation of $y_S$ (also known as the mutual information between $y_S$ and $\beta$). 
 Hence, selecting a set of experiments $S$ that
 maximizes $V(S)$ is equivalent to finding the set of experiments that minimizes
 the uncertainty on $\beta$, as captured by the entropy reduction of its estimator.
@@ -58,12 +64,17 @@ which is indeed a function of the covariance matrix $(R+\T{X_S}X_S)^{-1}$.
 %V(S) & = \frac{1}{2} \log\det \left(I + \T{X_S}X_S\right)
 %\end{align}
 %There are several  reasons
- Note that \eqref{dcrit} is a submodular set function, \emph{i.e.}, 
-$V(S)+V(T)\geq V(S\cup T)+V(S\cap T)$ for all $S,T\subseteq \mathcal{N}$; it is also monotone, \emph{i.e.}, $V(S)\leq V(T)$ for all $S\subset T$. %In addition, the maximization of convex relaxations of this function is a  well-studied problem \cite{boyd}.
-Our analysis will focus on the case of a \emph{homotropic} prior, in which the prior covariance is the identity matrix, \emph{i.e.}, $$R=I_d\in \reals^{d\times d}.$$ Intuitively, this corresponds to the simplest case of a prior, in which no direction of $\beta$ is a priori favored; equivalently it also corresponds to solving a ridge regression problem with penalty term $\norm{\beta}_2^2$.
+ %In addition, the maximization of convex relaxations of this function is a  well-studied problem \cite{boyd}.
+Our analysis will focus on the case of a \emph{homotropic} prior, in which the prior covariance is the identity matrix, \emph{i.e.}, $$R=I_d\in \reals^{d\times d}.$$ Intuitively, this corresponds to the simplest prior, in which no direction of $\reals^d$ is a priori favored; equivalently, it also corresponds to the case where ridge regression estimation \eqref{ridge} performed by $\E$ has a penalty term $\norm{\beta}_2^2$.
+
+ %Note that \eqref{dcrit} is a submodular set function, \emph{i.e.}, 
+%$V(S)+V(T)\geq V(S\cup T)+V(S\cap T)$ for all $S,T\subseteq \mathcal{N}$; it is also monotone, \emph{i.e.}, $V(S)\leq V(T)$ for all $S\subset T$.
 
 \subsection{Budgeted-Feasible Experimental Design: Full Information Case}
-We depart from the above classic experimental design setting by assuming that each experiment is associated with a cost $c_i\in\reals_+$. Moreover, the experimenter $\E$ is limited by a budget $B\in \reals_+$. In the full-information case, the experiment costs are common knowledge and the optimization problem that the experimenter wishes to solve is: 
+We depart from the above classic experimental design setting by assuming that each experiment is associated with a cost $c_i\in\reals_+$. Moreover, the experimenter $\E$ is limited by a budget $B\in \reals_+$. 
+The cost $c_i$ can capture, \emph{e.g.}, the amount the subject $i$ deems sufficient to
+incentivize her participation in the experiment. 
+In the full-information case, the experiment costs are common knowledge; as such, the optimization problem that the experimenter wishes to solve is: 
 \begin{center}
 \textsc{ExperimentalDesignProblem} (EDP)
 \begin{subequations}
@@ -73,48 +84,71 @@ We depart from the above classic experimental design setting by assuming that ea
 \end{align}\label{edp}
 \end{subequations}
 \end{center}
-\EDP{} \emph{and} the corresponding problem with the more general objective \eqref{dcrit} are NP-hard; to see this, note that \textsc{Knapsack} reduces to EDP with dimension $d=1$ by mapping the weight of each item $w_i$ to an experiment with $x_i=w_i$. Moreover, \eqref{modified} is submodular, monotone and satisfies $V(\emptyset) = 0$.
+\EDP{}, as defined above, is  NP-hard; to see this, note that \textsc{Knapsack} reduces to EDP with dimension $d=1$ by mapping the weight of each item, say, $w_i$, to an experiment with $x_i=w_i$. Moreover, \eqref{modified} is submodular, monotone and satisfies $V(\emptyset) = 0$.
 
- Note that \eqref{dcrit} is a submodular set function, \emph{i.e.}, 
-$V(S)+V(T)\geq V(S\cup T)+V(S\cap T)$ for all $S,T\subseteq \mathcal{N}$; it is also monotone, \emph{i.e.}, $V(S)\leq V(T)$ for all $S\subset T$, and non-negative, as $V(S)\geq 0$ for all $S\subset\mathcal{N}$.
+ Note that \eqref{modified} is a submodular set function, \emph{i.e.}, 
+$V(S)+V(T)\geq V(S\cup T)+V(S\cap T)$ for all $S,T\subseteq \mathcal{N}$. It is also monotone, \emph{i.e.}, $V(S)\leq V(T)$ for all $S\subset T$, with $V(\emptyset)=0$. Finally, it is also non-negative, \emph{i.e.}, $V(S)\geq 0$ for all $S\subset\mathcal{N}$, as the matrix $\T{X_S}X_S$ is positive semi-definite for all $S\subseteq \mathcal{N}$.
 %Finally, we define the strategic version  of \EDP{} (denoted by \SEDP) by adding the additional constraints of truthfulness, individual rationality, and normal payments, as described in the previous section.
-
-
-\subsection{Experimental Design Problem: Strategic Case}
-When agents are strategic, the experimental design problem becomes a special case of  a \emph{budget feasible reverse auction}
-\cite{singer-mechanisms} comprises a set of items $\mathcal{N}=\{1,\ldots,n\}$ as well as a single buyer. Each item has a cost $c_i\in\reals_+$. Moreover, the buyer has a positive value function  $V:2^{\mathcal{N}}\to \reals_+$, as well as a budget $B\in \reals_+$. In the \emph{full information case}, the costs $c_i$ are common
-knowledge; the objective of the buyer in this context is to select a set $S$
-maximizing the value $V(S)$ subject to the constraint $\sum_{i\in S} c_i\leq
-B$. We write:
+ We denote by
 \begin{equation}\label{eq:non-strategic}
     OPT = \max_{S\subseteq\mathcal{N}} \Big\{V(S) \;\Big| \;
     \sum_{i\in S}c_i\leq B\Big\}
 \end{equation}
-for the optimal value achievable in the full-information case. %\stratis{Should be $OPT(V,c,B)$\ldots better drop the arguments here and introduce them wherever necessary.}
+for the optimal value achievable in the full-information case. 
+\subsection{Experimental Design Problem: Strategic Case}
+When the experiment subjects are strategic, the experimental design problem becomes a special case of  a \emph{budget feasible reverse auction}
+, as introduced by \citeN{singer-mechanisms}.  %comprises a set of items $\mathcal{N}=\{1,\ldots,n\}$ as well as a single buyer. Each item has a cost $c_i\in\reals_+$. Moreover, the buyer has a positive value function  $V:2^{\mathcal{N}}\to \reals_+$, as well as a budget $B\in \reals_+$. In the \emph{full information case}, the costs $c_i$ are common
+%knowledge; the objective of the buyer in this context is to select a set $S$
+%maximizing the value $V(S)$ subject to the constraint $\sum_{i\in S} c_i\leq
+%B$. We write:
+%\begin{equation}\label{eq:non-strategic}
+%    OPT = \max_{S\subseteq\mathcal{N}} \Big\{V(S) \;\Big| \;
+%    \sum_{i\in S}c_i\leq B\Big\}
+%\end{equation}
+%for the optimal value achievable in the full-information case. %\stratis{Should be $OPT(V,c,B)$\ldots better drop the arguments here and introduce them wherever necessary.}
+In this case, the costs $c_i$ are not common knowledge and their reporting can be manipulated by the experiment subjects.
+ Moreover, though a subject may lie about her true cost $c_i$, she cannot lie about $x_i$ (\emph{i.e.}, all public features are verifiable prior to the experiment) nor $y_i$ (\emph{i.e.}, the subject cannot falsify her measurement).
 
-In the \emph{strategic case}, each items in $\mathcal{N}$ is held by a different strategic agent, whose cost is \emph{a priori} private.
-A \emph{mechanism} $\mathcal{M} = (f,p)$ comprises (a) an \emph{allocation function}
-$f:\reals_+^n \to 2^\mathcal{N}$ and (b) a \emph{payment function}
+
+
+
+A \emph{mechanism} $\mathcal{M} = (S,p)$ comprises (a) an \emph{allocation function}
+$S:\reals_+^n \to 2^\mathcal{N}$ and (b) a \emph{payment function}
 $p:\reals_+^n\to \reals_+^n$. Given the
 vector or costs $c=[c_i]_{i\in\mathcal{N}}$, the allocation function $f$ determines the set in
 $ \mathcal{N}$ of items to be purchased, while the payment function
-returns a vector of payments $[p_i]_{i\in \mathcal{N}}$. Let $s_i(c) = \id_{i\in f_i(c)}$ be the binary indicator of  $i\in f(c)$. As in
-\cite{singer-mechanisms, chen}, we study mechanisms that are normalized
-($i\notin f(c)$ implies $p_i(c)=0$), individually rational ($p_i(c)\geq c_i\cdot s_i(c)$) and have
-no positive transfers ($p_i(c)\geq 0$). 
-
-In addition to the above, mechanism design in budget feasible reverse auctions seeks mechanisms that have the following properties \cite{singer-mechanisms,chen}:
+returns a vector of payments $[p_i]_{i\in \mathcal{N}}$.
+ Let $s_i(c) = \id_{i\in S(c)}$ be the binary indicator of  $i\in f(c)$. Mechanism design in budget feasible reverse auctions seeks mechanisms that have the following properties \cite{singer-mechanisms,chen}:
 \begin{itemize}
+\item \emph{Normalization.} Unallocated experiments receive zero payments, \emph{i.e.},
+ \begin{align}\notin S(c)\text{ implies }p_i(c)=0.\label{normal}\end{align}
+\item\emph{Individual Rationality.} Payments for allocated experiments exceed costs, \emph{i.e.},
+ \begin{align} p_i(c)\geq c_i\cdot s_i(c).\label{ir}\end{align}
+\item \emph{No Positive Transfers.} Payments are non-negative,\emph{i.e.},
+\begin{align}p_i(c)\geq 0\label{pt}\end{align}. 
     \item \emph{Truthfulness.} An agent has no incentive to missreport her true cost. Formally, let $c_{-i}$
- be a vector of costs of all agents except $i$. %    $c_{-i}$ being the same). Let $[p_i']_{i\in \mathcal{N}} = p(c_i',
+ be a vector of costs of all agents except $i$. Then, %    $c_{-i}$ being the same). Let $[p_i']_{i\in \mathcal{N}} = p(c_i',
 %    c_{-i})$, then the 
-A mechanism is truthful iff  $p_i(c_i,c_{-i}) - s_i(c_i,c_{-i})\cdot c_i \geq p_i(c_i',c_{-i}) - s(c_i',c_{-i})\cdot c_i$ for every $i \in \mathcal{N}$ and every two cost vectors $(c_i,c_{-i})$ and $(c_i',c_{-i})$.
+\begin{align}  p_i(c_i,c_{-i}) - s_i(c_i,c_{-i})\cdot c_i \geq p_i(c_i',c_{-i}) - s(c_i',c_{-i})\cdot c_i\label{truthful}\end{align} for every $i \in \mathcal{N}$ and every two cost vectors $(c_i,c_{-i})$ and $(c_i',c_{-i})$.
     \item \emph{Budget Feasibility.} The sum of the payments should not exceed the
-    budget constraint:
+    budget constraint, \emph{i.e.}
     %\begin{displaymath}
-     $   \sum_{i\in\mathcal{N}} p_i \leq B.$
+     \begin{align}   \sum_{i\in\mathcal{N}} p_i \leq B.\label{budget}\end{align}
     %\end{displaymath}
-    \item \emph{Approximation ratio.} The value of the allocated set should not
+\end{itemize}
+We define the \emph{Strategic} version of \EDP as
+\begin{center}
+\textsc{StrategicExperimentalDesignProblem} (SEDP)
+%\begin{subequations}
+\begin{align*}
+\text{Maximize:}&\quad V(S) = \log\det(I_d+\T{X_{S}}X_{S}) \\ %\label{modified} \\
+\text{subject to:}&\quad (S,p)\text{ satisfying }\eqref{normal}-\eqref{budget}   
+\end{align*}
+%\end{subequations}
+\end{center}
+Given that the full information problem \EDP{} is NP-hard, we further seek mechanisms that have the following two additional properties:
+\begin{itemize}
+    \item \emph{Approximation ratio.} The value of the allocated set  should not
     be too far from the optimum value of the full information case
     \eqref{eq:non-strategic}. Formally, there must exist some $\alpha\geq 1$
     such that:
@@ -133,15 +167,15 @@ private value. In this case, Myerson's Theorem \cite{myerson} gives
 a characterization of truthful mechanisms.
 
 \begin{lemma}[Myerson \cite{myerson}]\label{thm:myerson}
-\sloppy A normalized mechanism $\mathcal{M} = (f,p)$ for a single parameter auction is
+\sloppy A normalized mechanism $\mathcal{M} = (S,p)$ for a single parameter auction is
 truthful iff:
 %\begin{enumerate}
 %\item 
 (a) $f$ is \emph{monotone}, \emph{i.e.}, for any agent $i$ and $c_i' \leq c_i$, for any
-fixed costs $c_{-i}$ of agents in $\mathcal{N}\setminus\{i\}$, $i\in f(c_i,
-c_{-i})$ implies $i\in f(c_i', c_{-i})$, and (b)
+fixed costs $c_{-i}$ of agents in $\mathcal{N}\setminus\{i\}$, $i\in S(c_i,
+c_{-i})$ implies $i\in S(c_i', c_{-i})$, and (b)
 %\item
- agents are paid \emph{threshold payments}, \emph{i.e.}, for all $i\in f(c)$, $p_i(c)=\inf\{c_i': i\in f(c_i', c_{-i})\}$.
+ agents are paid \emph{threshold payments}, \emph{i.e.}, for all $i\in S(c)$, $p_i(c)=\inf\{c_i': i\in S(c_i', c_{-i})\}$.
 %\end{enumerate}
 \end{lemma}
 \fussy
@@ -149,6 +183,8 @@ Myerson's Theorem
 % is particularly useful when designing a budget feasible truthful mechanism, as it 
 allows us to focus on designing a monotone allocation function. Then, the
 mechanism will be truthful as long as we give each agent her threshold payment---the caveat being that the latter need to sum to a value below $B$.
+
+\begin{comment}
 \subsection{Budget Feasible Experimental Design}
 
 We approach the problem of optimal experimental design from the
@@ -205,7 +241,7 @@ Finally, we define the strategic version  of \EDP{} (denoted by \SEDP) by adding
 % context of budget feasible auctions \cite{chen,singer-mechanisms}. 
 
 %\stratis{A potential problem is that all of the above properties hold also for, \emph{e.g.}, $V(S)=\log(1+\det(\T{X_S}X_S))$\ldots}
-
+\end{comment}
 
 \begin{comment}\junk{
 As \eqref{dcrit} may take arbitrarily small negative values, to define a meaningful approximation one would consider  the (equivalent) maximization of $V(S) = \det\T{X_S}X_S$. %, for some strictly increasing, on-to function $f:\reals_+\to\reals_+$.
author	Stratis Ioannidis <stratis@stratis-Latitude-E6320.(none)>	2013-02-10 23:48:32 -0800
committer	Stratis Ioannidis <stratis@stratis-Latitude-E6320.(none)>	2013-02-10 23:48:32 -0800
commit	e788f78d909b7ecfaf11cd14eee15f1ae5ba1ff4 (patch)
tree	be94c1d13099a317472dc99005dc20bb7b3f6e5e
parent	c2c1e8eda0d8fe8f87a5b6b49a5eebdc614c4d2b (diff)
download	recommendation-e788f78d909b7ecfaf11cd14eee15f1ae5ba1ff4.tar.gz