diff options
| author | Stratis Ioannidis <stratis@stratis-Latitude-E6320.(none)> | 2013-02-10 23:48:32 -0800 |
|---|---|---|
| committer | Stratis Ioannidis <stratis@stratis-Latitude-E6320.(none)> | 2013-02-10 23:48:32 -0800 |
| commit | e788f78d909b7ecfaf11cd14eee15f1ae5ba1ff4 (patch) | |
| tree | be94c1d13099a317472dc99005dc20bb7b3f6e5e | |
| parent | c2c1e8eda0d8fe8f87a5b6b49a5eebdc614c4d2b (diff) | |
| download | recommendation-e788f78d909b7ecfaf11cd14eee15f1ae5ba1ff4.tar.gz | |
section 3
| -rw-r--r-- | problem.tex | 120 |
1 files changed, 78 insertions, 42 deletions
diff --git a/problem.tex b/problem.tex index 6027379..cec7b47 100644 --- a/problem.tex +++ b/problem.tex @@ -8,15 +8,22 @@ Putting cost considerations aside, suppose that an experimenter \E\ wishes to co \end{align} where $\beta$ is a vector in $\reals^d$, commonly referred to as the \emph{model}, and $\varepsilon_i$ (the \emph{measurement noise}) are independent, normally distributed random variables with mean 0 and variance $\sigma^2$. +For example, each $i$ may correspond to a human subject; the +feature vector $x_i$ may correspond to a normalized vector of her age, weight, +gender, income, \emph{etc.}, and the measurement $y_i$ may capture some +biometric information (\emph{e.g.}, her red cell blood count, a genetic marker, +etc.). The magnitude of the coefficient $\beta_i$ captures the effect that feature $i$ has on the measured variable, and its sign captures whether the corellation is positive or negative. + + The purpose of these experiments is to allow \E\ to estimate the model $\beta$. In particular, assume that the experimenter \E\ has a {\em prior} distribution on $\beta$, \emph{i.e.}, $\beta$ has a multivariate normal prior with zero mean and covariance $\sigma^2R^{-1}\in \reals^{d^2}$ (where $\sigma^2$ is the noise variance). Then, \E\ estimates $\beta$ through \emph{maximum a posteriori estimation}: \emph{i.e.}, finding the parameter which maximizes the posterior distribution of $\beta$ given the observations $y_S$. Under the linearity assumption \eqref{model} and the Gaussian prior on $\beta$, maximum a posteriori estimation leads to the following maximization \cite{hastie}: -\begin{displaymath} - \hat{\beta} = \argmin_{\beta\in\reals^d} \big(\sum_{i\in S} (y_i - \T{\beta}x_i)^2 +\begin{align} + \hat{\beta} = \argmax_{\beta\in\reals^d} \prob(\beta\mid y_S) =\argmin_{\beta\in\reals^d} \big(\sum_{i\in S} (y_i - \T{\beta}x_i)^2 + \T{\beta}R\beta\big) = (R+\T{X_S}X_S)^{-1}X_S^Ty_S \label{ridge} -\end{displaymath} +\end{align} where $X_S=[x_i]_{i\in S}\in \reals^{|S|\times d}$ is the matrix of experiment features and $y_S=[y_i]_{i\in S}\in\reals^{|S|}$ the observed measurements. This optimization, commonly known as \emph{ridge regression}, includes an additional quadratic penalty term compared to the standard least squares estimation. @@ -31,13 +38,12 @@ the noise terms $\varepsilon_i$). In particular, $\hat{\beta}$ has %mean $\beta$ covariance $(R+\T{X_S}X_S)^{-1}$. -Let $V:2^\mathcal{N}\to\reals$ be a \emph{value function}, quantifying how informative a set of experiments $S$ is in estimating $\beta$. The standard optimal experimental design problem amounts to finding a set $S$ that maximizes $V(S)$ subject to the constraint $|S|\leq k$. - -A variety of different value functions are used in experimental design~\cite{pukelsheim2006optimal}; almost all make use of the covariance $(R+\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. A value function that has natural advantages is the \emph{information gain}: %\emph{$D$-optimality criterion}: %which yields the following optimization problem +Let $V:2^\mathcal{N}\to\reals$ be a \emph{value function}, quantifying how informative a set of experiments $S$ is in estimating $\beta$. The classical experimental design problem amounts to finding a set $S$ that maximizes $V(S)$ subject to the constraint $|S|\leq k$. +A variety of different value functions are used in literature~\cite{pukelsheim2006optimal}; almost all make use of the covariance $(R+\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. A value function that has natural advantages is the \emph{information gain}: %\emph{$D$-optimality criterion}: %which yields the following optimization problem \begin{align} V(S)= I(\beta;y_S) = \entropy(\beta)-\entropy(\beta\mid y_S). \label{informationgain} \end{align} -which is the entropy reduction on $\beta$ after the revelation of $y_S$ (also known as the mutual information between $Y_S$ and $\beta$). +which is the entropy reduction on $\beta$ after the revelation of $y_S$ (also known as the mutual information between $y_S$ and $\beta$). Hence, selecting a set of experiments $S$ that maximizes $V(S)$ is equivalent to finding the set of experiments that minimizes the uncertainty on $\beta$, as captured by the entropy reduction of its estimator. @@ -58,12 +64,17 @@ which is indeed a function of the covariance matrix $(R+\T{X_S}X_S)^{-1}$. %V(S) & = \frac{1}{2} \log\det \left(I + \T{X_S}X_S\right) %\end{align} %There are several reasons - Note that \eqref{dcrit} is a submodular set function, \emph{i.e.}, -$V(S)+V(T)\geq V(S\cup T)+V(S\cap T)$ for all $S,T\subseteq \mathcal{N}$; it is also monotone, \emph{i.e.}, $V(S)\leq V(T)$ for all $S\subset T$. %In addition, the maximization of convex relaxations of this function is a well-studied problem \cite{boyd}. -Our analysis will focus on the case of a \emph{homotropic} prior, in which the prior covariance is the identity matrix, \emph{i.e.}, $$R=I_d\in \reals^{d\times d}.$$ Intuitively, this corresponds to the simplest case of a prior, in which no direction of $\beta$ is a priori favored; equivalently it also corresponds to solving a ridge regression problem with penalty term $\norm{\beta}_2^2$. + %In addition, the maximization of convex relaxations of this function is a well-studied problem \cite{boyd}. +Our analysis will focus on the case of a \emph{homotropic} prior, in which the prior covariance is the identity matrix, \emph{i.e.}, $$R=I_d\in \reals^{d\times d}.$$ Intuitively, this corresponds to the simplest prior, in which no direction of $\reals^d$ is a priori favored; equivalently, it also corresponds to the case where ridge regression estimation \eqref{ridge} performed by $\E$ has a penalty term $\norm{\beta}_2^2$. + + %Note that \eqref{dcrit} is a submodular set function, \emph{i.e.}, +%$V(S)+V(T)\geq V(S\cup T)+V(S\cap T)$ for all $S,T\subseteq \mathcal{N}$; it is also monotone, \emph{i.e.}, $V(S)\leq V(T)$ for all $S\subset T$. \subsection{Budgeted-Feasible Experimental Design: Full Information Case} -We depart from the above classic experimental design setting by assuming that each experiment is associated with a cost $c_i\in\reals_+$. Moreover, the experimenter $\E$ is limited by a budget $B\in \reals_+$. In the full-information case, the experiment costs are common knowledge and the optimization problem that the experimenter wishes to solve is: +We depart from the above classic experimental design setting by assuming that each experiment is associated with a cost $c_i\in\reals_+$. Moreover, the experimenter $\E$ is limited by a budget $B\in \reals_+$. +The cost $c_i$ can capture, \emph{e.g.}, the amount the subject $i$ deems sufficient to +incentivize her participation in the experiment. +In the full-information case, the experiment costs are common knowledge; as such, the optimization problem that the experimenter wishes to solve is: \begin{center} \textsc{ExperimentalDesignProblem} (EDP) \begin{subequations} @@ -73,48 +84,71 @@ We depart from the above classic experimental design setting by assuming that ea \end{align}\label{edp} \end{subequations} \end{center} -\EDP{} \emph{and} the corresponding problem with the more general objective \eqref{dcrit} are NP-hard; to see this, note that \textsc{Knapsack} reduces to EDP with dimension $d=1$ by mapping the weight of each item $w_i$ to an experiment with $x_i=w_i$. Moreover, \eqref{modified} is submodular, monotone and satisfies $V(\emptyset) = 0$. +\EDP{}, as defined above, is NP-hard; to see this, note that \textsc{Knapsack} reduces to EDP with dimension $d=1$ by mapping the weight of each item, say, $w_i$, to an experiment with $x_i=w_i$. Moreover, \eqref{modified} is submodular, monotone and satisfies $V(\emptyset) = 0$. - Note that \eqref{dcrit} is a submodular set function, \emph{i.e.}, -$V(S)+V(T)\geq V(S\cup T)+V(S\cap T)$ for all $S,T\subseteq \mathcal{N}$; it is also monotone, \emph{i.e.}, $V(S)\leq V(T)$ for all $S\subset T$, and non-negative, as $V(S)\geq 0$ for all $S\subset\mathcal{N}$. + Note that \eqref{modified} is a submodular set function, \emph{i.e.}, +$V(S)+V(T)\geq V(S\cup T)+V(S\cap T)$ for all $S,T\subseteq \mathcal{N}$. It is also monotone, \emph{i.e.}, $V(S)\leq V(T)$ for all $S\subset T$, with $V(\emptyset)=0$. Finally, it is also non-negative, \emph{i.e.}, $V(S)\geq 0$ for all $S\subset\mathcal{N}$, as the matrix $\T{X_S}X_S$ is positive semi-definite for all $S\subseteq \mathcal{N}$. %Finally, we define the strategic version of \EDP{} (denoted by \SEDP) by adding the additional constraints of truthfulness, individual rationality, and normal payments, as described in the previous section. - - -\subsection{Experimental Design Problem: Strategic Case} -When agents are strategic, the experimental design problem becomes a special case of a \emph{budget feasible reverse auction} -\cite{singer-mechanisms} comprises a set of items $\mathcal{N}=\{1,\ldots,n\}$ as well as a single buyer. Each item has a cost $c_i\in\reals_+$. Moreover, the buyer has a positive value function $V:2^{\mathcal{N}}\to \reals_+$, as well as a budget $B\in \reals_+$. In the \emph{full information case}, the costs $c_i$ are common -knowledge; the objective of the buyer in this context is to select a set $S$ -maximizing the value $V(S)$ subject to the constraint $\sum_{i\in S} c_i\leq -B$. We write: + We denote by \begin{equation}\label{eq:non-strategic} OPT = \max_{S\subseteq\mathcal{N}} \Big\{V(S) \;\Big| \; \sum_{i\in S}c_i\leq B\Big\} \end{equation} -for the optimal value achievable in the full-information case. %\stratis{Should be $OPT(V,c,B)$\ldots better drop the arguments here and introduce them wherever necessary.} +for the optimal value achievable in the full-information case. +\subsection{Experimental Design Problem: Strategic Case} +When the experiment subjects are strategic, the experimental design problem becomes a special case of a \emph{budget feasible reverse auction} +, as introduced by \citeN{singer-mechanisms}. %comprises a set of items $\mathcal{N}=\{1,\ldots,n\}$ as well as a single buyer. Each item has a cost $c_i\in\reals_+$. Moreover, the buyer has a positive value function $V:2^{\mathcal{N}}\to \reals_+$, as well as a budget $B\in \reals_+$. In the \emph{full information case}, the costs $c_i$ are common +%knowledge; the objective of the buyer in this context is to select a set $S$ +%maximizing the value $V(S)$ subject to the constraint $\sum_{i\in S} c_i\leq +%B$. We write: +%\begin{equation}\label{eq:non-strategic} +% OPT = \max_{S\subseteq\mathcal{N}} \Big\{V(S) \;\Big| \; +% \sum_{i\in S}c_i\leq B\Big\} +%\end{equation} +%for the optimal value achievable in the full-information case. %\stratis{Should be $OPT(V,c,B)$\ldots better drop the arguments here and introduce them wherever necessary.} +In this case, the costs $c_i$ are not common knowledge and their reporting can be manipulated by the experiment subjects. + Moreover, though a subject may lie about her true cost $c_i$, she cannot lie about $x_i$ (\emph{i.e.}, all public features are verifiable prior to the experiment) nor $y_i$ (\emph{i.e.}, the subject cannot falsify her measurement). -In the \emph{strategic case}, each items in $\mathcal{N}$ is held by a different strategic agent, whose cost is \emph{a priori} private. -A \emph{mechanism} $\mathcal{M} = (f,p)$ comprises (a) an \emph{allocation function} -$f:\reals_+^n \to 2^\mathcal{N}$ and (b) a \emph{payment function} + + + +A \emph{mechanism} $\mathcal{M} = (S,p)$ comprises (a) an \emph{allocation function} +$S:\reals_+^n \to 2^\mathcal{N}$ and (b) a \emph{payment function} $p:\reals_+^n\to \reals_+^n$. Given the vector or costs $c=[c_i]_{i\in\mathcal{N}}$, the allocation function $f$ determines the set in $ \mathcal{N}$ of items to be purchased, while the payment function -returns a vector of payments $[p_i]_{i\in \mathcal{N}}$. Let $s_i(c) = \id_{i\in f_i(c)}$ be the binary indicator of $i\in f(c)$. As in -\cite{singer-mechanisms, chen}, we study mechanisms that are normalized -($i\notin f(c)$ implies $p_i(c)=0$), individually rational ($p_i(c)\geq c_i\cdot s_i(c)$) and have -no positive transfers ($p_i(c)\geq 0$). - -In addition to the above, mechanism design in budget feasible reverse auctions seeks mechanisms that have the following properties \cite{singer-mechanisms,chen}: +returns a vector of payments $[p_i]_{i\in \mathcal{N}}$. + Let $s_i(c) = \id_{i\in S(c)}$ be the binary indicator of $i\in f(c)$. Mechanism design in budget feasible reverse auctions seeks mechanisms that have the following properties \cite{singer-mechanisms,chen}: \begin{itemize} +\item \emph{Normalization.} Unallocated experiments receive zero payments, \emph{i.e.}, + \begin{align}\notin S(c)\text{ implies }p_i(c)=0.\label{normal}\end{align} +\item\emph{Individual Rationality.} Payments for allocated experiments exceed costs, \emph{i.e.}, + \begin{align} p_i(c)\geq c_i\cdot s_i(c).\label{ir}\end{align} +\item \emph{No Positive Transfers.} Payments are non-negative,\emph{i.e.}, +\begin{align}p_i(c)\geq 0\label{pt}\end{align}. \item \emph{Truthfulness.} An agent has no incentive to missreport her true cost. Formally, let $c_{-i}$ - be a vector of costs of all agents except $i$. % $c_{-i}$ being the same). Let $[p_i']_{i\in \mathcal{N}} = p(c_i', + be a vector of costs of all agents except $i$. Then, % $c_{-i}$ being the same). Let $[p_i']_{i\in \mathcal{N}} = p(c_i', % c_{-i})$, then the -A mechanism is truthful iff $p_i(c_i,c_{-i}) - s_i(c_i,c_{-i})\cdot c_i \geq p_i(c_i',c_{-i}) - s(c_i',c_{-i})\cdot c_i$ for every $i \in \mathcal{N}$ and every two cost vectors $(c_i,c_{-i})$ and $(c_i',c_{-i})$. +\begin{align} p_i(c_i,c_{-i}) - s_i(c_i,c_{-i})\cdot c_i \geq p_i(c_i',c_{-i}) - s(c_i',c_{-i})\cdot c_i\label{truthful}\end{align} for every $i \in \mathcal{N}$ and every two cost vectors $(c_i,c_{-i})$ and $(c_i',c_{-i})$. \item \emph{Budget Feasibility.} The sum of the payments should not exceed the - budget constraint: + budget constraint, \emph{i.e.} %\begin{displaymath} - $ \sum_{i\in\mathcal{N}} p_i \leq B.$ + \begin{align} \sum_{i\in\mathcal{N}} p_i \leq B.\label{budget}\end{align} %\end{displaymath} - \item \emph{Approximation ratio.} The value of the allocated set should not +\end{itemize} +We define the \emph{Strategic} version of \EDP as +\begin{center} +\textsc{StrategicExperimentalDesignProblem} (SEDP) +%\begin{subequations} +\begin{align*} +\text{Maximize:}&\quad V(S) = \log\det(I_d+\T{X_{S}}X_{S}) \\ %\label{modified} \\ +\text{subject to:}&\quad (S,p)\text{ satisfying }\eqref{normal}-\eqref{budget} +\end{align*} +%\end{subequations} +\end{center} +Given that the full information problem \EDP{} is NP-hard, we further seek mechanisms that have the following two additional properties: +\begin{itemize} + \item \emph{Approximation ratio.} The value of the allocated set should not be too far from the optimum value of the full information case \eqref{eq:non-strategic}. Formally, there must exist some $\alpha\geq 1$ such that: @@ -133,15 +167,15 @@ private value. In this case, Myerson's Theorem \cite{myerson} gives a characterization of truthful mechanisms. \begin{lemma}[Myerson \cite{myerson}]\label{thm:myerson} -\sloppy A normalized mechanism $\mathcal{M} = (f,p)$ for a single parameter auction is +\sloppy A normalized mechanism $\mathcal{M} = (S,p)$ for a single parameter auction is truthful iff: %\begin{enumerate} %\item (a) $f$ is \emph{monotone}, \emph{i.e.}, for any agent $i$ and $c_i' \leq c_i$, for any -fixed costs $c_{-i}$ of agents in $\mathcal{N}\setminus\{i\}$, $i\in f(c_i, -c_{-i})$ implies $i\in f(c_i', c_{-i})$, and (b) +fixed costs $c_{-i}$ of agents in $\mathcal{N}\setminus\{i\}$, $i\in S(c_i, +c_{-i})$ implies $i\in S(c_i', c_{-i})$, and (b) %\item - agents are paid \emph{threshold payments}, \emph{i.e.}, for all $i\in f(c)$, $p_i(c)=\inf\{c_i': i\in f(c_i', c_{-i})\}$. + agents are paid \emph{threshold payments}, \emph{i.e.}, for all $i\in S(c)$, $p_i(c)=\inf\{c_i': i\in S(c_i', c_{-i})\}$. %\end{enumerate} \end{lemma} \fussy @@ -149,6 +183,8 @@ Myerson's Theorem % is particularly useful when designing a budget feasible truthful mechanism, as it allows us to focus on designing a monotone allocation function. Then, the mechanism will be truthful as long as we give each agent her threshold payment---the caveat being that the latter need to sum to a value below $B$. + +\begin{comment} \subsection{Budget Feasible Experimental Design} We approach the problem of optimal experimental design from the @@ -205,7 +241,7 @@ Finally, we define the strategic version of \EDP{} (denoted by \SEDP) by adding % context of budget feasible auctions \cite{chen,singer-mechanisms}. %\stratis{A potential problem is that all of the above properties hold also for, \emph{e.g.}, $V(S)=\log(1+\det(\T{X_S}X_S))$\ldots} - +\end{comment} \begin{comment}\junk{ As \eqref{dcrit} may take arbitrarily small negative values, to define a meaningful approximation one would consider the (equivalent) maximization of $V(S) = \det\T{X_S}X_S$. %, for some strictly increasing, on-to function $f:\reals_+\to\reals_+$. |
