diff options
| author | Stratis Ioannidis <stratis@stratis-Latitude-E6320.(none)> | 2013-02-10 17:35:40 -0800 |
|---|---|---|
| committer | Stratis Ioannidis <stratis@stratis-Latitude-E6320.(none)> | 2013-02-10 17:35:40 -0800 |
| commit | 73ee783f77425349498562925e0047e1f1aee1c9 (patch) | |
| tree | 6ff58434f07ab5dbb364e19f52ab5d7a053148a4 /problem.tex | |
| parent | 1e7b3ec177797e7135cd2bb26eb8612309a134da (diff) | |
| download | recommendation-73ee783f77425349498562925e0047e1f1aee1c9.tar.gz | |
intro related sedp
Diffstat (limited to 'problem.tex')
| -rw-r--r-- | problem.tex | 39 |
1 files changed, 28 insertions, 11 deletions
diff --git a/problem.tex b/problem.tex index e820f85..546500d 100644 --- a/problem.tex +++ b/problem.tex @@ -1,7 +1,7 @@ \label{sec:prel} \subsection{Experimental Design} -The theory of experimental design \cite{pukelsheim2006optimal,atkinson2007optimum} studies how an experimenter \E\ should select the parameters of a set of experiments she is about to conduct. In general, the optimality of a particular design depends on the purpose of the experiment, \emph{i.e.}, the quantity \E\ is trying to learn or the hypothesis she is trying to validate. Due to their ubiquity in statistical analysis, a large literature on the subject focuses on learning \emph{linear models}, where \E\ wishes to fit a linear map to the data she has collected. +The theory of experimental design \cite{pukelsheim2006optimal,atkinson2007optimum} studies how an experimenter \E\ should select the parameters of a set of experiments she is about to conduct. In general, the optimality of a particular design depends on the purpose of the experiment, \emph{i.e.}, the quantity \E\ is trying to learn or the hypothesis she is trying to validate. Due to their ubiquity in statistical analysis, a large literature on the subject focuses on learning \emph{linear models}, where \E\ wishes to fit a linear function to the data she has collected. More precisely, putting cost considerations aside, suppose that \E\ wishes to conduct $k$ among $n$ possible experiments. Each experiment $i\in\mathcal{N}\defeq \{1,\ldots,n\}$ is associated with a set of parameters (or features) $x_i\in \reals^d$, normalized so that $\|x_i\|_2\leq 1$. Denote by $S\subseteq \mathcal{N}$, where $|S|=k$, the set of experiments selected; upon its execution, experiment $i\in S$ reveals an output variable (the ``measurement'') $y_i$, related to the experiment features $x_i$ through a linear function, \emph{i.e.}, \begin{align}\label{model} @@ -27,6 +27,7 @@ A variety of different value functions are used in experimental design~\cite{puk \begin{align} V(S) &= \frac{1}{2}\log\det \T{X_S}X_S \label{dcrit} %\\ \end{align} +defined as $-\infty$ when $\mathrm{rank}(\T{X_S}X_S)<d$. As $\hat{\beta}$ is a multidimensional normal random variable, the $D$-optimality criterion is equal (up to a constant) to the negative of the entropy of $\hat{\beta}$. Hence, selecting a set of experiments $S$ that @@ -39,8 +40,7 @@ the uncertainty on $\beta$, as captured by the entropy of its estimator. %\end{align} %There are several reasons - Value function \eqref{dcrit} is undefined when $\mathrm{rank}(\T{X_S}X_S)<d$; in this case, we take $V(S)=-\infty$ (so that $V$ takes values in the extended reals). -Note that \eqref{dcrit} is a submodular set function, \emph{i.e.}, + Note that \eqref{dcrit} is a submodular set function, \emph{i.e.}, $V(S)+V(T)\geq V(S\cup T)+V(S\cap T)$ for all $S,T\subseteq \mathcal{N}$; it is also monotone, \emph{i.e.}, $V(S)\leq V(T)$ for all $S\subset T$. %In addition, the maximization of convex relaxations of this function is a well-studied problem \cite{boyd}. \subsection{Budget Feasible Reverse Auctions} @@ -71,8 +71,7 @@ In addition to the above, mechanism design in budget feasible reverse auctions s \item \emph{Truthfulness.} An agent has no incentive to missreport her true cost. Formally, let $c_{-i}$ be a vector of costs of all agents except $i$. % $c_{-i}$ being the same). Let $[p_i']_{i\in \mathcal{N}} = p(c_i', % c_{-i})$, then the -A mechanism is truthful iff for every $i \in \mathcal{N}$ and every two cost vectors $(c_i,c_{-i})$ and $(c_i',c_{-i})$ - $p_i(c_i,c_{-i}) - s_i(c_i,c_{-i})\cdot c_i \geq p_i(c_i',c_{-i}) - s(c_i',c_{-i})\cdot c_i$. +A mechanism is truthful iff $p_i(c_i,c_{-i}) - s_i(c_i,c_{-i})\cdot c_i \geq p_i(c_i',c_{-i}) - s(c_i',c_{-i})\cdot c_i$ for every $i \in \mathcal{N}$ and every two cost vectors $(c_i,c_{-i})$ and $(c_i',c_{-i})$. \item \emph{Budget Feasibility.} The sum of the payments should not exceed the budget constraint: %\begin{displaymath} @@ -85,7 +84,8 @@ A mechanism is truthful iff for every $i \in \mathcal{N}$ and every two cost vec \begin{displaymath} OPT \leq \alpha V(S). \end{displaymath} - The approximation ratio captures the \emph{price of truthfulness}, \emph{i.e.}, the relative value loss incurred by adding the truthfulness constraint. % to the value function maximization. + % The approximation ratio captures the \emph{price of truthfulness}, \emph{i.e.}, the relative value loss incurred by adding the truthfulness constraint. % to the value function maximization. + We stress here that the approximation ratio is defined with respect \emph{to the maximal value in the full information case}. \item \emph{Computational efficiency.} The allocation and payment function should be computable in polynomial time in the number of agents $n$. %\thibaut{Should we say something about the black-box model for $V$? Needed to say something in general, but not in our case where the value function can be computed in polynomial time}. @@ -130,7 +130,7 @@ biometric information (\emph{e.g.}, her red cell blood count, a genetic marker, etc.). The cost $c_i$ is the amount the subject deems sufficient to incentivize her participation in the study. Note that, in this setup, the feature vectors $x_i$ are public information that the experimenter can consult prior the experiment design. Moreover, though a subject may lie about her true cost $c_i$, she cannot lie about $x_i$ (\emph{i.e.}, all features are verifiable upon collection) or $y_i$ (\emph{i.e.}, she cannot falsify her measurement). -Ideally, motivated by the $D$-optimality criterion, we would like to design a mechanism that maximizes \eqref{dcrit} within a good approximation ratio. In what follows, we consider a slightly more general objective as follows: +Ideally, motivated by the $D$-optimality criterion, we would like to design a mechanism that maximizes \eqref{dcrit} within a good approximation ratio. The latter however can take negative values, making it unfit for stating approximation ratios. As such, we consider the following full information problem: \begin{center} \textsc{ExperimentalDesignProblem} (EDP) \begin{subequations} @@ -140,19 +140,36 @@ Ideally, motivated by the $D$-optimality criterion, we would like to design a me \end{align}\label{edp} \end{subequations} \end{center} where $I_d\in \reals^{d\times d}$ is the identity matrix. -\junk{One possible interpretation of \eqref{modified} is that, prior to the auction, the experimenter has free access to $d$ experiments whose features form an orthonormal basis in $\reals^d$. However, \eqref{modified} can also be motivated in the context of \emph{Bayesian experimental design} \cite{chaloner1995bayesian}. In short, the objective \eqref{modified} arises naturally when the experimenter retrieves the model $\beta$ through \emph{ridge regression}, rather than the linear regression \eqref{leastsquares} over the observed data; we explore this connection in Section~\ref{sec:bed}. -} +The objective~\eqref{modified} is non-negative, and can be motivated in the context of \emph{Bayesian experimental design} \cite{chaloner1995bayesian}. In short, it arises naturally when the experimenter retrieves the model $\beta$ through \emph{ridge regression}, rather than the linear regression \eqref{leastsquares} over the observed data; we explore this connection in Section~\ref{sec:bed}. + + + + +%We present our results with this version of the objective function because it is simple and captures the versions +%we will need later (including an arbitrary matrix in place of $I_d$ or a zero matrix which will correspond to ~\eqref{dcrit} ). + +%\begin{center} +%\textsc{ExperimentalDesignProblem} (EDP) +%\begin{subequations} +%\begin{align} +%\text{Maximize}\quad V(S) &= \log\det(I_d+\T{X_S}X_S) \label{modified} \\ +%\text{subject to}\quad \sum_{i\in S} p_i&\leq B\\ +%p_i&\geq c_i, \quad i\in S\\ +%\end{align}\label{edp} +%\end{subequations} +%\end{center} -We present our results with this version of the objective function because it is simple and captures the versions -we will need later (including an arbitrary matrix in place of $I_d$ or a zero matrix which will correspond to ~\eqref{dcrit} ). %Note that maximizing \eqref{modified} is equivalent to maximizing \eqref{dcrit} in the full-information case. In particular, $\det(\T{X_S}X_S)> \det(\T{X_{S'}}X_{S'})$ iff $\det(I_d+\T{X_S}X_S)>\det(I_d+\T{X_{S'}}X_{S'})$. In addition, \EDP{} \emph{and} the corresponding problem with objective \eqref{dcrit} are NP-hard; to see this, note that \textsc{Knapsack} reduces to EDP with dimension $d=1$ by mapping the weight of each item $w_i$ to an experiment with $x_i=w_i$. Moreover, \eqref{modified} is submodular, monotone and satisfies $V(\emptyset) = 0$. +Finally, we define the strategic version of \EDP{} (denoted by \SEDP) by adding the additional constraints of truthfulness, individual rationality, and normal payments, as described in the previous section. + %, allowing us to use the extensive machinery for the optimization of submodular functions, as well as recent results in the % context of budget feasible auctions \cite{chen,singer-mechanisms}. %\stratis{A potential problem is that all of the above properties hold also for, \emph{e.g.}, $V(S)=\log(1+\det(\T{X_S}X_S))$\ldots} + \begin{comment}\junk{ As \eqref{dcrit} may take arbitrarily small negative values, to define a meaningful approximation one would consider the (equivalent) maximization of $V(S) = \det\T{X_S}X_S$. %, for some strictly increasing, on-to function $f:\reals_+\to\reals_+$. However, the following lower bound implies that such an optimization goal cannot be attained under the constraints of truthfulness, budget feasibility, and individual rationality. |
