Diffstat (limited to 'intro.tex')
 intro.tex | 30
1 file changed, 11 insertions, 19 deletions
@@ -21,43 +21,35 @@ to participate in the experiment; or, it might be the inherent value of the data
 Our contributions are as follows.
 \begin{itemize}
 \item
-We formulate the problem of experimental design subject to a given budget, in presence of strategic agents who specify their costs. In particular, we focus on linear regression. This is naturally viewed as a budget feasible mechanism design problem. We show that the objective function is sophisticated and related to the covariance of the $x_i$'s. In particular we formulate the {\em Experimental Design Problem} (\EDP) as follows: the experimenter \E\ wishes to find set $S$ of subjects to maximize
+We formulate the problem of experimental design subject to a given budget, in the presence of strategic agents who specify their costs. In particular, we focus on linear regression. This is naturally viewed as a budget feasible mechanism design problem. The objective function is sophisticated and is related to the covariance of the $x_i$'s. In particular, we formulate the {\em Experimental Design Problem} (\EDP) as follows: the experimenter \E\ wishes to find a set $S$ of subjects to maximize
 \begin{align}V(S) = \log\det\Big(I_d+\sum_{i\in S}x_i\T{x_i}\Big) \label{obj}\end{align}
 with a budget constraint $\sum_{i\in S}c_i\leq B$, where $B$ is \E's budget.
 %, and other {\em strategic constraints} we don't list here.
-
-The objective function, which is the key, is motivated from the so-called $D$-objective criterion; in particular, it captures the reduction in the entropy of $\beta$ when the latter is learned through linear regression methods.
-
+The objective function, which is key, is obtained by maximizing the information gain in $\beta$ when the latter is learned through linear regression, and is the so-called $D$-optimality criterion in the literature.
+
 \item
 The above objective is submodular.
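As a numerical aside (not part of the paper's diff), the objective $V(S)=\log\det(I_d+\sum_{i\in S}x_i\T{x_i})$ and the submodularity claimed in the next bullet can be checked directly. The sketch below uses arbitrary random covariates and hand-picked sets `A`, `B`, `j`; these are illustrative assumptions, not choices from the paper.

```python
import numpy as np

def V(S, X, d):
    """V(S) = log det(I_d + sum_{i in S} x_i x_i^T), the paper's objective."""
    M = np.eye(d) + sum((np.outer(X[i], X[i]) for i in S), np.zeros((d, d)))
    return np.linalg.slogdet(M)[1]  # log-determinant; sign is +1 since M is PD

rng = np.random.default_rng(0)
d, n = 3, 6
X = rng.normal(size=(n, d))          # hypothetical covariates x_1, ..., x_n

# Submodularity: the marginal gain of adding item j shrinks as the set grows.
A, B, j = {0}, {0, 1, 2}, 4          # A is a subset of B; j is in neither
gain_A = V(A | {j}, X, d) - V(A, X, d)
gain_B = V(B | {j}, X, d) - V(B, X, d)
assert gain_B <= gain_A + 1e-9       # diminishing returns
```

The `slogdet` call is used instead of `det` for numerical stability; the sign component is always positive here because the matrix is positive definite.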
 There are several recent results on budget feasible mechanisms~\cite{singer-mechanisms,chen,singer-influence,bei2012budget,dobz2011-mechanisms}, and some apply to the submodular optimization in \EDP.
-There is a randomized, 7.91-approximate mechanism for maximizing a general submodular function that is universally truthful, \emph{i.e.}, it is sampled from a distribution among truthful mechanisms.
-There are however no known deterministic truthful mechanisms for general submodular functions.
+There is a randomized, 7.91-approximate polynomial time mechanism for maximizing a general submodular function that is universally truthful, \emph{i.e.}, it is sampled from a distribution over truthful mechanisms. There is also an $8.34$-approximate exponential time deterministic mechanism.
+There are, however, no known deterministic, truthful, polynomial time mechanisms for general submodular functions.
 For specific combinatorial problems such as \textsc{Knapsack} or \textsc{Coverage}, there exist
-deterministic, truthful constant-approximation algorithms~\cite{singer-mechanisms,chen, singer-influence}, but they don't work for the linear-algebraic objective function in \EDP.
+deterministic, truthful, polynomial time constant-approximation algorithms~\cite{singer-mechanisms,chen,singer-influence}, but they do not apply to the linear-algebraic objective function in \EDP.
 %{\bf S+T: could we verify that the above sentence is correct in its implication?}
-We present the first known, polynomial time truthful mechanism for \EDP{} and show that it is a constant factor ($\approx 19$) approximation. The technical crux is an approximation algorithm we develop for a suitable multi-linear relaxation derived from \EDP.
-In contrast, no such truthful algorithms are possible within a factor 2 approximation.
+We present the first known, polynomial time truthful mechanism for \EDP{}. Our mechanism is a constant factor ($\approx 19$) approximation for \EDP{}.
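For intuition on the algorithmic skeleton that budget feasible mechanisms build on, the following sketch runs a plain cost-benefit greedy for the \EDP{} objective under a budget. This is only the non-strategic core, not the paper's truthful mechanism: the mechanisms of Chen et al. and Singer add threshold and payment rules on top of greedy orderings like this one. The instance (covariates, costs, budget) is invented for illustration.

```python
import numpy as np

def logdet_value(S, X, d):
    """V(S) = log det(I_d + sum_{i in S} x_i x_i^T)."""
    M = np.eye(d) + sum((np.outer(X[i], X[i]) for i in S), np.zeros((d, d)))
    return np.linalg.slogdet(M)[1]

def budgeted_greedy(X, costs, B):
    """Greedily add the item with the best marginal-gain-per-cost ratio
    while the budget allows. NOTE: this alone is neither truthful nor
    the paper's mechanism; it is the standard greedy subroutine."""
    n, d = X.shape
    S, spent = set(), 0.0
    while True:
        best, best_ratio = None, 0.0
        for i in range(n):
            if i in S or spent + costs[i] > B:
                continue
            gain = logdet_value(S | {i}, X, d) - logdet_value(S, X, d)
            if gain / costs[i] > best_ratio:
                best, best_ratio = i, gain / costs[i]
        if best is None:
            return S
        S.add(best)
        spent += costs[best]

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))                        # hypothetical covariates
costs = np.array([1.0, 2.0, 1.5, 0.5, 1.0, 2.5])   # hypothetical reported costs
S = budgeted_greedy(X, costs, B=3.0)
assert sum(costs[i] for i in S) <= 3.0             # budget feasibility
```

A truthful mechanism must additionally prevent agents from gaining by misreporting $c_i$, which is what the thresholding and payment machinery in the cited works (and this paper's mechanism for \EDP{}) provides.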
 From a technical perspective, we present a convex relaxation of \EDP{} and show that it is within a constant factor of a multi-linear relaxation for \EDP{}. This allows us to adapt the prior work on budget feasible mechanisms by Chen \emph{et al.}~\cite{chen} and Singer~\cite{singer-influence}. In contrast, no truthful mechanism can approximate \EDP{} within a factor better than 2.
 \item
-Our approach to mechanisms for experimental design is general. We apply it to study
-experimental design beyond regression,
-... {\bf FILL IN}.
-. Again, the same insight
-of data entropy, we can study several experiment design problems in their strategic setting as budget feasible mechanism design with a suitable objective function that is submodular. This immediately gives
+Our approach to mechanisms for experimental design --- optimizing the information gain in parameters like $\beta$ that are estimated through the data analysis process --- is general. We give examples of this approach beyond linear regression, including logistic regression and learning binary functions, and derive budgeted mechanism design problems with submodular objectives where prior work immediately applies and yields constant factor approximations. We leave open the problem of obtaining deterministic, truthful, polynomial time mechanisms for these problems, as we did for \EDP.
 \end{itemize}
+In what follows, we discuss related work in Section~\ref{sec:related}. We describe the basics of experimental design and budget feasible mechanisms in Section~\ref{sec:peel} and define \EDP{} formally. In Section~\ref{sec:main} we present our mechanism for \EDP{} and prove our main results. In Section~\ref{sec:ext} we present applications of our general framework.
-From a technical perspective, we extend the provide a convex relaxation to the above problem and show that it is within constant approximation from its so-called multilinear relaxation.
-This allows us to implement the approach used by Chen \emph{et al.}~\cite{chen} and Singer~\cite{singer-influence} to show deterministic constant approximation algorithms for \textsc{Knapsack} and \textsc{Coverage}, respectively.
-
-
-We leave several problems open.
+\junk{
 \stratis{maximizing other ``optimality criteria'' (E-optimality, G-optimality, etc.) Improved upper lower bound for $R\neq I$. General entropy gain optimization}
-\junk{
-
 \begin{itemize}
 \item already existing field of experiment design: survey-like setup, what are the best points to include in your experiment? Measure of the
