summaryrefslogtreecommitdiffstats
path: root/intro.tex
diff options
context:
space:
mode:
Diffstat (limited to 'intro.tex')
-rw-r--r--intro.tex20
1 files changed, 11 insertions, 9 deletions
diff --git a/intro.tex b/intro.tex
index f0545bb..83d0f87 100644
--- a/intro.tex
+++ b/intro.tex
@@ -10,20 +10,22 @@ There is a rich literature about estimation procedures, as well as for means for
if \E\ can conduct only a limited number of experiments, so the estimation process returns $\beta$
that approximates the true parameter of the underlying population \cite{ginebra2007measure,le1996comparison,chaloner1995bayesian,boyd2004convex}.
-We depart from this classical set up by viewing experimental design in a strategic setting, and by studying mechanism design issues.
-In our setup, experiments cannot be manipulated and hence measurements are considered precise.\footnote{Thus, the experiments of our interest are statistically significant ones where each experiment provides a reliable outcome.} However, there
+We depart from this classical setup by viewing experimental design in a strategic setting, and by studying mechanism design issues.
+In our setting, experiments cannot be manipulated and hence measurements are considered reliable. %\footnote{Thus, the experiments of our interest are statistically significant ones where each experiment provides a reliable outcome.}
+However, there
is a cost $c_i$ associated with experimenting on
subject $i$ which varies from subject to subject. This may be viewed as the
-cost subject $i$ incurs when tested, and hence $i$ needs to be reimbursed; or, it might be viewed as the incentive for $i$
-to participate in the experiment; or, it might be the inherent value of the data. This economic aspect has always been inherent in experimental design: experimenters often work within strict budgets and design creative incentives. However, we are not aware of principled study of this setting from a strategic point of view. When subjects are strategic, they may have an incentive to mis-report their cost.
+cost subject $i$ incurs when tested and for which she needs to be reimbursed; or, it might be viewed as the incentive for $i$
+to participate in the experiment; or, it might be the inherent value of the data. When subjects are strategic, they may have an incentive to misreport their cost. This economic aspect has always been inherent in experimental design: experimenters often work within strict budgets and design creative incentives. However, we are not aware of principled study of this setting from a strategic point of view.
Our contributions are as follows.
\begin{itemize}
\item
-We formulate the problem of experimental design subject to a given budget, in presence of strategic agents who specify their costs. In particular, we focus on linear regression. This is naturally viewed as a budget feasible mechanism design problem. The objective function is sophisticated and is related to the covariance of the $x_i$'s. In particular, we formulate the {\em Experimental Design Problem} (\EDP) as follows: the experimenter \E\ wishes to find set $S$ of subjects to maximize
+We formulate the problem of experimental design subject to a given budget, in the presence of strategic agents who may lie about their costs. In particular, we focus on linear regression. This is naturally viewed as a budget feasible mechanism design problem, in which the objective function %is sophisticated and
+is related to the covariance of the $x_i$'s. In particular, we formulate the {\em Experimental Design Problem} (\EDP) as follows: the experimenter \E\ wishes to find set $S$ of subjects to maximize
\begin{align}V(S) = \log\det(I_d+\sum_{i\in S}x_i\T{x_i}) \label{obj}\end{align}
-with a budget constraint $\sum_{i\in S}c_i\leq B$, where $B$ is \E's budget. %, and other {\em strategic constraints} we don't list here.
+subject to a budget constraint $\sum_{i\in S}c_i\leq B$, where $B$ is \E's budget. %, and other {\em strategic constraints} we don't list here.
The objective function, which is the key, is obtained by optimizing the information gain in $\beta$ when it is learned through linear regression methods, and is related to the so-called $D$-optimality criterion.
\item
The above objective is submodular.
@@ -36,14 +38,14 @@ For specific combinatorial problems such as \textsc{Knapsack} or \textsc{Coverag
deterministic, truthful, polynomial constant-approximation algorithms~\cite{singer-mechanisms,chen, singer-influence}, but they do not work for the linear-algebraic objective function in \EDP.
%{\bf S+T: could we verify that the above sentence is correct in its implication?}
-We present the first known, polynomial time truthful mechanism for \EDP{}. Our mechanism is a constant factor ($\approx 19.68$) approximation for \EDP{}. From a technical perspective, we present a convex relaxation to \EDP\; and show that it is within a constant factor approximation of a multi-linear relaxation for \EDP{}. This allows us to adopt the prior work in budget feasible mechanisms by Chen \emph{et al.}~\cite{chen} and Singer~\cite{singer-influence}. In contrast to this, no truthful algorithms are possible for \EDP{} within a factor 2 approximation. {\bf FIX the last sentence}
+We present the first known, polynomial time truthful mechanism for \EDP{}. Our mechanism is a constant factor ($\approx 19.68$) approximation for \EDP{}. From a technical perspective, we present a convex relaxation to \EDP\; and show that it is within a constant factor approximation of the multi-linear relaxation for \EDP{}. This allows us to adopt the prior work in budget feasible mechanisms by Chen \emph{et al.}~\cite{chen} and Singer~\cite{singer-influence}. In contrast to this, we show that no truthful, budget-feasible algorithms are possible for \EDP{} within a factor 2 approximation. %{\bf FIX the last sentence}
\item
Our approach to mechanisms for experimental design --- by
- optimizing the information gain in parameters like $\beta$ which are estimated through the data analysis process --- is general. We give examples of this approach beyond linear regression to logistic regression, learning binary functions and others and derive budgeted mechanism design problems with submodular optimization where prior work immediately applies and gives constant factor approximations. We leave it open to get deterministic truthful polynomial time mechanisms for these problems, like we did for \EDP.
+ optimizing the information gain in parameters like $\beta$ which are estimated through the data analysis process --- is general. We give examples of this approach beyond linear regression to a general class that includes logistic regression and learning binary functions, and show that the corresponding budgeted mechanism design problem is also expressed through a submodular optimization. Hence, the prior work \cite{chen,singer-mechanisms} immediately applies, and gives universally truthful, polynomial timem constant factor approximation mechanisms for problems in this class. Getting deterministic truthful polynomial time mechanisms for this class or specific problems in it, like we did for \EDP, remains an open problem.
\end{itemize}
-In what follows, we described related work in Section~\ref{sec:related}. We describe the basics of experimental design and budget feasible mechanisms in Section~\ref{sec:peel} and define \EDP formally. In Section~\ref{sec:main} we present our mechanism for \EDP and prove our main results. In~\ref{sec:ext} we present applications of out general framework.
+In what follows, we described related work in Section~\ref{sec:related}. We describe the basics of experimental design and budget feasible mechanisms in Section~\ref{sec:peel} and define \EDP\ formally. In Section~\ref{sec:main} we present our mechanism for \EDP\ and prove our main results. In~\ref{sec:ext} we present applications of out general framework.
\junk{