diff options
Diffstat (limited to 'intro.tex')
| -rw-r--r-- | intro.tex | 13 |
1 files changed, 8 insertions, 5 deletions
@@ -20,9 +20,9 @@ In our setting, experiments cannot be manipulated and hence measurements are rel However, there is a cost $c_i$ associated with experimenting on subject $i$ which varies from subject to subject. This may be viewed as the -cost subject $i$ incurs when tested and for which she needs to be reimbursed; or, it might be viewed as the incentive for $i$ to participate in the experiment; or, it might be the intrinsic value of the data. +cost subject $i$ incurs when tested and for which she needs to be reimbursed; or, it might be viewed as the incentive for $i$ to participate in the experiment; or, it might be the intrinsic worth of the data to the user. - This economic aspect has always been inherent in experimental design: experimenters often work within strict budgets and design creative incentives. However, we are not aware of a principled study of this setting from a strategic point of view. When subjects are strategic, they may have an incentive to misreport their cost, leading to the neeed for a sophisticated choice of experiments and payments. Arguably, user incentiviation is of particular pertinence due to the extent of statistical analysis over user data on the Internet. %, which has led to the rise of several different research efforts in studying data markets \cite{...}. + This economic aspect has always been inherent in experimental design: experimenters often work within strict budgets and design creative incentives. However, we are not aware of a principled study of this setting from a strategic point of view. When subjects are strategic, they may have an incentive to misreport their cost, leading to the need for a sophisticated choice of experiments and payments. Arguably, user incentiviation is of particular pertinence due to the extent of statistical analysis over user data on the Internet. %, which has led to the rise of several different research efforts in studying data markets \cite{...}. Our contributions are as follows. @@ -36,16 +36,19 @@ subject to a budget constraint $\sum_{i\in S}c_i\leq B$, where $B$ is \E's budge The objective function, which is the key, is formally obtained by optimizing the information gain in $\beta$ when the latter is learned through linear regression, and is related to the so-called $D$-optimality criterion~\cite{pukelsheim2006optimal,atkinson2007optimum}. \item We present the first known polynomial time truthful mechanism for \SEDP{}, yielding a constant factor ($\approx 12.98$) approximation. In contrast to this, we show that no truthful, budget-feasible algorithms are possible for \SEDP{} within a factor 2 approximation. +We note that the objective \eqref{obj} is submodular. Using this fact, applying previous results for budget feasible mechanisms under general submodular objectives would yield either a truthful deterministic mechanism that requires exponential time, or a poly-time algorithm that is not deterministic~\cite{singer-mechanisms,chen}. \end{itemize} -We note that the objective \eqref{obj} is submodular. Using this fact, previous work by \citeN{singer-mechanisms} and \citeN{chen} on budget feasible mechanisms for submodular maximization yields a $8.34$-approximate deterministic mechanism for \SEDP{} that is not polynomial time, unless P=NP. Alternatively, previous work by \citeN{chen} on general submodular objectives also yields a randomized, 7.91-approximate polynomial time mechanism for \SEDP{} that is however \emph{universally truthful}, \emph{i.e.}, it is sampled from a distribution among truthful mechanisms. In contrast, our result is the first deterministic constant factor approximation mechanism for \SEDP{} that is both polytime and truthful. + +% budget feasible mechanisms for submodular maximization yields a $8.34$-approximate deterministic mechanism for \SEDP{} that is not polynomial time, unless P=NP. Alternatively, previous work by \citeN{chen} on general submodular objectives also yields a randomized, 7.91-approximate polynomial time mechanism for \SEDP{} that is however \emph{universally truthful}, \emph{i.e.}, it is sampled from a distribution among truthful mechanisms. In contrast, our result is the first deterministic constant factor approximation mechanism for \SEDP{} that is both polytime and truthful. % either a randomized, 7.91-approximate polynomial time mechanism for maximizing a general submodular function that is universally truthful, \emph{i.e.}, it is sampled from a distribution among truthful mechanisms. %There are several recent results in budget feasible %mechanisms~\cite{singer-mechanisms,chen,singer-influence,bei2012budget,dobz2011-mechanisms}, and some apply to the submodular optimization in %\EDP. %There is a randomized, 7.91-approximate polynomial time mechanism for maximizing a general submodular function that is universally truthful, \emph{i.e.}, it is sampled from a distribution among truthful mechanisms. Also, there is a $8.34$-approximate exponential time deterministic mechanism. %There are however no known deterministic, truthful, polynomial time mechanisms for general submodular functions. -Though such mechanisms were known to exist for combinatorial problems with specific submodular objectives such as \textsc{Knapsack} or \textsc{Coverage}~\cite{singer-mechanisms,chen, singer-influence}, these do not readily apply to the more complicated linear-algebraic objective function \eqref{obj} of \SEDP. + +%Though such mechanisms were known to exist for combinatorial problems with specific submodular objectives such as \textsc{Knapsack} or \textsc{Coverage}~\cite{singer-mechanisms,chen, singer-influence}, these do not readily apply to the more complicated linear-algebraic objective function \eqref{obj} of \SEDP. %{\bf S+T: could we verify that the above sentence is correct in its implication?} From a technical perspective, we present a convex relaxation of \eqref{obj}, and show that it is within a constant factor from the so-called multi-linear relaxation of \eqref{obj}. This allows us to adopt the approach followed by prior work in budget feasible mechanisms by Chen \emph{et al.}~\cite{chen} and Singer~\cite{singer-influence}. %{\bf FIX the last sentence} @@ -54,7 +57,7 @@ From a technical perspective, we present a convex relaxation of \eqref{obj}, and %Our approach to mechanisms for experimental design --- by % optimizing the information gain in parameters like $\beta$ which are estimated through the data analysis process --- is general. We give examples of this approach beyond linear regression to a general class that includes logistic regression and learning binary functions, and show that the corresponding budgeted mechanism design problem is also expressed through a submodular optimization. Hence, prior work \cite{chen,singer-mechanisms} immediately applies, and gives randomized, universally truthful, polynomial time, constant factor approximation mechanisms for problems in this class. Getting deterministic, truthful, polynomial time mechanisms with a constant approximation factor for this class or specific problems in it, like we did for \EDP, remains an open problem. -In what follows, we describe related work in Section~\ref{sec:related}. We briefly review experimental design and budget feasible mechanisms in Section~\ref{sec:peel} and define \EDP\ formally. In Section~\ref{sec:main} we present our mechanism for \EDP\ and prove our main results. The present applications of our general framework are presented in Section~\ref{sec:ext}. +In what follows, we describe related work in Section~\ref{sec:related}. We briefly review experimental design and budget feasible mechanisms in Section~\ref{sec:peel} and define \SEDP\ formally. In Section~\ref{sec:main} we present our mechanism for \SEDP\ and prove our main results. A generalization of our framework to machine learning tasks beyond linear regression is presented in Section~\ref{sec:ext}. \junk{ |
