diff options
| -rw-r--r-- | intro.tex | 2 | ||||
| -rw-r--r-- | main.tex | 14 | ||||
| -rw-r--r-- | problem.tex | 4 | ||||
| -rw-r--r-- | related.tex | 4 |
4 files changed, 14 insertions, 10 deletions
@@ -24,7 +24,7 @@ Our contributions are as follows. We formulate the problem of experimental design subject to a given budget, in presence of strategic agents who specify their costs. In particular, we focus on linear regression. This is naturally viewed as a budget feasible mechanism design problem. The objective function is sophisticated and is related to the covariance of the $x_i$'s. In particular we formulate the {\em Experimental Design Problem} (\EDP) as follows: the experimenter \E\ wishes to find set $S$ of subjects to maximize \begin{align}V(S) = \log\det(I_d+\sum_{i\in S}x_i\T{x_i}) \label{obj}\end{align} with a budget constraint $\sum_{i\in S}c_i\leq B$, where $B$ is \E's budget. %, and other {\em strategic constraints} we don't list here. -The objective function, which is the key, is obtained by optimizing the information gain in $\beta$ when it is learned through linear regression methods, and is the so-called $D$-objective criterion in the literature. +The objective function, which is the key, is obtained by optimizing the information gain in $\beta$ when it is learned through linear regression methods, and is the so-called $D$-optimality criterion in the literature. \item The above objective is submodular. There are several recent results in budget feasible @@ -19,8 +19,12 @@ included in $S$ is the one with the highest \emph{marginal-value-per-cost}: \end{align} This process terminates when no more items can be added to $S$ using \eqref{greedy} under budget $B$. This is the generalization of the \emph{value-per-cost} ratio used in the greedy approximation algorithm for \textsc{Knapsack}. However, in contrast to \textsc{Knapsack}, for general submodular -functions, the marginal value of an element in \eqref{greedy} depends on the set to which it the element is added. -% +functions, the marginal value of an element in \eqref{greedy} depends on the set to which the element is added. +Similarly, the value of an element depends on the set in which it is +considered. However, in the following, the value of an element $i$ will refer to +its value as a singleton set: $V(\{i\})$. If there is no ambiguity, we will write +$V(i)$ instead of $V(\{i\})$. + Unfortunately, even for the full information case, the greedy algorithm gives an unbounded approximation ratio. Let $i^* = \argmax_{i\in\mathcal{N}} V(i)$ be the element of maximum value. @@ -28,7 +32,7 @@ unbounded approximation ratio. Let $i^* %It has been noted by Khuller \emph{et al.}~\cite{khuller} that For the maximum coverage problem, taking the maximum between the greedy solution and $V(i^*$) -gives a $\frac{2e}{e-1}$ approximation ratio ~\cite{khuller} . In the general case, +gives a $\frac{2e}{e-1}$ approximation ratio ~\cite{khuller}. In the general case, \junk{we have the following result from Singer \cite{singer-influence} which follows from Chen \emph{et al.}~\cite{chen}: \stratis{Is it Singer or Chen? Also, we need to introduce $V(i)$ somewhere...} @@ -85,8 +89,8 @@ $V(i^*)$ to %$OPT(R_{\mathcal{N}\setminus\{i\}}, B)$, where $R$ denotes the optimal value of a \emph{fractional relaxation} of the function $V$ over the set $\mathcal{N}$. A fuction $R:[0, 1]^{n}\to\reals_+$ defined on the hypercube $[0, 1]^{n}$ is a fractional relaxation of $V$ over the set $\mathcal{N}$ if %(a) $R$ is a function defined on the hypercube $[0, 1]^{n}$ and (b) - $R(\mathbf{1}_S) = V(S)$ for all $S\subseteq\mathcal{N}$, where -$\mathbf{1}_S$ denotes the indicator vector of $S$. The optimization program + $R(\id_S) = V(S)$ for all $S\subseteq\mathcal{N}$, where +$\id_S$ denotes the indicator vector of $S$. The optimization program \eqref{eq:non-strategic} extends naturally to such relaxations: \begin{align} OPT' = \argmax_{\lambda\in[0, 1]^{n}} diff --git a/problem.tex b/problem.tex index f0eeb43..b07cd42 100644 --- a/problem.tex +++ b/problem.tex @@ -134,7 +134,7 @@ incentivize her participation in the study. Note that, in this setup, the featur Ideally, motivated by the $D$-optimality criterion, we would like to design a mechanism that maximizes or approximates \eqref{dcrit} . Since \eqref{dcrit} may take arbitrarily small negative values, to define a meaningful approximation one would consider the (equivalent) maximization of $V(S) = \det\T{X_S}X_S$. %, for some strictly increasing, on-to function $f:\reals_+\to\reals_+$. However, the following lower bound implies that such an optimization goal cannot be attained under the constraints of truthfulness, budget feasibility, and individual rationality. \begin{lemma} -For any $M>1$, there is no $M$-approximate, truthful, budget feasible, individionally rational mechanism for a budget feasible reverse auction with $V(S) = \det{\T{X_S}X_S}$. +For any $M>1$, there is no $M$-approximate, truthful, budget feasible, individually rational mechanism for a budget feasible reverse auction with $V(S) = \det{\T{X_S}X_S}$. \end{lemma} \begin{proof} \input{proof_of_lower_bound1} @@ -151,7 +151,7 @@ This negative result motivates us to study a problem with a modified objective: \end{center} where $I_d\in \reals^{d\times d}$ is the identity matrix. One possible interpretation of \eqref{modified} is that, prior to the auction, the experimenter has free access to $d$ experiments whose features form an orthonormal basis in $\reals^d$. However, \eqref{modified} can also be motivated in the context of \emph{Bayesian experimental design} \cite{chaloner1995bayesian}. In short, the objective \eqref{modified} arises naturally when the experimenter retrieves the model $\beta$ through \emph{ridge regression}, rather than the linear regression \eqref{leastsquares} over the observed data; we explore this connection in Section~\ref{sec:bed}. -Note that maximizing \eqref{modified} is equivalent to maximizing \eqref{dcrit} in the full-information case. In particular, $\det(\T{X_S}X_S)> \det(\T{X_{S'}}X_{S'})$ iff $\det(I_d+\T{X_S}X_S)>\det(I_d+\T{X_{S'}}X_{S'})$. In addition, \eqref{edp}---and the equivalent problem with objective \eqref{dcrit}---are NP-hard; to see this, note that \textsc{Knapsack} reduces to EDP with dimension $d=1$ by mapping the weight of each item $w_i$ to an experiment with $x_i=w_i$. Nevertheless, \eqref{modified} is submodular, monotone and satifies $V(\emptyset) = 0$. +Note that maximizing \eqref{modified} is equivalent to maximizing \eqref{dcrit} in the full-information case. In particular, $\det(\T{X_S}X_S)> \det(\T{X_{S'}}X_{S'})$ iff $\det(I_d+\T{X_S}X_S)>\det(I_d+\T{X_{S'}}X_{S'})$. In addition, \eqref{edp}---and the equivalent problem with objective \eqref{dcrit}---are NP-hard; to see this, note that \textsc{Knapsack} reduces to EDP with dimension $d=1$ by mapping the weight of each item $w_i$ to an experiment with $x_i=w_i$. Nevertheless, \eqref{modified} is submodular, monotone and satisfies $V(\emptyset) = 0$. %, allowing us to use the extensive machinery for the optimization of submodular functions, as well as recent results in the % context of budget feasible auctions \cite{chen,singer-mechanisms}. diff --git a/related.tex b/related.tex index 9573309..cd27ed5 100644 --- a/related.tex +++ b/related.tex @@ -4,7 +4,7 @@ Budget feasible mechanism design was originally proposed by Singer \cite{singer-mechanisms}. Singer considers the problem of maximizing an arbitrary submodular function subject to a budget constraint in the \emph{value query} model, \emph{i.e.} assuming an oracle providing the value of the submodular objective on any given set. Singer shows that there exists a randomized, 112-approximation mechanism for submodular maximization that is \emph{universally truthful} (\emph{i.e.}, it is a randomized mechanism sampled from a distribution over truthful mechanisms). Chen \emph{et al.}~\cite{chen} improve this result by providing a 7.91-approximate mechanism, and show a corresponding lower bound of $2$ among universally truthful mechanisms for submodular maximization. -In contrast to the above results, no truthful, constant approximation mechanism that runs in polynomial time is presently known for submodular maximization. However, assuming access to an oracle providing the optimum in the full-information setup, Chen \emph{et al.},~provide a truthful, $8.34$-appoximate mechanism; in cases for which the full information problem is NP-hard, as the one we consider here, this mechanism is not poly-time, unless P=NP. Chen \emph{et al.}~also prove a $1+\sqrt{2}$ lower bound for truthful mechanisms, improving upon an earlier bound of 2 by Singer \cite{singer-mechanisms}. +In contrast to the above results, no truthful, constant approximation mechanism that runs in polynomial time is presently known for submodular maximization. However, assuming access to an oracle providing the optimum in the full-information setup, Chen \emph{et al.},~provide a truthful, $8.34$-approximate mechanism; in cases for which the full information problem is NP-hard, as the one we consider here, this mechanism is not poly-time, unless P=NP. Chen \emph{et al.}~also prove a $1+\sqrt{2}$ lower bound for truthful mechanisms, improving upon an earlier bound of 2 by Singer \cite{singer-mechanisms}. Improved bounds, as well as deterministic polynomial mechanisms, are known for specific submodular objectives. For symmetric submodular functions, a truthful mechanism with approximation ratio 2 is known, and this ratio is tight \cite{singer-mechanisms}. Singer also provides a 7.32-approximate truthful mechanism for the budget feasible version of \textsc{Matching}, and a corresponding lower bound of 2 \cite{singer-mechanisms}. Improving an earlier result by Singer, Chen \emph{et al.}~\cite{chen}, give a truthful, $2+\sqrt{2}$-approximate mechanism for \textsc{Knapsack}, and a lower bound of $1+\sqrt{2}$. Finally, a truthful, 31-approximate mechanism is also known for the budgeted version of \textsc{Coverage} \cite{singer-mechanisms,singer-influence}. @@ -16,7 +16,7 @@ a truthful, $O(\log^3 n)$-approximate mechanism A series of recent papers \cite{mcsherrytalwar,approximatemechanismdesign,xiao:privacy-truthfulness,chen:privacy-truthfulness} consider the related problem of conducting retrieve data from an \textit{unverified} database: the buyer of the data cannot verify the data reported by individuals and therefore must incentivize them to report truthfully. McSherry and Talwar \cite{mcsherrytalwar} argue that \emph{differentially private} mechanisms also offer a form of \emph{approximate truthfulness}, as the mechanism's output is only slightly perturbed when an individual unilaterally changes her data value; as a result, reporting untruthfully can only increase an individual's utility by a small amount. Xiao \cite{xiao:privacy-truthfulness}, improving upon earlier work by Nissim \emph{et al.}~\cite{approximatemechanismdesign} construct mechanisms that simultaneously achieve exact truthfulness as well as differential privacy. Eliciting private data through a \emph{survey} \cite{roth-liggett}, whereby individuals first decide whether to participate in the survey and then report their data, - also fall under the unverified database setting \cite{xiao:privacy-truthfulness}. Our work differs from the above works in that it does not capture user utility through differential privacy; crucially, we also assume that experiments conducted are \emph{tamper-proof}, and individuals can missreport their costs but not their values. + also fall under the unverified database setting \cite{xiao:privacy-truthfulness}. Our work differs from the above works in that it does not capture user utility through differential privacy; crucially, we also assume that experiments conducted are \emph{tamper-proof}, and individuals can misreport their costs but not their values. Our work is closest to Roth and Schoenebeck \cite{roth-schoenebeck}, who also consider how to sample individuals from a population to obtain an unbiased estimate of a reported value. The authors assume a prior on the joint distribution \stratis{to be continued} |
