related.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107

\section{Related work}
\label{sec:related}

\junk{\subsection{Experimental Design} The classic experimental design problem, which we also briefly review in Section~\ref{sec:edprelim}, deals with  which $k$ experiments to conduct among a set of $n$ possible experiments. It is a well studied problem both in the non-Bayesian \cite{pukelsheim2006optimal,atkinson2007optimum,boyd2004convex} and Bayesian setting \cite{chaloner1995bayesian}. Beyond $D$-optimality, several other objectives are encountered in the literature  \cite{pukelsheim2006optimal}; many involve some function of the covariance matrix of the estimate of $\beta$, such as $E$-optimality (maximizing the smallest eigenvalue of the covariance of $\beta$) or $T$-optimality (maximizing the trace). Our focus on $D$-optimality is motivated by both its tractability as well as its relationship to the information gain.  %are encountered in the literature, though they do not relate to entropy as $D$-optimality. We leave the task of approaching the maximization of such objectives from a strategic point of view as an open problem.
}
\paragraph{Budget Feasible Mechanisms for General Submodular Functions}
Budget feasible mechanism design was originally proposed by  \citeN{singer-mechanisms}. Singer considers the problem of maximizing an arbitrary submodular function subject to a budget constraint in the \emph{value query} model, \emph{i.e.} assuming an  oracle providing the  value of the submodular objective on any given set. 
 Singer shows that there exists a randomized, 112-approximation mechanism for submodular maximization that is \emph{universally truthful} (\emph{i.e.}, it is a randomized mechanism sampled from a distribution over truthful mechanisms). \citeN{chen}  improve this result by providing a 7.91-approximate mechanism, and show a corresponding lower bound of $2$ among universally truthful randomized mechanisms for submodular maximization.

The above approximation guarantees hold for the expected value of the
randomized mechanism: for a given
instance, the approximation ratio provided by the mechanism may in fact be
unbounded. No deterministic, truthful, constant approximation mechanism that
runs in polynomial time is presently known for submodular maximization.
However, assuming access to an oracle providing the optimum in the
full-information setup, Chen \emph{et al.},~propose a truthful, $8.34$-approximate mechanism; in cases for which the full information problem is NP-hard, as the one we consider here, this mechanism is not poly-time, unless P=NP.  Chen \emph{et al.}~also prove a $1+\sqrt{2}$ lower bound for truthful deterministic mechanisms, improving upon an earlier bound of 2 by \citeN{singer-mechanisms}. 


\paragraph{Budget Feasible Mechanism Design on Specific Problems}
Improved bounds, as well as deterministic polynomial mechanisms, are known for specific submodular objectives. For symmetric submodular functions, a truthful mechanism with approximation ratio 2 is known, and this ratio is tight \cite{singer-mechanisms}. Singer also provides a 7.32-approximate truthful mechanism for the budget feasible version of \textsc{Matching}, and a corresponding lower bound of 2 \cite{singer-mechanisms}. Improving an earlier result by Singer, \citeN{chen} give a truthful, $2+\sqrt{2}$-approximate mechanism for \textsc{Knapsack}, and a lower bound of $1+\sqrt{2}$. Finally, a truthful, 31-approximate  mechanism is also known for the budgeted version of \textsc{Coverage} \cite{singer-influence}. 

The above deterministic mechanisms for \textsc{Knapsack} \cite{chen} and \textsc{Coverage} \cite{singer-influence} follow the same general framework, which we also employ. We describe this framework in detail in Section~\ref{sec:main}. Both of these mechanisms rely on approximating the optimal solution to the underlying combinatorial problem by a well-known LP relaxation~\cite{pipage}, which can be solved exactly in polynomial time. No such  relaxation exists for \EDP, whose logarithmic objective is unlikely to be approximable through an LP. We develop instead a convex relaxation to \EDP; though, contrary to the above LP relaxations, this cannot be solved exactly, we establish that it can incorporated in the framework of \cite{chen,singer-influence} to yield a $\delta$-truthful mechanism for \EDP.  


%Our results therefore add \SEDP{} to the set of problems for which a deterministic, polynomial time, constant approximation mechanism is known.

\paragraph{Beyond Submodular Objectives}
Beyond submodular objectives, it is known that no truthful mechanism with approximation ratio smaller than $n^{1/2-\epsilon}$ exists for maximizing  fractionally subadditive functions (a class that includes submodular functions) assuming access to a value query oracle~\cite{singer-mechanisms}. Assuming access to a stronger oracle (the \emph{demand} oracle), there exists
a truthful, $O(\log^3 n)$-approximate mechanism 
\cite{dobz2011-mechanisms} as well as a universally truthful, $O(\frac{\log n}{\log \log n})$-approximate mechanism for subadditive maximization 
\cite{bei2012budget}. Moreover, in a Bayesian setup, assuming a prior distribution among the agent's costs, there exists a truthful mechanism with a 768/512-approximation ratio \cite{bei2012budget}. %(in terms of expectations)
Posted price, rather than direct revelation mechanisms, are also studied in \cite{singerposted}.


\paragraph{Data Markets}
 A series of recent papers  \cite{mcsherrytalwar,approximatemechanismdesign,xiao:privacy-truthfulness,chen:privacy-truthfulness} consider the related problem of retrieving data from an  \textit{unverified} database, where strategic users may misreport their data to ensure their privacy. %\citeN{mcsherrytalwar} argue that \emph{differentially private} mechanisms offer a form of \emph{approximate truthfulness}: if users have a utility that depends on their privacy, reporting their data untruthfully can only increase their utility by a small amount.  %\citeN{xiao:privacy-truthfulness}, improving upon earlier work by~\citeN{approximatemechanismdesign}, constructs mechanisms that simultaneously achieve exact truthfulness as well as differential privacy. 
We depart by assuming that experiment outcomes are tamper-proof and cannot be manipulated.  
A different set of papers \cite{ghosh-roth:privacy-auction,roth-liggett,pranav} consider a setting where data cannot be misreported, but the utility of users is a function of the differential privacy guarantee an analyst provides them. We do not focus on privacy; any privacy costs in our setup are internalized in the costs $c_i$.  %Eliciting private data through a \emph{survey} \cite{roth-liggett}, whereby individuals first decide whether to participate in the survey and then report their data,
% also falls under the unverified database setting \cite{xiao:privacy-truthfulness}. In the \emph{verified} database setting, \citeN{ghosh-roth:privacy-auction} and~\citeN{pranav} consider budgeted auctions where users have a utility again captured by differential privacy. Our work departs from the above setups in that utilities do not involve privacy, whose effects are assumed to be internalized in the costs reported by the users; crucially, we also assume that experiments are tamper-proof, and individuals can misreport their costs but not their values.

\sloppy
Our work is closest to the survey setup of~\citeN{roth-schoenebeck}, who also
consider how to sample individuals with different features who report a hidden
value at a certain cost. The authors assume a joint distribution between costs
$c_i$ and features $x_i$, and wish to obtain an unbiased estimate of the
expectation of the hidden value over the population, under the constraints of
truthfulness, budget feasibility and individual rationality. Our work departs
by learning a more general statistic (a linear model) rather than data means.
We note that, as in \cite{roth-schoenebeck}, costs $c_i$ and features $x_i$ can
be arbitrarily correlated in our work---the experimenter's objective \eqref{obj}  does not depend on their joint distribution.


\fussy
%\stratis{TODO: privacy discussion. Logdet objective. Should be one paragraph each.}

\begin{comment}
Two types of mechanisms: \emph{deterministic} and \emph{randomized}. For
randomized mechanisms, people seek \emph{universally truthful} mechanisms:
mechanisms which are a randomization of truthful mechanisms.

\paragraph{Symmetric submodular functions} $V(S) = g(|S|)$ where $g$ is
a concave function. Truthful deterministic mechanism with approximation ratio
of 2. This is optimal \cite{singer-mechanisms}.

\paragraph{Knapsack} deterministic: $1+\sqrt{2}\leq \alpha \leq 2 + \sqrt{2}$,
randomized: $2\leq \alpha\leq 3$ \cite{chen}

\paragraph{Matching} deterministic: $2 \leq \alpha\leq 7.32$ \cite{singer-mechanisms}

\paragraph{Coverage} deterministic: $ ?\leq\alpha\leq 31$ \cite{singer-influence}

\paragraph{Submodular function} deterministic: $1+\sqrt{2}\leq\alpha\leq ?$,
randomized: $2\leq\alpha\leq 7.91$ \cite{chen}

For wider class of functions, \cite{singer-mechanisms} proves that even for XOS
functions, the lower bound is $\sqrt{n}$ (no constant approximation) even in the
non-strategic case). To be able to say something interesting, people extend the
computational model to the \emph{demand} query model: you have access to an
oracle which given a vector of price $[p_i]_{i\in\mathcal{N}}$ returns
$S\in\argmax_{S\subseteq\mathcal{N}} V(S) - \sum_{i\in S} p_i$

\paragraph{XOS functions} fractionally additive functions: functions which can
be written as the max of a finite set of additive functions. deterministic: 768

\paragraph{Complement-free (sub-additive) objective} derministic: $O(\log^3 n)$
\cite{dobz2011-mechanisms}. randomized: $O(\frac{\log n}{\log \log n})$
\cite{bei2012budget}. More generally \cite{bei2012budget} gives a randomized
mechanism which is $O(\mathcal{I})$ where $\mathcal{I}$ is the integrality gap
of a set cover like problem which is defined from the value function. In
particular, for XOS function, this integrality gap is $O(1)$.

\paragraph{Bayesian mechanism design} when we have a distribution over the
costs of the agents. In this case the approximation guarantee is defined in
expectations. \cite{bei2012budget} $768/512$ approximation ratio for any
subadditive function.

\paragraph{Online mechanisms} \cite{learning} in the i.i.d posted price model
for symmetric submodular functions randomized $O(1)$-competitive algorithm. For
general submodular function in the secretary posted price model randomized
$O(\log n)$-competitive algorithm. In the bidding model $O(1)$-competitive
algorithm.

\stratis{What is known about the maximization of logdet in the non-strategic case? Is it NP hard? Approximation ratios?}
\thibaut{Knapsack reduces to our problem in dimension 1, hence maximizing log
det is NP-hard. The approximation ratio is at least (1-1/e) by applying the
general approximation algorithm from Sviridenko \cite{sviridenko-submodular}.}
\end{comment}