| author | Thibaut Horel <thibaut.horel@gmail.com> | 2012-11-05 22:00:49 +0100 |
|---|---|---|
| committer | Thibaut Horel <thibaut.horel@gmail.com> | 2012-11-05 22:00:49 +0100 |
| commit | f9099cda78ef0229ca972ee4b8bbb0fb9fbde7e4 (patch) | |
| tree | b723e4ad1357e13c95f97d8e40b6cebfb7c05949 | |
| parent | 9c2dfdab302e7a5923336b4eed0bfa74c0c49b14 (diff) | |
| parent | c5438848e77fca83bdf022efe002204a8273a2bb (diff) | |
| download | recommendation-f9099cda78ef0229ca972ee4b8bbb0fb9fbde7e4.tar.gz | |
Merge branch 'master' of ssh://74.95.195.229:1444/git/data_value
| -rw-r--r-- | abstract.tex | 2 |
| -rw-r--r-- | general.tex | 54 |
| -rw-r--r-- | intro.tex | 8 |
| -rw-r--r-- | related.tex | 4 |
4 files changed, 26 insertions, 42 deletions
diff --git a/abstract.tex b/abstract.tex
index d79f6d6..2da5561 100644
--- a/abstract.tex
+++ b/abstract.tex
@@ -2,7 +2,7 @@ We initiate the study of mechanisms for \emph{experimental design}. In this sett
 an experimenter with a budget $B$ has access to a population of $n$ potential experiment subjects $i\in 1,\ldots,n$, each associated with a vector of features $x_i\in\reals^d$ as well as a cost $c_i>0$. Conducting an experiment with subject $i$ reveals an unknown value $y_i\in \reals$ to the experimenter. Assuming a linear relationship between $x_i$'s and $y_i$'s, \emph{i.e.}, $y_i \approx \T{\beta} x_i$, conducting the experiments and obtaining the measurements $y_i$ allows the experimenter to estimate $\beta$. The experimenter's goal is to select which experiments to conduct, subject to her budget constraint, to obtain the best estimate possible.
-We study this problem when subjects are \emph{strategic} and may lie about their costs. In particular, we formulate the {\em Experimental Design Problem} (\EDP) as finding a set $S$ of subjects that maximize $V(S) = \log\det(I_d+\sum_{i\in S}x_i\T{x_i})$ under the constraint $\sum_{i\in S}c_i\leq B$; our objective function corresponds to the information gain in $\beta$ when it is learned through linear regression methods, and is related to the so-called $D$-optimality criterion. We present the first known, polynomial time truthful mechanism for \EDP{}, yielding a constant factor ($\approx 19.68$) approximation, and show that no truthful algorithms are possible within a factor 2 approximation. Moreover, we show that a wider class of learning problems admits a polynomial time universally truthful (\emph{i.e.}, randomized) mechanism, also within a constant factor approximation.
+We study this problem when subjects are \emph{strategic} and may lie about their costs. In particular, we formulate the {\em Experimental Design Problem} (\EDP) as finding a set $S$ of subjects that maximize $V(S) = \log\det(I_d+\sum_{i\in S}x_i\T{x_i})$ under the constraint $\sum_{i\in S}c_i\leq B$; our objective function corresponds to the information gain in $\beta$ when it is learned through linear regression methods, and is related to the so-called $D$-optimality criterion. We present the first known deterministic, polynomial time truthful mechanism for \EDP{}, yielding a constant factor ($\approx 19.68$) approximation, and show that no truthful algorithms are possible within a factor 2 approximation. Moreover, we show that a wider class of learning problems admits a polynomial time universally truthful (\emph{i.e.}, randomized) mechanism, also within a constant factor approximation.
diff --git a/general.tex b/general.tex
index 957521a..65b3c8f 100644
--- a/general.tex
+++ b/general.tex
@@ -1,14 +1,15 @@
 \subsection{Bayesian Experimental Design}\label{sec:bed}
-In this section, we extend our results to Bayesian experimental design
-\cite{chaloner1995bayesian}. We show that objective function \eqref{modified}
-has a natural interpretation in this context, further motivating its selection
-as our objective. Moreover, we extend Theorem~\ref{thm:main} to a more general
-Bayesian setting.
-In the Bayesian setting, it is assumed that the experimenter has a prior
+%In this section, we extend our results to Bayesian experimental design
+%\cite{chaloner1995bayesian}. We show that objective function \eqref{modified}
+%has a natural interpretation in this context, further motivating its selection
+%as our objective. Moreover,
+
+We extend Theorem~\ref{thm:main} to a more general
+Bayesian setting where it is assumed that the experimenter \E\ has a {\em prior}
 distribution on $\beta$: in particular, $\beta$ has a multivariate normal prior
 with zero mean and covariance $\sigma^2R^{-1}\in \reals^{d^2}$ (where $\sigma^2$ is the noise variance).
-The experimenter estimates $\beta$ through \emph{maximum a posteriori estimation}: \emph{i.e.}, finding the parameter which maximizes the posterior distribution of $\beta$ given the observations $y_S$. Under the linearity assumption \eqref{model} and the Gaussian prior on $\beta$, maximum a posteriori estimation leads to the following maximization \cite{hastie}:
+\E\ estimates $\beta$ through \emph{maximum a posteriori estimation}: \emph{i.e.}, finding the parameter which maximizes the posterior distribution of $\beta$ given the observations $y_S$. Under the linearity assumption \eqref{model} and the Gaussian prior on $\beta$, maximum a posteriori estimation leads to the following maximization \cite{hastie}:
 \begin{displaymath}
 \hat{\beta} = \argmin_{\beta\in\reals^d} \sum_i (y_i - \T{\beta}x_i)^2 + \T{\beta}R\beta
@@ -27,13 +28,15 @@ Assuming normal noise variables, the information gain is equal (up to a constan
 \begin{align}
 V(S) = \frac{1}{2}\log\det(R + \T{X_S}X_S)\label{bayesianobjective}
 \end{align}
-Our objective \eqref{modified} clearly follows from \eqref{bayesianobjective}
-by setting $R=I_d$. Hence, our optimization can be interpreted as
+Our objective for \EDP\
+%\eqref{modified}
+clearly follows from \eqref{bayesianobjective}
+by setting $R=I_d$. Hence, the optimization discussed thus far can be interpreted as
 a maximization of the information gain when the prior distribution has
 a covariance $\sigma^2 I_d$, and the experimenter is solving a ridge
 regression problem with penalty term $\norm{x}_2^2$.
-Moreover, our results can be extended to the general Bayesian case, by
+Our results can be extended to the general Bayesian case, by
 replacing $I_d$ with the positive semidefinite matrix $R$. First, we re-set
 the origin of the value function so that $V(\emptyset) = 0$:
 \begin{align}\label{eq:normalized}
@@ -60,30 +63,9 @@ where $\mu$ is the smallest eigenvalue of $R$.
 \subsection{$D$-Optimality and Beyond}
-Ideally, motivated by the $D$-optimality criterion, we would like to design a mechanism that maximizes \eqref{dcrit} within a good approximation ratio. In what follows, we consider a slightly more general objective as follows:
-\begin{center}
-\textsc{ExperimentalDesignProblem} (EDP)
-\begin{subequations}
-\begin{align}
-\text{Maximize}\quad V(S) &= \log\det(I_d+\T{X_S}X_S) \label{modified} \\
-\text{subject to}\quad \sum_{i\in S} c_i&\leq B
-\end{align}\label{edp}
-\end{subequations}
-\end{center} where $I_d\in \reals^{d\times d}$ is the identity matrix.
-\junk{One possible interpretation of \eqref{modified} is that, prior to the auction, the experimenter has free access to $d$ experiments whose features form an orthonormal basis in $\reals^d$. However, \eqref{modified} can also be motivated in the context of \emph{Bayesian experimental design} \cite{chaloner1995bayesian}. In short, the objective \eqref{modified} arises naturally when the experimenter retrieves the model $\beta$ through \emph{ridge regression}, rather than the linear regression \eqref{leastsquares} over the observed data; we explore this connection in Section~\ref{sec:bed}.
-}
-
-We present our results with this version of the objective function because it is simple and captures the versions
-we will need later (including an arbitrary matrix in place of $I_d$ or a zero matrix which will correspond to ~\eqref{dcrit} ).
-
-%Note that maximizing \eqref{modified} is equivalent to maximizing \eqref{dcrit} in the full-information case. In particular, $\det(\T{X_S}X_S)> \det(\T{X_{S'}}X_{S'})$ iff $\det(I_d+\T{X_S}X_S)>\det(I_d+\T{X_{S'}}X_{S'})$. In addition,
-\EDP{} \emph{and} the corresponding problem with objective \eqref{dcrit} are NP-hard; to see this, note that \textsc{Knapsack} reduces to EDP with dimension $d=1$ by mapping the weight of each item $w_i$ to an experiment with $x_i=w_i$. Moreover, \eqref{modified} is submodular, monotone and satisfies $V(\emptyset) = 0$.
-%, allowing us to use the extensive machinery for the optimization of submodular functions, as well as recent results in the
-% context of budget feasible auctions \cite{chen,singer-mechanisms}.
-
-%\stratis{A potential problem is that all of the above properties hold also for, \emph{e.g.}, $V(S)=\log(1+\det(\T{X_S}X_S))$\ldots}
-
-As \eqref{dcrit} may take arbitrarily small negative values, to define a meaningful approximation one would consider the (equivalent) maximization of $V(S) = \det\T{X_S}X_S$. %, for some strictly increasing, on-to function $f:\reals_+\to\reals_+$.
+We now reexamine the classical $D$-optimality criterion in \eqref{dcrit}, which is~\eqref{bayesianobjective} with $R$ replaced by
+the zero matrix.
+Since \eqref{dcrit} may take arbitrarily small negative values, to define a meaningful approximation one would consider the (equivalent) maximization of $V(S) = \det\T{X_S}X_S$. %, for some strictly increasing, on-to function $f:\reals_+\to\reals_+$.
 However, the following lower bound implies that such an optimization goal cannot be attained under the constraints of truthfulness, budget feasibility, and individual rationality.
 \begin{lemma}
 For any $M>1$, there is no $M$-approximate, truthful, budget feasible, individually rational mechanism for a budget feasible reverse auction with value function $V(S) = \det{\T{X_S}X_S}$.
@@ -93,6 +75,8 @@ For any $M>1$, there is no $M$-approximate, truthful, budget feasible, individua
 \input{proof_of_lower_bound1}
 \end{proof}
+Beyond $D$-optimality,
+{\bf STRATIS, fill in..}
 \subsection{Beyond Linear Models}
 Selecting experiments that maximize the information gain in the Bayesian setup
@@ -135,7 +119,7 @@ variables is a submodular function. Thus, the value function is written in
 This lemma implies that learning an \emph{arbitrary hypothesis, under an
 arbitrary prior} when noise is conditionally independent leads to a submodular
-value function. Hence, we can apply the results from Chen \emph{et al.} to get
+value function. Hence, we can apply the previously known results to get
 the following corollary:
 \begin{corollary}
 For Bayesian experimental design with the objective given by the
diff --git a/intro.tex b/intro.tex
--- a/intro.tex
+++ b/intro.tex
@@ -6,7 +6,7 @@ known to the experimenter.
 Typically, \E\ has a hypothesis of the relationship between $x_i$'s and $y_i$'s, such as, say linear, \emph{i.e.}, $y_i \approx \T{\beta} x_i$; conducting the experiments and obtaining the measurements $y_i$ lets \E\ estimate $\beta$.
 %The goal of experimental design amounts to determining which subjects to experiment upon to produce the best possible such estimate.
 The above experimental design scenario has many applications, from medical testing to marketing research and others.
-There is a rich literature about estimation procedures, as well as for means for quantifying the quality of the produced estimate \cite{pukelsheim2006optimal}. There is also an extensive theory on how to select subjects
+There is a rich literature about estimation procedures, as well as for means of quantifying the quality of the produced estimate \cite{pukelsheim2006optimal}. There is also an extensive theory on how to select subjects
 if \E\ can conduct only a limited number of experiments, so the estimation process returns $\beta$ that approximates the true parameter of the underlying population \cite{ginebra2007measure,le1996comparison,chaloner1995bayesian,boyd2004convex}.
@@ -38,14 +38,14 @@ For specific combinatorial problems such as \textsc{Knapsack} or \textsc{Coverag
 deterministic, truthful, polynomial constant-approximation algorithms~\cite{singer-mechanisms,chen,singer-influence}, but they do not work for the linear-algebraic objective function in \EDP.
 %{\bf S+T: could we verify that the above sentence is correct in its implication?}
-We present the first known, polynomial time truthful mechanism for \EDP{}. Our mechanism is a constant factor ($\approx 19.68$) approximation for \EDP{}. From a technical perspective, we present a convex relaxation to \EDP\; and show that it is within a constant factor approximation of the multi-linear relaxation for \EDP{}. This allows us to adopt the prior work in budget feasible mechanisms by Chen \emph{et al.}~\cite{chen} and Singer~\cite{singer-influence}. In contrast to this, we show that no truthful, budget-feasible algorithms are possible for \EDP{} within a factor 2 approximation. %{\bf FIX the last sentence}
+We present the first known, polynomial time truthful mechanism for \EDP{}. Our mechanism is a constant factor ($\approx 19.68$) approximation for \EDP{}. In contrast to this, we show that no truthful, budget-feasible algorithms are possible for \EDP{} within a factor 2 approximation. From a technical perspective, we present a convex relaxation of \eqref{obj}, and show that it is within a constant factor of the so-called multi-linear relaxation of \eqref{obj}. This allows us to adopt the approach followed by prior work in budget feasible mechanisms by Chen \emph{et al.}~\cite{chen} and Singer~\cite{singer-influence}. %{\bf FIX the last sentence}
 \item Our approach to mechanisms for experimental design --- by
-  optimizing the information gain in parameters like $\beta$ which are estimated through the data analysis process --- is general. We give examples of this approach beyond linear regression to a general class that includes logistic regression and learning binary functions, and show that the corresponding budgeted mechanism design problem is also expressed through a submodular optimization. Hence, the prior work \cite{chen,singer-mechanisms} immediately applies, and gives universally truthful, polynomial timem constant factor approximation mechanisms for problems in this class. Getting deterministic truthful polynomial time mechanisms for this class or specific problems in it, like we did for \EDP, remains an open problem.
+  optimizing the information gain in parameters like $\beta$ which are estimated through the data analysis process --- is general. We give examples of this approach beyond linear regression to a general class that includes logistic regression and learning binary functions, and show that the corresponding budgeted mechanism design problem is also expressed through a submodular optimization. Hence, prior work \cite{chen,singer-mechanisms} immediately applies, and gives randomized, universally truthful, polynomial time, constant factor approximation mechanisms for problems in this class. Getting deterministic, truthful, polynomial time mechanisms with a constant approximation factor for this class or specific problems in it, like we did for \EDP, remains an open problem.
 \end{itemize}
-In what follows, we described related work in Section~\ref{sec:related}. We describe the basics of experimental design and budget feasible mechanisms in Section~\ref{sec:peel} and define \EDP\ formally. In Section~\ref{sec:main} we present our mechanism for \EDP\ and prove our main results. In~\ref{sec:ext} we present applications of out general framework.
+In what follows, we describe related work in Section~\ref{sec:related}. We briefly review experimental design and budget feasible mechanisms in Section~\ref{sec:peel} and define \EDP\ formally. In Section~\ref{sec:main} we present our mechanism for \EDP\ and prove our main results. Applications of our general framework are presented in Section~\ref{sec:ext}.
 \junk{
diff --git a/related.tex b/related.tex
index 6ecdb12..24ba359 100644
--- a/related.tex
+++ b/related.tex
@@ -15,13 +15,13 @@ a truthful, $O(\log^3 n)$-approximate mechanism \cite{dobz2011-mechanisms}
 as well as a universally truthful, $O(\frac{\log n}{\log \log n})$-approximate mechanism for subadditive maximization \cite{bei2012budget}. Moreover, in a Bayesian setup, assuming a prior distribution among the agent's costs, there exists a truthful mechanism with a 768/512-approximation ratio \cite{bei2012budget}.
 %(in terms of expectations)
- A series of recent papers \cite{mcsherrytalwar,approximatemechanismdesign,xiao:privacy-truthfulness,chen:privacy-truthfulness} consider the related problem of retrieving data from an \textit{unverified} database: the auctioneer cannot verify the data reported by individuals and therefore must incentivize them to report this truthfully.
+ A series of recent papers \cite{mcsherrytalwar,approximatemechanismdesign,xiao:privacy-truthfulness,chen:privacy-truthfulness} consider the related problem of retrieving data from an \textit{unverified} database: the auctioneer cannot verify the data reported by individuals and therefore must incentivize them to report truthfully.
 McSherry and Talwar \cite{mcsherrytalwar} argue that \emph{differentially private} mechanisms offer a form of \emph{approximate truthfulness}: if users have a utility that depends on their privacy, reporting their data untruthfully can only increase their utility by a small amount.
 Xiao \cite{xiao:privacy-truthfulness}, improving upon earlier work by Nissim \emph{et al.}~\cite{approximatemechanismdesign}, constructs mechanisms that simultaneously achieve exact truthfulness as well as differential privacy.
 Eliciting private data through a \emph{survey} \cite{roth-liggett}, whereby individuals first decide whether to participate in the survey and then report their data, also falls under the unverified database setting \cite{xiao:privacy-truthfulness}.
 In the \emph{verified} database setting, Ghosh and Roth~\cite{ghosh-roth:privacy-auction} and Dandekar \emph{et al.}~\cite{pranav} consider budgeted auctions where users have a utility again captured by differential privacy.
 Our work departs from the above setups in that utilities do not involve privacy, whose effects are assumed to be internalized in the costs reported by the users; crucially, we also assume that experiments are tamper-proof, and individuals can misreport their costs but not their values.
 \sloppy
-Our work is closest to the survey setup of Roth and Schoenebeck~\cite{roth-schoenebeck}, who also consider how to sample individuals with different features who reported a hidden value at a certain cost. The authors assume a prior on the joint distribution between costs and features, and wish to obtain an unbiased estimate of the expectation of the hidden value under the constraints of truthfulness, budget feasibility and individual rationality. Our work departs by learning a more general statistic (a linear model) than data means. We note that, as in \cite{roth-schoenebeck}, costs and features can be arbitrarily correlated (our results are prior-free).
+Our work is closest to the survey setup of Roth and Schoenebeck~\cite{roth-schoenebeck}, who also consider how to sample individuals with different features who report a hidden value at a certain cost. The authors assume a prior on the joint distribution between costs and features, and wish to obtain an unbiased estimate of the expectation of the hidden value under the constraints of truthfulness, budget feasibility and individual rationality. Our work departs by learning a more general statistic (a linear model) than data means. We note that, as in \cite{roth-schoenebeck}, costs and features can be arbitrarily correlated (our results are prior-free).
 \fussy
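
The abstract.tex hunk above defines the \EDP\ objective $V(S) = \log\det(I_d+\sum_{i\in S}x_i\T{x_i})$ under the knapsack constraint $\sum_{i\in S}c_i\leq B$. A minimal numeric sketch of that objective, assuming numpy; the names `edp_value`, `X`, and `S` are illustrative, not from the paper:

```python
# Sketch of the EDP objective from abstract.tex:
#   V(S) = log det(I_d + sum_{i in S} x_i x_i^T),  subject to  sum_{i in S} c_i <= B.
# Assumes numpy; all names are illustrative.
import numpy as np

def edp_value(X, S):
    """V(S) = log det(I_d + X_S^T X_S); rows of X are the subjects' feature vectors."""
    d = X.shape[1]
    X_S = X[S]
    # slogdet is numerically safer than log(det(...)); the matrix is positive definite
    _, logdet = np.linalg.slogdet(np.eye(d) + X_S.T @ X_S)
    return logdet

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))      # 20 candidate subjects, d = 3
print(edp_value(X, []))           # 0.0: V(empty set) = 0, as the paper requires
print(edp_value(X, [0, 4, 7]))    # value of a candidate set
```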
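The general.tex hunk derives the estimator as a MAP / generalized ridge regression and introduces the Bayesian value \eqref{bayesianobjective}. A sketch of both under those assumptions: minimizing $\sum_i (y_i - \T{\beta}x_i)^2 + \T{\beta}R\beta$ has the closed form $\hat{\beta} = (R + \T{X}X)^{-1}\T{X}y$. The $V(\emptyset)=0$ shift below is one natural normalization; the paper's eq.~\eqref{eq:normalized} itself lies outside the visible hunk, so treat it as an assumption:

```python
# Sketch of the Bayesian estimation step in general.tex.  MAP under a
# N(0, sigma^2 R^{-1}) prior solves a generalized ridge regression with
# closed form beta_hat = (R + X^T X)^{-1} X^T y.  The V(empty)=0 shift is
# an assumed normalization (eq:normalized is not visible in the hunk).
import numpy as np

def map_estimate(X, y, R):
    """Generalized ridge / MAP estimate of beta."""
    return np.linalg.solve(R + X.T @ X, X.T @ y)

def bayes_value(X_S, R):
    """(1/2) log det(R + X_S^T X_S) - (1/2) log det R, zero on the empty set."""
    _, post = np.linalg.slogdet(R + X_S.T @ X_S)
    _, prior = np.linalg.slogdet(R)
    return 0.5 * (post - prior)

rng = np.random.default_rng(0)
X, beta = rng.normal(size=(50, 3)), np.array([1.0, -2.0, 0.5])
y = X @ beta + 0.1 * rng.normal(size=50)
R = np.eye(3)                    # R = I_d recovers the EDP objective (up to the 1/2)
print(map_estimate(X, y, R))     # close to beta
print(bayes_value(X, R))         # information gain of the full set
```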
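The "Beyond Linear Models" hunk rests on the information gain being submodular (diminishing marginal returns). A numeric spot check, not a proof, reusing the illustrative `edp_value` above:

```python
# Spot check of submodularity for the log-det objective: for S subset of T
# and i not in T,  V(S + i) - V(S) >= V(T + i) - V(T).  Not a proof.
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(8, 3))
for _ in range(1000):
    T = list(rng.choice(8, size=4, replace=False))
    S = T[:2]                                  # S is a subset of T
    i = next(j for j in range(8) if j not in T)
    gain_S = edp_value(X, S + [i]) - edp_value(X, S)
    gain_T = edp_value(X, T + [i]) - edp_value(X, T)
    assert gain_S >= gain_T - 1e-9             # diminishing returns
```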
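Finally, the intro.tex hunk summarizes the mechanism as adapting the budget feasible mechanisms of Chen et al. and Singer. The sketch below shows only the standard marginal-value-per-cost greedy that such mechanisms build on; it is not the paper's mechanism and omits the budget scaling, the comparison with the best single affordable agent, and the threshold payments that make such mechanisms truthful \cite{chen,singer-mechanisms}:

```python
# Only the standard greedy ingredient behind budget feasible mechanisms in
# the style of Chen et al. and Singer -- NOT the paper's mechanism or its
# payment rule.  Subjects are added in order of marginal value per cost
# while the budget B allows; reuses the illustrative edp_value above.
def greedy_by_density(X, costs, B):
    S, spent = [], 0.0
    remaining = set(range(len(costs)))
    while remaining:
        base = edp_value(X, S)
        # most cost-effective next subject
        i = max(remaining, key=lambda j: (edp_value(X, S + [j]) - base) / costs[j])
        if spent + costs[i] > B:
            break
        S.append(i)
        spent += costs[i]
        remaining.remove(i)
    return S
```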
