From f28514e9eb3ed56091b3d10b8f6efc3b91a0e2b6 Mon Sep 17 00:00:00 2001 From: Stratis Ioannidis Date: Wed, 3 Jul 2013 07:04:36 -0700 Subject: up to myerson --- intro.tex | 6 +++--- main.tex | 5 ++++- notes.bib | 4 +++- problem.tex | 72 ++++++++++++++++++++++++++++++++++++++++--------------------- 4 files changed, 58 insertions(+), 29 deletions(-) diff --git a/intro.tex b/intro.tex index 0712540..85f4f3f 100644 --- a/intro.tex +++ b/intro.tex @@ -21,7 +21,7 @@ In our setting, experiments cannot be manipulated and hence measurements are rel \E{} has a total budget of $B$ to conduct all the experiments. There is a cost $c_i$ associated with experimenting on subject $i$ which varies from subject to subject. This cost $c_i$ is determined by the subject $i$: it may be viewed as the -cost $i$ incurs when tested and for which she needs to be reimbursed; or, it might be viewed as the incentive for $i$ to participate in the experiment; or, it might be the intrinsic worth of the data to the user. The economic aspect of paying subjects has always been inherent in experimental design: experimenters often work within strict budgets and design creative incentives. Subjects often negotiate better incentives or higher payments. +cost $i$ incurs when tested and for which she needs to be reimbursed; or, it might be viewed as the incentive for $i$ to participate in the experiment; or, it might be the intrinsic worth of the data to the subject. The economic aspect of paying subjects has always been inherent in experimental design: experimenters often work within strict budgets and design creative incentives. Subjects often negotiate better incentives or higher payments. However, we are not aware of a principled study of this setting from a strategic point of view, when subjects declare their costs and therefore determine their payment. Such a setting is increasingly realistic, given the growth of these experiments over the Internet and associated data markets. % When subjects are strategic, they may have an incentive to misreport their cost, leading to the need for a sophisticated choice of experiments and payments. Arguably, user incentiviation is of particular pertinence due to the extent of statistical analysis over user data on the Internet. %, which has led to the rise of several different research efforts in studying data markets \cite{...}. @@ -29,7 +29,7 @@ However, we are not aware of a principled study of this setting from a strategic Our contributions are as follows. \begin{itemize} \item -We initiate the study of experimental design problem in presence of a budget and strategic subjects. +We initiate the study of experimental design in the presence of a budget and strategic subjects. %formulate the problem of experimental design subject to a given budget, in the presence of strategic agents who may lie about their costs. %In particular, we focus on linear regression. This is naturally viewed as a budget feasible mechanism design problem, in which the objective function %is sophisticated and %is related to the covariance of the $x_i$'s. In particular, we formulate the {\em Experimental Design Problem} (\SEDP) as @@ -41,7 +41,7 @@ subject to a budget constraint $\sum_{i\in S}c_i\leq B$, where $B$ is \E's budge \smallskip The objective function, which is the key, is formally obtained by optimizing the information gain in $\beta$ when the latter is learned through ridge regression, and is related to the so-called $D$-optimality criterion~\cite{pukelsheim2006optimal,atkinson2007optimum}. \item -We present a polynomial time, truthful mechanism for \SEDP{}, yielding a constant factor ($\approx 12.98$) approximation to the optimal value of \eqref{obj}. In contrast to this, we show that no truthful, budget-feasible mechanisms are possible for \SEDP{} within a factor 2 approximation. +We present a polynomial time, $\epsilon$-truthful mechanism for \SEDP{}, yielding a constant factor ($\approx 12.98$) approximation to the optimal value of \eqref{obj}. In contrast to this, we show that no truthful, budget-feasible mechanisms are possible for \SEDP{} within a factor 2 approximation. \smallskip We note that the objective \eqref{obj} is submodular. Using this fact, applying previous results on budget feasible mechanism design under general submodular objectives~\cite{singer-mechanisms,chen} would yield either a deterministic, truthful, constant-approximation mechanism that requires exponential time, or a non-deterministic, (universally) truthful, poly-time mechanism that yields a constant approximation ratio only \emph{in expectation} (\emph{i.e.}, its approximation guarantee for a given instance may in fact be unbounded). diff --git a/main.tex b/main.tex index ff26c5d..f44926b 100644 --- a/main.tex +++ b/main.tex @@ -1,6 +1,8 @@ \label{sec:main} -In this section we present our mechanism for \SEDP. Prior approaches to budget +In this section we present our mechanism for \SEDP. +\begin{comment} +Prior approaches to budget feasible mechanisms for submodular maximization build upon the full information case, which we discuss first. @@ -24,6 +26,7 @@ the following algorithm: \;\textbf{else return}\; S_G \end{equation} has a constant approximation ratio \cite{singer-mechanisms}. +\end{comment} \subsection{Submodular Maximization in the Strategic Case} diff --git a/notes.bib b/notes.bib index 4bcc719..363436e 100644 --- a/notes.bib +++ b/notes.bib @@ -319,9 +319,11 @@ year = 2012 bibsource = {DBLP, http://dblp.uni-trier.de} } -@article{krause-submodular, +@techreport{krause-submodular, title={A note on the budgeted maximization of submodular functions}, author={Krause, A. and Guestrin, C.}, + institution= {CMU}, + number = {CMU-CALD-05-103}, year={2005} } diff --git a/problem.tex b/problem.tex index a5db2ac..c05e20c 100644 --- a/problem.tex +++ b/problem.tex @@ -3,7 +3,7 @@ The theory of experimental design \cite{pukelsheim2006optimal,atkinson2007optimum,chaloner1995bayesian} considers the following formal setting. % studies how an experimenter \E\ should select the parameters of a set of experiments she is about to conduct. In general, the optimality of a particular design depends on the purpose of the experiment, \emph{i.e.}, the quantity \E\ is trying to learn or the hypothesis she is trying to validate. Due to their ubiquity in statistical analysis, a large literature on the subject focuses on learning \emph{linear models}, where \E\ wishes to fit a linear function to the data she has collected. %Putting cost considerations aside -Suppose that an experimenter \E\ wishes to conduct $k$ among $n$ possible experiments. Each experiment $i\in\mathcal{N}\defeq \{1,\ldots,n\}$ is associated with a set of parameters (or features) $x_i\in \reals^d$, normalized so that $\|x_i\|_2\leq 1$. Denote by $S\subseteq \mathcal{N}$, where $|S|=k$, the set of experiments selected; upon its execution, experiment $i\in S$ reveals an output variable (the ``measurement'') $y_i$, related to the experiment features $x_i$ through a linear function, \emph{i.e.}, +Suppose that an experimenter \E\ wishes to conduct $k$ among $n$ possible experiments. Each experiment $i\in\mathcal{N}\defeq \{1,\ldots,n\}$ is associated with a set of parameters (or features) $x_i\in \reals^d$, normalized so that $$b\leq \|x_i\|_2\leq 1,$$ for some $b>0$. Denote by $S\subseteq \mathcal{N}$, where $|S|=k$, the set of experiments selected; upon its execution, experiment $i\in S$ reveals an output variable (the ``measurement'') $y_i$, related to the experiment features $x_i$ through a linear function, \emph{i.e.}, \begin{align}\label{model} \forall i\in\mathcal{N},\quad y_i = \T{\beta} x_i + \varepsilon_i \end{align} @@ -19,7 +19,7 @@ etc.). The magnitude of the coefficient $\beta_i$ captures the effect that featu The purpose of these experiments is to allow \E\ to estimate the model $\beta$. In particular, assume that the experimenter \E\ has a {\em prior} distribution on $\beta$, \emph{i.e.}, $\beta$ has a multivariate normal prior -with zero mean and covariance $\sigma^2R^{-1}\in \reals^{d^2}$ (where $\sigma^2$ is the noise variance). +with zero mean and covariance $\sigma^2R^{-1}\in \reals^{d^2}$ (where $\sigma^2$ is the noise variance and the prior on $\beta$). Then, \E\ estimates $\beta$ through \emph{maximum a posteriori estimation}: \emph{i.e.}, finding the parameter which maximizes the posterior distribution of $\beta$ given the observations $y_S$. Under the linearity assumption \eqref{model} and the Gaussian prior on $\beta$, maximum a posteriori estimation leads to the following maximization \cite{hastie}: \begin{align} \hat{\beta} = \argmax_{\beta\in\reals^d} \prob(\beta\mid y_S) =\argmin_{\beta\in\reals^d} \big(\sum_{i\in S} (y_i - \T{\beta}x_i)^2 @@ -33,17 +33,12 @@ This optimization, commonly known as \emph{ridge regression}, includes an additi %\begin{align} \hat{\beta} &=\max_{\beta\in\reals^d}\prob(y_S;\beta) =\argmin_{\beta\in\reals^d } \sum_{i\in S}(\T{\beta}x_i-y_i)^2 \nonumber\\ %& = (\T{X_S}X_S)^{-1}X_S^Ty_S\label{leastsquares}\end{align} %The estimator $\hat{\beta}$ is unbiased, \emph{i.e.}, $\expt{\hat{\beta}} = \beta$ (where the expectation is over the noise variables $\varepsilon_i$). Furthermore, $\hat{\beta}$ is a multidimensional normal random variable with mean $\beta$ and covariance matrix $(X_S\T{X_S})^{-1}$. -Note that the estimator $\hat{\beta}$ is a linear map of $y_S$; as $y_S$ is -a multidimensional normal r.v., so is $\hat{\beta}$ (the randomness coming from -the noise terms $\varepsilon_i$). -%In particular, $\hat{\beta}$ has %mean $\beta$% (\emph{i.e.}, it is an \emph{unbiased estimator}) and -% covariance -%$(R+\T{X_S}X_S)^{-1}$. +%Note that the estimator $\hat{\beta}$ is a linear map of $y_S$; as $y_S$ is a multidimensional normal r.v., so is $\hat{\beta}$ (the randomness coming from the noise terms $\varepsilon_i$). +%In particular, $\hat{\beta}$ has %mean $\beta$% (\emph{i.e.}, it is an \emph{unbiased estimator}) and covariance $(R+\T{X_S}X_S)^{-1}$. Let $V:2^\mathcal{N}\to\reals$ be a \emph{value function}, quantifying how informative a set of experiments $S$ is in estimating $\beta$. The classical experimental design problem amounts to finding a set $S$ that maximizes $V(S)$ subject to the constraint $|S|\leq k$. -A variety of different value functions are used in literature~\cite{pukelsheim2006optimal}. -%; almost all make use of the covariance $(R+\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. -A value function that has natural advantages is the \emph{information gain}: %\emph{$D$-optimality criterion}: %which yields the following optimization problem +A variety of different value functions are used in literature~\cite{pukelsheim2006optimal,boyd2004convex}; %; almost all make use of the covariance $(R+\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. +one that has natural advantages is the \emph{information gain}: %\emph{$D$-optimality criterion}: %which yields the following optimization problem \begin{align} V(S)= I(\beta;y_S) = \entropy(\beta)-\entropy(\beta\mid y_S). \label{informationgain} \end{align} @@ -57,7 +52,13 @@ Under the linear model \eqref{model}, and the Gaussian prior, the information ga \end{align} This value function is known in the experimental design literature as the $D$-optimality criterion -\cite{pukelsheim2006optimal,atkinson2007optimum,chaloner1995bayesian}. +\cite{pukelsheim2006optimal,atkinson2007optimum,chaloner1995bayesian}. + +Note that the estimator $\hat{\beta}$ is a linear map of $y_S$; as $y_S$ is a multidimensional normal r.v., so is $\hat{\beta}$ (the randomness coming from the noise terms $\varepsilon_i$). +In particular, $\hat{\beta}$ has %mean $\beta$% (\emph{i.e.}, it is an \emph{unbiased estimator}) and +covariance $(R+\T{X_S}X_S)^{-1}$. As such, maximizing $V(S)$ can alternatively be seen as a means of reducing the uncertainty on estimator $\hat{\beta}$ my minimizing the product of the eigenvalues of its covariance. + +%An alternative interpretation, given that $(R+ \T{X_S}X_S)^{-1}$ is the covariance of the estimator $\hat{\beta}$, is that it tries to minimize the %which is indeed a function of the covariance matrix $(R+\T{X_S}X_S)^{-1}$. %defined as $-\infty$ when $\mathrm{rank}(\T{X_S}X_S)\delta.$ The mechanism is \emph{truthful} if $\delta=0$. \item \emph{Budget Feasibility.} The sum of the payments should not exceed the budget constraint, \emph{i.e.} $\sum_{i\in\mathcal{N}} p_i(c) \leq B.\label{budget}$ @@ -186,7 +210,7 @@ Given that the full information problem \EDP{} is NP-hard, we further seek mecha As noted in \cite{singer-mechanisms, chen}, budget feasible reverse auctions are \emph{single parameter} auctions: each agent has only one private value (namely, $c_i$). As such, Myerson's Theorem \cite{myerson} gives -a characterization of truthful mechanisms. +a characterization of truthful mechanisms. NEEDS TO BE FIXED \begin{lemma}[\citeN{myerson}]\label{thm:myerson} \sloppy A normalized mechanism $\mathcal{M} = (S,p)$ for a single parameter auction is -- cgit v1.2.3-70-g09d2