 notes.tex                                                                                                             | 496
 paper.tex                                                                                                             |  23
 papers/budget_feasible_mechanisms/BudgetFeasibleMechanisms.pdf (renamed from papers/BudgetFeasibleMechanisms.pdf)     | bin 231480 -> 231480 bytes
 papers/budget_feasible_mechanisms/HowToWinFriendsAndInfluencePeople.pdf (renamed from papers/HowToWinFriendsAndInfluencePeople.pdf) | bin 261742 -> 261742 bytes
 papers/budget_feasible_mechanisms/SODA11_054_chenn.pdf (renamed from papers/SODA11_054_chenn.pdf)                     | bin 977285 -> 977285 bytes
 proof.tex                                                                                                             | 571
 stoc_paper.pdf                                                                                                        | bin 0 -> 328249 bytes
 stoc_paper.tex                                                                                                        |  49
 8 files changed, 56 insertions(+), 1083 deletions(-)
diff --git a/notes.tex b/notes.tex
deleted file mode 100644
index 1ce4cc4..0000000
--- a/notes.tex
+++ /dev/null
@@ -1,496 +0,0 @@
-\documentclass{IEEEtran}
-%\usepackage{mathptmx}
-\usepackage[utf8]{inputenc}
-\usepackage{amsmath,amsthm,amsfonts}
-\newtheorem{lemma}{Lemma}
-\newtheorem{fact}{Fact}
-\newtheorem{example}{Example}
-\newtheorem{prop}{Proposition}
-\newcommand{\var}{\mathop{\mathrm{Var}}}
-\newcommand{\condexp}[2]{\mathop{\mathbb{E}}\left[#1|#2\right]}
-\newcommand{\expt}[1]{\mathop{\mathbb{E}}\left[#1\right]}
-\newcommand{\norm}[1]{\lVert#1\rVert}
-\newcommand{\tr}[1]{#1^*}
-\newcommand{\ip}[2]{\langle #1, #2 \rangle}
-\newcommand{\mse}{\mathop{\mathrm{MSE}}}
-\DeclareMathOperator{\trace}{tr}
-\DeclareMathOperator*{\argmax}{arg\,max}
-\title{Value of data}
-\author{Stratis Ioannidis \and Thibaut Horel}
-\begin{document}
-\maketitle
-
-\section{Introduction}
-
-The goal of this work is to propose a framework to study the value of user
-data. Although it is clear that there is some notion of value attached to
-user data (for example, user data can be used to generate revenue through
-online advertising, recommender systems, etc.), it is not clear which
-definition of value should be used in formal work on this notion. After
-proposing a definition of value, we study how generic economic problems
-behave with regard to this definition. Finally, we study the computational
-feasibility of these problems.
-
-\section{Data model}
-\label{sec:data-model}
-
-There is a set of users and an experimenter. Each user has a vector of public
-features (e.g. age, height, binary features, labels, etc.) and a private
-piece of information: an undisclosed variable.
-
-The experimenter wants to learn the relationship between the public features
-and the private variable. Before conducting the experiment, he has prior
-knowledge of this relationship, called his \emph{hypothesis}, and the
-experiment consists in selecting a set of users and asking them to reveal
-their private variables. Based on the observed data, the experimenter updates
-his hypothesis.
-
-For the experimenter, there is a notion of value attached to a group of
-users: how much their data teaches him about the hypothesis, i.e., how much
-it reduces his uncertainty about the hypothesis. For the users, there is
-a cost attached to revealing their data. The experimenter also has a budget
-constraint on the amount of money he can spend.
-
-The problems arising in this setup are natural: the experimenter wants to
-maximize his utility, namely the value of the set of users he selects, but he
-needs to compensate the users while respecting their costs and his budget
-constraint. The users' costs can either be public, which directly leads to
-combinatorial optimization problems, or private, in which case a notion of
-strategy enters into how the experimenter compensates the users, which calls
-for an auction approach.
-
-Formally, there is a set of users indexed by a set $\mathcal{I}$. The public
-feature vector of user $i\in\mathcal{I}$ is an element $x_i$ of a feature set
-$E$, his undisclosed variable is denoted by $y_i$ and belongs to some space
-$A$. The cost of user $i$ for revealing his data is a positive real number
-$c_i\in\mathbf{R}_+$. The budget of the experimenter is denoted by $B$.
-
-The prior knowledge of the experimenter takes the form of a random variable
-$H$ over $A^E$, or over a subset of $A^E$ called the \emph{hypothesis set}.
-This random variable expresses his uncertainty about the true hypothesis $h$.
-The true hypothesis gives the relationship between the feature vector of
-user $i$ and his private variable through the equation:
-\begin{equation}\label{eq:hypothesis-model}
-    y_i = h(x_i) + \varepsilon_i
-\end{equation}
-where $\varepsilon_i$ is a random variable over $A$.
-
-\emph{TODO: explain why this model is not restrictive: $y$ can always be
-written as a deterministic function of $x$ plus something independent of
-$x$.}
-
-\emph{Examples.}
-\begin{enumerate}
-    \item if the hypothesis set is finite (the experimenter has a few
-        deterministic models he wants to choose from), observing data allows
-        him to rule out parts of the hypothesis set. In this case, the
-        uncertainty of the experimenter could simply be measured by the size
-        of the hypothesis set.
-    \item if $A=\{0,1\}$ and the hypothesis set is the set of all binary
-        functions on $E$, the learning task of the experimenter is a binary
-        classification problem.
-    \item if $A=\mathbf{R}$, $E=\mathbf{R}^d$ and the hypothesis set is the
-        set of linear functions from $E$ to $A$, $\mathcal{L}(E,A)$, the
-        learning task is a linear regression. The prior knowledge of the
-        experimenter is a prior distribution on the parameters of the linear
-        function, which is equivalent to regularized linear regression (e.g.
-        ridge regression).
-\end{enumerate}
-
-\section{Economics of data}
-\label{sec:economics}
-
-The goal of this section is to further discuss the optimization problems
-mentioned in the previous section.
-
-The value function (the utility function of the experimenter) will be denoted
-by $V$; it is simply a function mapping a set of users $S\subset \mathcal{I}$
-to $V(S)\in\mathbf{R}_+$. The choice of a specific value function will be
-discussed in Section~\ref{sec:data-value}.
-
-\subsection{Optimal observation selection}
-
-In this problem, the costs $(c_i)_{i\in \mathcal{I}}$ of the users are
-public. When selecting the set $S\subset\mathcal{I}$ of users, the
-experimenter has to pay exactly $\sum_{i\in S} c_i$. Hence, the optimization
-problem consists in selecting $S^*$ defined by:
-\begin{equation}\label{eq:observation-selection}
-    S^* = \argmax_{S\subset\mathcal{I}}\Big\{V(S)\;\Big|\;
-    \sum_{i\in S} c_i \leq B\Big\}
-\end{equation}
-
-This is a set function maximization problem under a knapsack constraint. If
-$V$ can be any function, this problem is obviously NP-hard. A common
-assumption is that $V$ is submodular (see Section~\ref{sec:submodularity}),
-i.e., that it satisfies a diminishing returns property. However, maximizing
-a submodular function under a knapsack constraint is still NP-hard.
-
-Sviridenko (2004) gave a polynomial-time $(1-1/e)$-approximation algorithm
-for this problem when the value function is non-decreasing and submodular.
-
-Note that this problem covers the case where all users have the same cost
-$c$. In this case, letting $B' = \left\lfloor B/c\right\rfloor$, the problem
-becomes a maximization problem under a cardinality constraint:
-\begin{displaymath}
-    S^* = \argmax_{S\subset\mathcal{I}}\left\{V(S)\;|\; |S| \leq B'\right\}
-\end{displaymath}
-for which Nemhauser et al. (1978) gave a greedy polynomial-time
-$(1-1/e)$-approximation algorithm, optimal in this setting, in the case of
-a non-decreasing submodular function; a sketch of this greedy rule is given
-below.
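For concreteness, here is a minimal sketch (in Python, against an abstract value oracle `V`; the helper name `greedy_cardinality` is ours and purely illustrative) of the greedy rule achieving the (1-1/e) guarantee under a cardinality constraint:

    def greedy_cardinality(V, ground, k):
        """Greedy rule of Nemhauser et al.: repeatedly add the element with
        the largest marginal value V(S + [x]) - V(S). This achieves a
        (1 - 1/e)-approximation when V is non-decreasing and submodular."""
        S = []
        for _ in range(k):
            candidates = [x for x in ground if x not in S]
            if not candidates:
                break
            best = max(candidates, key=lambda x: V(S + [x]) - V(S))
            S.append(best)
        return S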
-
-\subsection{Budget feasible auctions}
-
-Here, the costs of the users are private. Before the beginning of the
-experiment, they report a cost $c_i'$ which is not necessarily equal to their
-true cost: a user may decide to lie to receive more money; however, by doing
-so, there is a risk of not being included in the experiment.
-
-This notion of strategy involved in the way the users reveal their costs
-roots this problem in auction and mechanism design theory. Formally, the
-experimenter wants to design an allocation function $f:
-\mathbf{R}_+^{\mathcal{I}} \rightarrow 2^{\mathcal{I}}$ which, given the
-reported costs of the users, selects the set of users to be included in the
-experiment, and a payment function $p: \mathbf{R}_+^{\mathcal{I}} \rightarrow
-\mathbf{R}_+^{\mathcal{I}}$ which, given the reported costs, returns the
-vector of payments to the users.
-
-For notational convenience, we will assume the costs $\{c_i,\,
-i\in\mathcal{I}\}$ of the users are given, and we will denote by $\{s_i,\,
-i\in\mathcal{I}\}$ the characteristic function of $f(\{c_i,\,
-i\in\mathcal{I}\})$, that is, $s_i = 1$ iff $i\in f(\{c_i,\,
-i\in\mathcal{I}\})$. The payment received by user $i$ will be denoted by
-$p_i$.
-
-The mechanism should satisfy the following conditions:
-\begin{itemize}
-    \item \textbf{Normalized:} if $s_i = 0$ then $p_i = 0$.
-    \item \textbf{Individually rational:} $p_i \geq s_ic_i$.
-    \item \textbf{Truthful:} $p_i - s_ic_i \geq p_i' - s_i'c_i$, where $p_i'$
-        and $s_i'$ are the payment and allocation of user $i$ had he reported
-        a cost $c_i'$ different from his true cost $c_i$ (keeping the costs
-        reported by the other users the same).
-    \item \textbf{Budget feasible:} the payments should be within budget:
-        \begin{displaymath}
-            \sum_{i\in \mathcal{I}} s_ip_i \leq B
-        \end{displaymath}
-\end{itemize}
-
-Singer (2010) proved a lower bound of 2 on the approximation ratio of any
-polynomial-time budget feasible mechanism for this problem. He also gave
-a randomized mechanism for general submodular value functions with an
-approximation ratio of 117.7, although for specific problems (coverage,
-knapsack, etc.) better ratios can be attained.
-
-State of the art: Chen, Gravin and Lu give a lower bound of $1+\sqrt{2}$ and
-an upper bound of 8.34, assuming the fractional non-strategic optimization
-problem can be solved in polynomial time.
-
-\section{Value of data}
-\label{sec:data-value}
-
-Here, we discuss a choice for the value function appearing in the problems
-discussed in Section~\ref{sec:economics}. Such a value function should be at
-least normalized, positive and non-decreasing. It should furthermore capture
-a notion of \emph{uncertainty reduction} related to the learning task of the
-experimenter.
-
-The prior knowledge of the experimenter, i.e., the distribution of the random
-variable $H$ over the hypothesis set, conveys his uncertainty about the true
-hypothesis. A common measure of uncertainty is the entropy. If we denote by
-$P_H$ the probability distribution of $H$, its entropy is defined
-by\footnote{Here we choose to write the entropy of a discrete distribution,
-but our results do not rely on this assumption: if the distribution is
-continuous, one simply needs to replace the entropy by the differential
-entropy.}:
-\begin{displaymath}
-    \mathbb{H}(H) = -\sum_{h\in A^E}P_H(h)\log\big(P_H(h)\big)
-\end{displaymath}
-
-We will denote by $Y_i$, $i\in\mathcal{I}$, the answer of user $i$ according
-to the experimenter's knowledge:
-\begin{equation}\label{eq:data-model}
-    Y_i = H(x_i) + \varepsilon_i
-\end{equation}
-and $Y_S$ will denote the set of answers from the users in a subset $S$ of
-$\mathcal{I}$.
-
-We can now define the value of a group $S$ of users as the decrease of
-entropy induced by observing $S$:
-\begin{equation}\label{eq:value}
-    \forall S\subset \mathcal{I},\; V(S)
-    = \mathbb{H}(H) - \mathbb{H}(H\,|\,Y_S)
-\end{equation}
-where $\mathbb{H}(H\,|\,Y_S)$ is the conditional entropy of $H$ given $Y_S$.
-One can also note that the value defined in \eqref{eq:value} is simply the
-mutual information $I(H;Y_S)$ between $H$ and $Y_S$. Submodularity is
-preserved by left composition with a non-decreasing concave function (see
-Section~\ref{sec:submodularity}). Hence, the definition of the value of $S$
-can be extended to any $f\big(V(S)\big)$ where $f$ is a non-decreasing
-concave function.
-
-This notion of value, also known as the \emph{value of information} (TODO:
-ref), can be motivated by the information theoretic interpretation of the
-entropy: suppose the experimenter has access to an oracle to whom he can ask
-yes/no questions. Then the entropy of the distribution is exactly the
-expected number of questions he needs to ask to fully identify the
-hypothesis. If he needs to pay for each question asked to the oracle, our
-definition of value directly relates to the cost decrease implied by
-observing a set of users.
-
-Using the \emph{information never hurts} principle, it is easy to see that
-the value function defined by \eqref{eq:value} is positive and non-decreasing
-(with regard to inclusion).
-
-Furthermore, if we add the natural assumption that
-$(\varepsilon_i)_{i\in\mathcal{I}}$ are jointly independent, which is
-equivalent to saying that, conditioned on the hypothesis, the
-$(Y_i)_{i\in\mathcal{I}}$ are independent, we get the following fact.
-
-\begin{fact}\label{value-submodularity}
-    Assuming that the $(Y_i)_{i\in\mathcal{I}}$ defined in
-    \eqref{eq:data-model} are independent conditioned on $H$, the function
-    $V$ defined in \eqref{eq:value} is submodular.
-\end{fact}
-
-\begin{proof}
-    Using the chain rule, one can rewrite the value of $S$ as:
-    \begin{displaymath}
-        V(S) = \mathbb{H}(Y_S) - \mathbb{H}(Y_S\,|\, H)
-    \end{displaymath}
-    We can now use the conditional independence of the $(Y_i)_{i\in S}$ to
-    write the conditional entropy as a sum:
-    \begin{displaymath}
-        V(S) = \mathbb{H}(Y_S) - \sum_{s\in S}\mathbb{H}(Y_s\,|\, H)
-    \end{displaymath}
-
-    It is well known that the joint entropy of a set of random variables is
-    submodular. Thus the last equation expresses $V$ as the sum of
-    a submodular function and a modular (additive) function. As
-    a consequence, $V$ is submodular.
-\end{proof}
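The following small numerical sketch (Python with NumPy; the instance, with three hypotheses and binary answers flipped with probability `q`, is ours and purely illustrative) computes the value of \eqref{eq:value} by enumeration on a finite hypothesis set and checks the monotonicity and diminishing returns properties stated above:

    import numpy as np
    from itertools import product

    hypotheses = [lambda x: 0, lambda x: x % 2, lambda x: 1]
    prior = np.array([0.5, 0.3, 0.2])   # distribution of H
    features = [0, 1, 2, 3]             # public features x_i
    q = 0.1                             # noise: P(Y_i != H(x_i))

    def info_value(S):
        """V(S) = H(Y_S) - H(Y_S | H), i.e. the mutual information I(H; Y_S)."""
        h_y, h_y_given_h = 0.0, 0.0
        for y in product([0, 1], repeat=len(S)):
            # P(y | h) factorizes: the noise is independent across users
            p_y_h = [np.prod([q if y[k] != h(features[i]) else 1 - q
                              for k, i in enumerate(S)]) for h in hypotheses]
            p_y = float(np.dot(prior, p_y_h))
            h_y -= p_y * np.log2(p_y)
            h_y_given_h -= sum(p * l * np.log2(l) for p, l in zip(prior, p_y_h))
        return h_y - h_y_given_h

    # "information never hurts": V is non-negative and non-decreasing
    assert info_value([0, 1]) >= info_value([0]) >= 0
    # diminishing returns: the marginal value of user 2 shrinks as S grows
    assert (info_value([0, 2]) - info_value([0])
            >= info_value([0, 1, 2]) - info_value([0, 1]))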
-
-\section{Value of data in the linear regression setup}
-
-In this section we will assume a linear model: the feature vectors belong to
-$\mathbf{R}^d$ and the private variables belong to $\mathbf{R}$. The private
-variable is a linear combination of the features plus noise:
-\begin{displaymath}
-    y = \beta^*x + \varepsilon
-\end{displaymath}
-The noise $\varepsilon$ is normally distributed, zero-mean, of variance
-$\sigma^2$, and independent across users.
-
-The hypothesis set of the experimenter is the set of all linear forms from
-$\mathbf{R}^d$ to $\mathbf{R}$. Because a linear form can be uniquely
-represented as the inner product with a vector, it is equivalent to say that
-the hypothesis set is $\mathbf{R}^d$ and that the experimenter's hypothesis
-is a random variable $\beta$ over $\mathbf{R}^d$.
-
-A common assumption made in linear regression is that $\beta$ is normally
-distributed, zero-mean, with covariance matrix $\Sigma$. This prior
-distribution conveys the idea that $\beta$ should have a small $L^2$ norm:
-the coefficients are shrunk towards zero. Indeed, it is easy to prove that
-choosing the $\hat\beta$ which maximizes the \emph{a posteriori} distribution
-given the observations, under a normal prior with $\Sigma = \kappa I_d$, is
-equivalent to solving the following optimization problem:
-\begin{displaymath}
-    \hat\beta = \arg\min_{\beta\in\mathbf{R}^d}\|Y-X\beta\|^2 + \lambda
-    \|\beta\|^2
-\end{displaymath}
-where $\lambda = \sigma^2/\kappa$ (a general $\Sigma$ yields the penalty
-$\sigma^2\beta^*\Sigma^{-1}\beta$ instead). This optimization problem is
-known as ridge regression: a least-squares fit of the data with a penalty on
-the vectors $\beta$ of large $L^2$ norm, which is consistent with the prior
-distribution; a numerical check of this equivalence is sketched below.
-
-\begin{fact}
-    Under the linear regression model, with a multivariate normal prior, the
-    value of a set $S$ of users is given by:
-    \begin{equation}\label{eq:linear-regression-value}
-        V(S) = \frac{1}{2}\log\det\left(I_d
-        + \frac{\Sigma}{\sigma^2}X_S^*X_S\right)
-    \end{equation}
-    where $X_S$ is the matrix whose rows are the row vectors $x_s^*$ for $s$
-    in $S$.
-\end{fact}
-
-\begin{proof}
-Let us recall that the entropy of a multivariate normal variable $B$ in
-dimension $d$ with covariance matrix $\Sigma$ (the mean is not relevant) is
-given by:
-\begin{equation}\label{eq:multivariate-entropy}
-    \mathbb{H}(B) = \frac{1}{2}\log\big((2\pi e)^d \det \Sigma\big)
-\end{equation}
-
-Using the chain rule as in the proof of Fact~\ref{value-submodularity} we
-get that:
-\begin{displaymath}
-    V(S) = \mathbb{H}(Y_S) - \mathbb{H}(Y_S\,|\,\beta)
-\end{displaymath}
-
-Conditioned on $\beta$, $Y_S$ follows a multivariate normal distribution of
-mean $X_S\beta$ and covariance matrix $\sigma^2 I_n$, where $n = |S|$. Hence:
-\begin{equation}\label{eq:h1}
-    \mathbb{H}(Y_S\,|\,\beta)
-    = \frac{1}{2}\log\left((2\pi e)^n \det(\sigma^2I_n)\right)
-\end{equation}
-
-$Y_S$ also follows a multivariate normal distribution of mean zero. Let us
-compute its covariance matrix $\Sigma_Y$, denoting by $\varepsilon$ the noise
-vector:
-\begin{align*}
-    \Sigma_Y & = \expt{Y_SY_S^*} = \expt{(X_S\beta + \varepsilon)(X_S\beta
-    + \varepsilon)^*}\\
-    & = X_S\Sigma X_S^* + \sigma^2I_n
-\end{align*}
-Thus, we get that:
-\begin{equation}\label{eq:h2}
-    \mathbb{H}(Y_S)
-    = \frac{1}{2}\log\left((2\pi e)^n \det(X_S\Sigma X_S^*
-    + \sigma^2 I_n)\right)
-\end{equation}
-
-Combining \eqref{eq:h1} and \eqref{eq:h2} we get:
-\begin{displaymath}
-    V(S) = \frac{1}{2}\log\det\left(I_n+\frac{1}{\sigma^2}X_S\Sigma
-    X_S^*\right)
-\end{displaymath}
-
-Finally, Sylvester's determinant formula, $\det(I_n+MN)=\det(I_d+NM)$, gives
-the result.
-\end{proof}
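As a sanity check of the MAP/ridge equivalence above, the following sketch (Python, assuming NumPy; the isotropic prior $\Sigma = \kappa I_d$ and all constants are illustrative) verifies that the posterior mean under the Gaussian prior coincides with the ridge estimator with $\lambda = \sigma^2/\kappa$:

    import numpy as np

    rng = np.random.default_rng(0)
    n, d, kappa, sigma2 = 20, 4, 2.0, 0.25
    X = rng.normal(size=(n, d))
    beta = rng.normal(scale=np.sqrt(kappa), size=d)
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)

    # ridge estimator with lambda = sigma^2 / kappa
    lam = sigma2 / kappa
    ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    # posterior mean of beta given y (Gaussian conjugacy)
    post = np.linalg.solve(X.T @ X / sigma2 + np.eye(d) / kappa,
                           X.T @ y / sigma2)
    assert np.allclose(ridge, post)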
-
-\emph{Remarks.}
-\begin{enumerate}
-    \item it is known that for a set of symmetric positive definite matrices,
-        defining the value of a set to be the $\log\det$ of the sum of the
-        matrices yields a non-decreasing, submodular value function. Noting
-        that:
-        \begin{displaymath}
-            X_S^*X_S = \sum_{s\in S}x_sx_s^*
-        \end{displaymath}
-        it is clear that our value function is non-decreasing and submodular.
-        The positivity follows from a direct application of the spectral
-        theorem.
-    \item the matrix which appears in the value function:
-        \begin{displaymath}
-            I_d + \frac{\Sigma}{\sigma^2}X_S^*X_S
-        \end{displaymath}
-        is also the inverse of the covariance matrix of the ridge regression
-        estimator. In optimal experiment design, it is common to use the
-        determinant of the inverse of the estimator's covariance matrix as
-        a measure of the quality of the prediction (the $D$-optimality
-        criterion). Indeed, this directly relates to the inverse of the
-        volume of the confidence ellipsoid.
-    \item this value function can be computed up to a fixed decimal precision
-        in polynomial time.
-\end{enumerate}
-\subsubsection*{Marginal contribution}
-
-Here, we want to compute the marginal contribution of a point $x$ to a set
-$S$ of users:
-\begin{displaymath}
-    \Delta_xV(S) = V(S\cup \{x\}) - V(S)
-\end{displaymath}
-
-Using that:
-\begin{displaymath}
-    X_{S\cup\{x\}}^*X_{S\cup\{x\}} = X_S^*X_S + xx^*
-\end{displaymath}
-we get:
-\begin{align*}
-    \Delta_x V(S) & = \frac{1}{2}\log\det\left(I_d
-    + \Sigma\frac{X_S^*X_S}{\sigma^2} + \Sigma\frac{xx^*}{\sigma^2}\right)\\
-    & - \frac{1}{2}\log\det\left(I_d + \Sigma\frac{X_S^*X_S}{\sigma^2}\right)\\
-    & = \frac{1}{2}\log\det\left(I_d + xx^*\left(\sigma^2\Sigma^{-1}
-    + X_S^*X_S\right)^{-1}\right)\\
-    & = \frac{1}{2}\log\left(1 + x^*\left(\sigma^2\Sigma^{-1}
-    + X_S^*X_S\right)^{-1}x\right)
-\end{align*}
-where the last two equalities use $\det(I+MN)=\det(I+NM)$ and Sylvester's
-determinant formula.
-
-\emph{Remark.} This formula shows that, given a set $S$, users do not all
-bring the same contribution. The contribution of $x$ depends on its norm
-under the bilinear form defined by the matrix $(\sigma^2\Sigma^{-1}
-+ X_S^*X_S)^{-1}$, which reflects how well the new point $x$ \emph{aligns}
-with the already observed points.
-
-\section*{Appendix: Submodularity}
-\label{sec:submodularity}
-
-In this section, we will consider that we are given a \emph{universe} set
-$U$. A set function $f$ is a function defined on the power set of $U$,
-$\mathfrak{P}(U)$.
-
-A set function $f$ defined on $\mathfrak{P}(U)$ is said to be
-\emph{non-decreasing} if it is non-decreasing with regard to inclusion, that
-is:
-\begin{displaymath}
-    \forall\,S\subseteq T,\quad f(S)\leq f(T)
-\end{displaymath}
-A \emph{non-increasing} function on $\mathfrak{P}(U)$ is defined similarly.
-
-A set function $f$ defined on $\mathfrak{P}(U)$ is said to be
-\emph{submodular} if it satisfies the diminishing returns property, that is,
-if the marginal increment from adding a point to a set is a non-increasing
-set function. More formally, for any point $x$ in $U$, the marginal increment
-of $f$ at $x$ is the set function defined as:
-\begin{displaymath}
-    \Delta_x f(S) = f(S\cup\{x\}) - f(S)
-\end{displaymath}
-Then, $f$ is \emph{submodular} iff for all $x$ in $U$, $\Delta_x f$ is
-a non-increasing set function. Similarly, a \emph{supermodular} function is
-one whose marginal increments are non-decreasing set functions.
-
-\begin{prop}
-    Let $R:\mathbf{R}\rightarrow \mathbf{R}$ be a non-decreasing concave
-    function and $f:\mathfrak{P}(U)\rightarrow\mathbf{R}$ be a non-decreasing
-    submodular function. Then the composed function $R\circ f$ is
-    non-decreasing and submodular.
-\end{prop}
-
-\begin{proof}
-    The monotonicity of $R\circ f$ follows immediately from the monotonicity
-    of $R$ and $f$.
-
-    For the submodularity, let $S$ and $T$ be two sets such that $S\subseteq
-    T$, let $V$ be any set, and write $\delta_S = f(S\cup V) - f(S)$ and
-    $\delta_T = f(T\cup V) - f(T)$. By monotonicity of $f$, $f(S)\leq f(T)$;
-    by submodularity of $f$, $\delta_S \geq \delta_T \geq 0$.
-
-    Since $R$ is concave, its increments over intervals of equal length are
-    larger further to the left:
-    \begin{displaymath}
-        R\big(f(T)+\delta_T\big)-R\big(f(T)\big)
-        \leq R\big(f(S)+\delta_T\big)-R\big(f(S)\big)
-    \end{displaymath}
-    and since $R$ is non-decreasing and $\delta_T\leq\delta_S$:
-    \begin{displaymath}
-        R\big(f(S)+\delta_T\big)-R\big(f(S)\big)
-        \leq R\big(f(S)+\delta_S\big)-R\big(f(S)\big)
-    \end{displaymath}
-    Combining the two inequalities yields:
-    \begin{displaymath}
-        R\big(f(T\cup V)\big)-R\big(f(T)\big)
-        \leq R\big(f(S\cup V)\big)-R\big(f(S)\big)
-    \end{displaymath}
-    which is exactly the submodularity of $R\circ f$.
-\end{proof}
-
-\end{document}
diff --git a/paper.tex b/paper.tex
@@ -1,28 +1,19 @@
-\documentclass{acm_proc_article-sp}
+\documentclass{article}
 \usepackage[utf8]{inputenc}
-\usepackage{amsmath,amsfonts}
+\usepackage{amsmath,amsfonts,amsthm}
 \usepackage{algorithm}
 \usepackage{algpseudocode,bbm,color,verbatim}
 \input{definitions}
 \title{Budget Feasible Mechanisms for Experimental Design}
-
+\author{
+Thibaut Horel\\Technicolor\\\texttt{thibaut.horel@ens.fr} \and
+Stratis Ioannidis\\Technicolor\\\texttt{stratis.ioannidis@technicolor.com}\and
+S. Muthukrishnan\\Rutgers University, Microsoft Research\\\texttt{muthu@cs.rutgers.edu}}
 %Remove permission block empty space
 \makeatletter
 \let\@copyrightspace\relax
 \makeatother
-\numberofauthors{3}
-\author{
-\alignauthor Thibaut Horel\\ %\titlenote{\texttt{thibaut.horel@ens.fr}}\\
-    \affaddr{Technicolor}\\
-    \email{thibaut.horel@ens.fr}
-\alignauthor Stratis Ioannidis\\ % \titlenote{\texttt{stratis.ioannidis@technicolor.com}}\\
-    \affaddr{Technicolor}\\
-    \email{stratis.ioannidis\\@technicolor.com}
-\alignauthor S. Muthukrishnan\\% \titlenote{\texttt{muthu@cs.rutgers.edu}}\\
-    \affaddr{Rutgers University, Microsoft Research}\\
-    \email{muthu@cs.rutgers.edu}
-}
 \begin{document}
 
 \maketitle
@@ -44,6 +35,6 @@
 \bibliographystyle{abbrv}
 \bibliography{notes}
 \end{small}
-\appendix
+\section*{Appendix}
 \input{appendix}
 \end{document}
diff --git a/papers/BudgetFeasibleMechanisms.pdf b/papers/budget_feasible_mechanisms/BudgetFeasibleMechanisms.pdf
index 76f3902..76f3902 100644
--- a/papers/BudgetFeasibleMechanisms.pdf
+++ b/papers/budget_feasible_mechanisms/BudgetFeasibleMechanisms.pdf
Binary files differ
diff --git a/papers/HowToWinFriendsAndInfluencePeople.pdf b/papers/budget_feasible_mechanisms/HowToWinFriendsAndInfluencePeople.pdf
index 2d35e81..2d35e81 100644
--- a/papers/HowToWinFriendsAndInfluencePeople.pdf
+++ b/papers/budget_feasible_mechanisms/HowToWinFriendsAndInfluencePeople.pdf
Binary files differ
diff --git a/papers/SODA11_054_chenn.pdf b/papers/budget_feasible_mechanisms/SODA11_054_chenn.pdf
index 5946a60..5946a60 100644
--- a/papers/SODA11_054_chenn.pdf
+++ b/papers/budget_feasible_mechanisms/SODA11_054_chenn.pdf
Binary files differ
diff --git a/proof.tex b/proof.tex
deleted file mode 100644
index bc57ca0..0000000
--- a/proof.tex
+++ /dev/null
@@ -1,571 +0,0 @@
-\documentclass{acm_proc_article-sp}
-\usepackage[utf8]{inputenc}
-\usepackage{amsmath,amsfonts}
-\usepackage{algorithm}
-\usepackage{algpseudocode}
-\newtheorem{lemma}{Lemma}
-\newtheorem{fact}{Fact}
-\newtheorem{example}{Example}
-\newtheorem{prop}{Proposition}
-\newtheorem{theorem}{Theorem}
-\newcommand*{\defeq}{\stackrel{\text{def}}{=}}
-\newcommand{\var}{\mathop{\mathrm{Var}}}
-\newcommand{\condexp}[2]{\mathop{\mathbb{E}}\left[#1|#2\right]}
-\newcommand{\expt}[1]{\mathop{\mathbb{E}}\left[#1\right]}
-\newcommand{\norm}[1]{\lVert#1\rVert}
-\newcommand{\tr}[1]{#1^*}
-\newcommand{\ip}[2]{\langle #1, #2 \rangle}
-\newcommand{\mse}{\mathop{\mathrm{MSE}}}
-\DeclareMathOperator{\trace}{tr}
-\DeclareMathOperator*{\argmax}{arg\,max}
-\DeclareMathOperator*{\argmin}{arg\,min}
-\begin{document}
-
-\section{Budget feasible mechanism}
-
-\subsection{Notations}
-
-\begin{itemize}
-    \item If $x\in\mathbf{R}^d$, $x^*$ will denote the transpose of vector
-        $x$. If $(x, y)\in\mathbf{R}^d\times\mathbf{R}^d$, $x^*y$ is the
-        standard inner product of $\mathbf{R}^d$.
-    \item We will often use the following order over symmetric matrices: if
-        $A$ and $B$ are two $d\times d$ real symmetric matrices, we write
-        $A\leq B$ iff:
-        \begin{displaymath}
-            \forall x\in\mathbf{R}^d,\quad x^*Ax \leq x^*Bx
-        \end{displaymath}
-        That is, iff $B-A$ is symmetric positive semi-definite.
-    \item Let us recall the following property of the order defined above:
-        if $A$ and $B$ are two symmetric positive definite matrices such
-        that $A\leq B$, then $B^{-1}\leq A^{-1}$ (the inverse reverses the
-        order).
-\end{itemize}
-
-\subsection{Data model}
-\begin{itemize}
-    \item set of $n$ users $\mathcal{N} = \{1,\ldots, n\}$
-    \item each user has a public vector of features $x_i\in\mathbf{R}^d$ and
-        some undisclosed variable $y_i\in\mathbf{R}$
-    \item we assume that the data has been normalized: $\forall
-        i\in\mathcal{N}$, $\norm{x_i}\leq 1$
-    \item \textbf{Ridge regression model:}
-    \begin{itemize}
-        \item $y_i = \beta^*x_i + \varepsilon_i$
-        \item $\varepsilon_i \sim \mathcal{N}(0,\sigma^2)$,
-            $(\varepsilon_i)_{i\in \mathcal{N}}$ are mutually independent.
-        \item prior knowledge of $\beta$: $\beta\sim\mathcal{N}(0,\kappa I_d)$
-        \item $\mu = \frac{\kappa}{\sigma^2}$ is the regularization factor.
-        \item after the variables $y_i$ are disclosed, the experimenter
-            performs \emph{maximum a posteriori} estimation which, under
-            a Gaussian prior, is equivalent to solving the ridge regression
-            minimization problem:
-            \begin{displaymath}
-                \hat\beta = \argmin_{\beta\in\mathbf{R}^d}
-                \sum_i (y_i - \beta^*x_i)^2
-                + \frac{1}{\mu}\norm{\beta}_2^2
-            \end{displaymath}
-    \end{itemize}
-\end{itemize}
-
-\subsection{Economics}
-
-Value function: following experimental design theory, the value of data is
-the decrease of uncertainty about the model. A common measure of uncertainty
-is the entropy. Thus, the value of a set $S$ of users is:
-\begin{displaymath}
-    V(S) = \mathbb{H}(\beta) - \mathbb{H}(\beta|Y_S)
-\end{displaymath}
-where $Y_S = \{y_i,\,i\in S\}$ is the set of observations.
-
-In our case (under the Gaussian prior):
-\begin{align*}
-    \forall S\subset\mathcal{N},\; V(S)
-    & = \frac{1}{2}\log\det\left(I_d
-    + \mu\sum_{i\in S} x_ix_i^*\right)\\
-    & \defeq \frac{1}{2}\log\det A(S)
-\end{align*}
-
-\begin{lemma}[Marginal contribution]
-    \begin{displaymath}
-        \Delta_i V(S)\defeq V(S\cup\{i\}) - V(S)
-        = \frac{1}{2}\log\left(1 + \mu x_i^*A(S)^{-1}x_i\right)
-    \end{displaymath}
-\end{lemma}
-
-\begin{proof}
-    We have:
-    \begin{align*}
-        V(S\cup\{i\}) & = \frac{1}{2}\log\det A(S\cup\{i\})\\
-        & = \frac{1}{2}\log\det\left(A(S) + \mu x_i x_i^*\right)\\
-        & = V(S) + \frac{1}{2}\log\det\left(I_d + \mu A(S)^{-1}x_i
-        x_i^*\right)\\
-        & = V(S) + \frac{1}{2}\log\left(1 + \mu x_i^* A(S)^{-1}x_i\right)
-    \end{align*}
-    where the last equality comes from Sylvester's determinant formula.
-\end{proof}
-
-\begin{itemize}
-    \item each user $i$ has a cost $c_i$
-    \item the auctioneer has a budget constraint $B$
-    \item optimization problem:
-        \begin{displaymath}
-            OPT(V,\mathcal{N}, B) = \max_{S\subset\mathcal{N}} \left\{
-            V(S)\,|\, \sum_{i\in S}c_i\leq B\right\}
-        \end{displaymath}
-\end{itemize}
-
-\subsection{Relaxations of the value function}
-
-We say that $R_\mathcal{N}:[0,1]^n\rightarrow\mathbf{R}$ is a relaxation of
-the value function $V$ over $\mathcal{N}$ if it coincides with $V$ at binary
-points. Formally, for any $S\subset\mathcal{N}$, let $\mathbf{1}_S$ denote
-the indicator vector of $S$. $R_\mathcal{N}$ is a relaxation of $V$ over
-$\mathcal{N}$ iff:
-\begin{displaymath}
-    \forall S\subset\mathcal{N},\; R_\mathcal{N}(\mathbf{1}_S) = V(S)
-\end{displaymath}
-
-We can extend the optimization problem defined above to a relaxation by
-extending the cost function:
-\begin{displaymath}
-    \forall \lambda\in[0,1]^n,\; c(\lambda)
-    = \sum_{i\in\mathcal{N}}\lambda_ic_i
-\end{displaymath}
-The optimization problem becomes:
-\begin{displaymath}
-    OPT(R_\mathcal{N}, B) =
-    \max_{\lambda\in[0,1]^n}\left\{R_\mathcal{N}(\lambda)\,|\,
-    c(\lambda)\leq B\right\}
-\end{displaymath}
-The relaxations we will consider here rely on defining a probability
-distribution over subsets of $\mathcal{N}$.
-
-Let $\lambda\in[0,1]^n$ and define:
-\begin{displaymath}
-    P_\mathcal{N}^\lambda(S) = \prod_{i\in S}\lambda_i
-    \prod_{i\in\mathcal{N}\setminus S}(1-\lambda_i)
-\end{displaymath}
-$P_\mathcal{N}^\lambda(S)$ is the probability of picking the set $S$ when we
-select a random subset of $\mathcal{N}$ by deciding independently, for each
-point $i$, to include it with probability $\lambda_i$ (and to exclude it
-with probability $1-\lambda_i$).
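The closed forms above are easy to check numerically. The sketch below (Python, assuming NumPy; the random instance is illustrative) implements $V$, verifies the marginal contribution lemma, and samples a set from $P_\mathcal{N}^\lambda$; the helper `value` is reused by the sketches that follow:

    import numpy as np

    def value(S, X, mu):
        """V(S) = 1/2 log det(I_d + mu * sum_{i in S} x_i x_i^*)."""
        d = X.shape[1]
        A = np.eye(d) + mu * X[list(S)].T @ X[list(S)]
        return 0.5 * np.linalg.slogdet(A)[1]

    def marginal(i, S, X, mu):
        """Closed form of the lemma: 1/2 log(1 + mu x_i^* A(S)^{-1} x_i)."""
        d = X.shape[1]
        A = np.eye(d) + mu * X[list(S)].T @ X[list(S)]
        return 0.5 * np.log1p(mu * X[i] @ np.linalg.solve(A, X[i]))

    rng = np.random.default_rng(0)
    n, d, mu = 6, 3, 2.0
    X = rng.normal(size=(n, d))
    X /= np.maximum(1.0, np.linalg.norm(X, axis=1, keepdims=True))  # ||x_i|| <= 1

    S, i = [0, 2, 3], 4
    assert np.isclose(value(S + [i], X, mu) - value(S, X, mu),
                      marginal(i, S, X, mu))

    # S ~ P^lambda: include each i independently with probability lambda_i
    lam = rng.uniform(size=n)
    sample = [j for j in range(n) if rng.uniform() < lam[j]]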
-
-We will consider two relaxations of the value function $V$ over
-$\mathcal{N}$:
-\begin{itemize}
-    \item the \emph{multi-linear extension} of $V$:
-        \begin{align*}
-            F_\mathcal{N}(\lambda)
-            & = \mathbb{E}_{S\sim P_\mathcal{N}^\lambda}\big[V(S)\big]
-            = \sum_{S\subset\mathcal{N}} P_\mathcal{N}^\lambda(S) V(S)\\
-            & = \frac{1}{2}\sum_{S\subset\mathcal{N}}
-            P_\mathcal{N}^\lambda(S) \log\det A(S)
-        \end{align*}
-    \item the \emph{concave relaxation} of $V$:
-        \begin{align*}
-            L_{\mathcal{N}}(\lambda)
-            & = \frac{1}{2}\log\det \mathbb{E}_{S\sim
-            P_\mathcal{N}^\lambda}\big[A(S)\big]\\
-            & = \frac{1}{2}\log\det\left(I_d + \mu\sum_{i\in\mathcal{N}}
-            \lambda_ix_ix_i^*\right)
-            \defeq \frac{1}{2}\log\det \tilde{A}(\lambda)
-        \end{align*}
-\end{itemize}
-Both coincide with $V$ at binary points, hence both are indeed relaxations
-of $V$.
-
-\begin{lemma}
-    The \emph{concave relaxation} $L_\mathcal{N}$ is concave\footnote{Hence
-    this relaxation is well-named!}.
-\end{lemma}
-
-\begin{proof}
-    This follows from the concavity of the $\log\det$ function over
-    symmetric positive definite matrices. More precisely, if $A$ and $B$ are
-    two symmetric positive definite matrices, then:
-    \begin{multline*}
-        \forall\alpha\in [0, 1],\; \log\det\big(\alpha A + (1-\alpha) B\big)\\
-        \geq \alpha\log\det A + (1-\alpha)\log\det B
-    \end{multline*}
-\end{proof}
-
-\begin{lemma}[Rounding]\label{lemma:rounding}
-    For any feasible $\lambda\in[0,1]^n$, there exists a feasible
-    $\bar{\lambda}\in[0,1]^n$ such that at most one of its components is
-    fractional, that is, lies in $(0,1)$, and:
-    \begin{displaymath}
-        F_{\mathcal{N}}(\lambda)\leq F_{\mathcal{N}}(\bar{\lambda})
-    \end{displaymath}
-\end{lemma}
-
-\begin{proof}
-    We give a rounding procedure which, given a feasible $\lambda$ with at
-    least two fractional components, returns a feasible $\lambda'$ with one
-    less fractional component such that:
-    \begin{displaymath}
-        F_\mathcal{N}(\lambda) \leq F_\mathcal{N}(\lambda')
-    \end{displaymath}
-    Applying this procedure recursively yields the lemma's result.
-
-    Let us consider such a feasible $\lambda$. Let $i$ and $j$ be two
-    fractional components of $\lambda$ and let us define the following
-    function:
-    \begin{displaymath}
-        F_\lambda(\varepsilon) = F(\lambda_\varepsilon)
-        \quad\textrm{where} \quad
-        \lambda_\varepsilon = \lambda
-        + \varepsilon\left(e_i-\frac{c_i}{c_j}e_j\right)
-    \end{displaymath}
-    Note that $c(\lambda_\varepsilon) = c(\lambda)$: the cost is preserved
-    along this direction. It is then easy to see that if $\lambda$ is
-    feasible:
-    \begin{multline}\label{eq:convex-interval}
-        \forall\varepsilon\in\Big[\max\Big(-\lambda_i,
-        (\lambda_j-1)\frac{c_j}{c_i}\Big), \min\Big(1-\lambda_i,
-        \lambda_j\frac{c_j}{c_i}\Big)\Big],\;\\
-        \lambda_\varepsilon\;\;\textrm{is feasible}
-    \end{multline}
-
-    Furthermore, the function $F_\lambda$ is convex. Indeed:
-    \begin{align*}
-        F_\lambda(\varepsilon)
-        & = \mathbb{E}_{S'\sim P_{\mathcal{N}\setminus\{i,j\}}^\lambda}\Big[
-        (\lambda_i+\varepsilon)\Big(\lambda_j
-        -\varepsilon\frac{c_i}{c_j}\Big)V(S'\cup\{i,j\})\\
-        & + (\lambda_i+\varepsilon)\Big(1-\lambda_j
-        +\varepsilon\frac{c_i}{c_j}\Big)V(S'\cup\{i\})\\
-        & + (1-\lambda_i-\varepsilon)\Big(\lambda_j
-        -\varepsilon\frac{c_i}{c_j}\Big)V(S'\cup\{j\})\\
-        & + (1-\lambda_i-\varepsilon)\Big(1-\lambda_j
-        +\varepsilon\frac{c_i}{c_j}\Big)V(S')\Big]
-    \end{align*}
-    Thus, $F_\lambda$ is a degree 2 polynomial in $\varepsilon$ whose
-    leading coefficient is:
-    \begin{multline*}
-        \frac{c_i}{c_j}\mathbb{E}_{S'\sim
-        P_{\mathcal{N}\setminus\{i,j\}}^\lambda}\Big[
-        V(S'\cup\{i\})+V(S'\cup\{j\})\\
-        -V(S'\cup\{i,j\})-V(S')\Big]
-    \end{multline*}
-    which is non-negative by submodularity of $V$. Hence, the maximum of
-    $F_\lambda$ over the interval given in \eqref{eq:convex-interval} is
-    attained at one of its endpoints, at which either the $i$-th or the
-    $j$-th component of $\lambda_\varepsilon$ becomes integral.
-\end{proof}
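The rounding procedure in this proof is constructive. A direct sketch (exhaustive enumeration of $F$, so tiny $n$ only; reusing `value` and `np` from the previous sketch):

    from itertools import combinations

    def multilinear_F(lam, X, mu):
        """Multilinear extension by exhaustive enumeration (tiny n only)."""
        n = len(lam)
        total = 0.0
        for r in range(n + 1):
            for S in combinations(range(n), r):
                p = 1.0
                for i in range(n):
                    p *= lam[i] if i in S else 1.0 - lam[i]
                total += p * value(S, X, mu)
        return total

    def round_once(lam, c, X, mu):
        """One rounding step: move two fractional coordinates along
        e_i - (c_i/c_j) e_j (cost-preserving) to the better endpoint;
        convexity of F along this line guarantees no loss."""
        frac = [k for k in range(len(lam)) if 0.0 < lam[k] < 1.0]
        i, j = frac[0], frac[1]
        lo = max(-lam[i], (lam[j] - 1.0) * c[j] / c[i])
        hi = min(1.0 - lam[i], lam[j] * c[j] / c[i])
        endpoints = []
        for eps in (lo, hi):
            cand = np.array(lam, dtype=float)
            cand[i] += eps
            cand[j] -= eps * c[i] / c[j]
            endpoints.append(cand)
        return max(endpoints, key=lambda v: multilinear_F(v, X, mu))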
-
-\begin{lemma}\label{lemma:relaxation-ratio}
-    The following inequality holds:
-    \begin{displaymath}
-        \forall\lambda\in[0,1]^n,\;
-        \frac{\log\big(1+\mu\big)}{2\mu}
-        \,L_\mathcal{N}(\lambda)\leq
-        F_\mathcal{N}(\lambda)\leq L_{\mathcal{N}}(\lambda)
-    \end{displaymath}
-\end{lemma}
-
-\begin{proof}
-    The upper bound follows from Jensen's inequality and the concavity of
-    $\log\det$: $F_\mathcal{N}(\lambda)
-    = \mathbb{E}\big[\tfrac{1}{2}\log\det A(S)\big]
-    \leq \tfrac{1}{2}\log\det\mathbb{E}\big[A(S)\big]
-    = L_\mathcal{N}(\lambda)$.
-
-    For the lower bound, we will prove that:
-    \begin{displaymath}
-        \frac{\log\big(1+\mu\big)}{2\mu}
-    \end{displaymath}
-    is a lower bound on the ratio $\partial_i
-    F_\mathcal{N}(\lambda)/\partial_i L_\mathcal{N}(\lambda)$. This is
-    enough to conclude, by observing that:
-    \begin{displaymath}
-        \frac{F_\mathcal{N}(\lambda)}{L_\mathcal{N}(\lambda)}
-        \sim_{\lambda\rightarrow 0}
-        \frac{\sum_{i\in \mathcal{N}}\lambda_i\partial_i F_\mathcal{N}(0)}
-        {\sum_{i\in\mathcal{N}}\lambda_i\partial_i L_\mathcal{N}(0)}
-    \end{displaymath}
-    and that an interior critical point of the ratio
-    $F_\mathcal{N}(\lambda)/L_\mathcal{N}(\lambda)$ satisfies:
-    \begin{displaymath}
-        \frac{F_\mathcal{N}(\lambda)}{L_\mathcal{N}(\lambda)}
-        = \frac{\partial_i F_\mathcal{N}(\lambda)}{\partial_i
-        L_\mathcal{N}(\lambda)}
-    \end{displaymath}
-
-    Let us start by computing the derivatives of $F_\mathcal{N}$ and
-    $L_\mathcal{N}$ with respect to the $i$-th component.
-
-    For $F_\mathcal{N}$, it suffices to differentiate
-    $P_\mathcal{N}^\lambda(S)$:
-    \begin{displaymath}
-        \partial_i P_\mathcal{N}^\lambda(S) = \left\{
-        \begin{aligned}
-            & P_{\mathcal{N}\setminus\{i\}}^\lambda(S\setminus\{i\})
-            && \textrm{if}\; i\in S \\
-            & - P_{\mathcal{N}\setminus\{i\}}^\lambda(S)
-            && \textrm{if}\; i\in \mathcal{N}\setminus S
-        \end{aligned}\right.
-    \end{displaymath}
-
-    Hence:
-    \begin{multline*}
-        \partial_i F_\mathcal{N} =
-        \sum_{\substack{S\subset\mathcal{N}\\ i\in S}}
-        P_{\mathcal{N}\setminus\{i\}}^\lambda(S\setminus\{i\})V(S)
-        - \sum_{\substack{S\subset\mathcal{N}\\ i\in \mathcal{N}\setminus S}}
-        P_{\mathcal{N}\setminus\{i\}}^\lambda(S)V(S)
-    \end{multline*}
-
-    Now, using that every $S$ such that $i\in S$ can be uniquely written as
-    $S'\cup\{i\}$ with $i\notin S'$, we can write:
-    \begin{displaymath}
-        \partial_i F_\mathcal{N} =
-        \sum_{\substack{S\subset\mathcal{N}\\ i\in\mathcal{N}\setminus S}}
-        P_{\mathcal{N}\setminus\{i\}}^\lambda(S)
-        \big(V(S\cup\{i\}) - V(S)\big)
-    \end{displaymath}
-
-    Finally, by using the expression for the marginal contribution of $i$
-    to $S$:
-    \begin{displaymath}
-        \partial_i F_\mathcal{N}(\lambda) = \frac{1}{2}
-        \sum_{\substack{S\subset\mathcal{N}\\ i\in\mathcal{N}\setminus S}}
-        P_{\mathcal{N}\setminus\{i\}}^\lambda(S)
-        \log\Big(1 + \mu x_i^*A(S)^{-1}x_i\Big)
-    \end{displaymath}
-
-    The computation of the derivative of $L_\mathcal{N}$ uses standard
-    matrix calculus and gives:
-    \begin{displaymath}
-        \partial_i L_\mathcal{N}(\lambda)
-        = \frac{\mu}{2} x_i^* \tilde{A}(\lambda)^{-1}x_i
-    \end{displaymath}
-
-    Using the following inequalities, valid for all
-    $S\subset\mathcal{N}\setminus\{i\}$:
-    \begin{gather*}
-        P_{\mathcal{N}\setminus\{i\}}^\lambda(S)\geq
-        P_\mathcal{N}^\lambda(S),\qquad
-        P_{\mathcal{N}\setminus\{i\}}^\lambda(S)
-        \geq P_\mathcal{N}^\lambda(S\cup\{i\}),\\
-        A(S)^{-1} \geq A(S\cup\{i\})^{-1}
-    \end{gather*}
-    we get:
-    \begin{align*}
-        \partial_i F_\mathcal{N}(\lambda)
-        & \geq \frac{1}{4}
-        \sum_{\substack{S\subset\mathcal{N}\\ i\in\mathcal{N}\setminus S}}
-        P_\mathcal{N}^\lambda(S)
-        \log\Big(1 + \mu x_i^*A(S)^{-1}x_i\Big)\\
-        &\hspace{-2.5em}+\frac{1}{4}
-        \sum_{\substack{S\subset\mathcal{N}\\ i\in\mathcal{N}\setminus S}}
-        P_\mathcal{N}^\lambda(S\cup\{i\})
-        \log\Big(1 + \mu x_i^*A(S\cup\{i\})^{-1}x_i\Big)\\
-        &= \frac{1}{4}
-        \sum_{S\subset\mathcal{N}}
-        P_\mathcal{N}^\lambda(S)
-        \log\Big(1 + \mu x_i^*A(S)^{-1}x_i\Big)
-    \end{align*}
-
-    Using that $A(S)\geq I_d$ and $\norm{x_i}\leq 1$, we get that:
-    \begin{displaymath}
-        \mu x_i^*A(S)^{-1}x_i \leq \mu
-    \end{displaymath}
-    Moreover, by concavity of the logarithm:
-    \begin{displaymath}
-        \forall t\in[0,\mu],\; \log(1+t)\geq
-        \frac{\log\big(1+\mu\big)}{\mu} t
-    \end{displaymath}
-    Hence:
-    \begin{displaymath}
-        \partial_i F_\mathcal{N}(\lambda) \geq
-        \frac{\log\big(1+\mu\big)}{4\mu}\,\mu\,
-        x_i^*\bigg(\sum_{S\subset\mathcal{N}}
-        P_\mathcal{N}^\lambda(S)A(S)^{-1}\bigg)x_i
-    \end{displaymath}
-
-    Finally, using that the matrix inverse is matrix convex over symmetric
-    positive definite matrices:
-    \begin{align*}
-        \partial_i F_\mathcal{N}(\lambda) &\geq
-        \frac{\log\big(1+\mu\big)}{4\mu}\,\mu\,
-        x_i^*\bigg(\sum_{S\subset\mathcal{N}}
-        P_\mathcal{N}^\lambda(S)A(S)\bigg)^{-1}x_i\\
-        & = \frac{\log\big(1+\mu\big)}{2\mu}\,
-        \partial_i L_\mathcal{N}(\lambda)
-    \end{align*}
-\end{proof}
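Numerically, both bounds of this lemma can be checked on the running example (reusing `value`, `multilinear_F`, `X`, `mu` and `rng` from the sketches above):

    def concave_L(lam, X, mu):
        """L(lambda) = 1/2 log det(I_d + mu * sum_i lambda_i x_i x_i^*)."""
        d = X.shape[1]
        A_tilde = np.eye(d) + mu * (X.T * lam) @ X
        return 0.5 * np.linalg.slogdet(A_tilde)[1]

    lam = rng.uniform(size=X.shape[0])
    C_mu = np.log(1.0 + mu) / (2.0 * mu)
    F_val, L_val = multilinear_F(lam, X, mu), concave_L(lam, X, mu)
    assert C_mu * L_val <= F_val <= L_val + 1e-12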
-
-\begin{lemma}
-    Let us denote by $C_\mu$ the constant which appears in the bound of the
-    previous lemma, $C_\mu = \frac{\log(1+\mu)}{2\mu}$. Then:
-    \begin{displaymath}
-        OPT(L_\mathcal{N}, B) \leq \frac{1}{C_\mu}\big(2 OPT(V,\mathcal{N},B)
-        + \max_{i\in\mathcal{N}}V(i)\big)
-    \end{displaymath}
-\end{lemma}
-
-\begin{proof}
-    Let us consider a feasible point $\lambda^*\in[0,1]^n$ such that
-    $L_\mathcal{N}(\lambda^*) = OPT(L_\mathcal{N}, B)$. By applying
-    Lemma~\ref{lemma:relaxation-ratio} and Lemma~\ref{lemma:rounding} we get
-    a feasible point $\bar{\lambda}$ with at most one fractional component
-    such that:
-    \begin{equation}\label{eq:e1}
-        L_\mathcal{N}(\lambda^*) \leq \frac{1}{C_\mu}
-        F_\mathcal{N}(\bar{\lambda})
-    \end{equation}
-
-    Let $\lambda_i$ denote the fractional component of $\bar{\lambda}$ and
-    $S$ denote the set whose indicator vector is $\bar{\lambda} - \lambda_i
-    e_i$. Using the fact that $F_\mathcal{N}$ is linear with respect to the
-    $i$-th component and is a relaxation of the value function, we get:
-    \begin{displaymath}
-        F_\mathcal{N}(\bar{\lambda}) = (1-\lambda_i)V(S)
-        + \lambda_i V(S\cup\{i\})
-    \end{displaymath}
-
-    Using the submodularity of $V$, which gives $V(S\cup\{i\})\leq V(S)
-    + V(i)$:
-    \begin{displaymath}
-        F_\mathcal{N}(\bar{\lambda}) \leq 2 V(S) + V(i)
-    \end{displaymath}
-
-    Note that since $\bar{\lambda}$ is feasible, $S$ is also feasible and
-    $V(S)\leq OPT(V,\mathcal{N}, B)$. Hence:
-    \begin{equation}\label{eq:e2}
-        F_\mathcal{N}(\bar{\lambda}) \leq 2 OPT(V,\mathcal{N}, B)
-        + \max_{i\in\mathcal{N}} V(i)
-    \end{equation}
-
-    Putting \eqref{eq:e1} and \eqref{eq:e2} together gives the result.
-\end{proof}
-
-\subsection{The mechanism}
-
-\begin{algorithm}\label{mechanism}
-    \caption{Budget feasible mechanism for ridge regression}
-    \begin{algorithmic}[1]
-        \State $i^* \gets \argmax_{j\in\mathcal{N}}V(j)$
-        \State $x^* \gets \argmax_{x\in[0,1]^n}
-        \{L_{\mathcal{N}\setminus\{i^*\}}(x) \,|\, c(x)\leq B\}$
-        \Statex
-        \If{$L(x^*) < CV(i^*)$}
-            \State \textbf{return} $\{i^*\}$
-        \Else
-            \State $i \gets \argmax_{1\leq j\leq n}\frac{V(j)}{c_j}$
-            \State $S \gets \emptyset$
-            \While{$c_i\leq
-            \frac{B}{2}\frac{V(S\cup\{i\})-V(S)}{V(S\cup\{i\})}$}
-                \State $S \gets S\cup\{i\}$
-                \State $i \gets \argmax_{j\in\mathcal{N}\setminus S}
-                \frac{V(S\cup\{j\})-V(S)}{c_j}$
-            \EndWhile
-            \State \textbf{return} $S$
-        \EndIf
-    \end{algorithmic}
-\end{algorithm}
-
-In the mechanism and in the analysis below, $L$ stands for
-$L_{\mathcal{N}\setminus\{i^*\}}$. The constant $C$ which appears in the
-mechanism is left unspecified for now: the analysis holds regardless of its
-value, and the optimal choice will be specified in the theorem.
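A sketch of the greedy branch of the mechanism, again reusing the `value` helper from above (the other branch requires maximizing the concave function $L_{\mathcal{N}\setminus\{i^*\}}$, e.g. with an off-the-shelf convex solver, and is omitted here):

    def greedy_branch(X, c, B, mu):
        """Greedy selection with the proportional stopping rule
        c_i <= (B/2) * (V(S + i) - V(S)) / V(S + i)."""
        S, remaining = [], list(range(X.shape[0]))
        i = max(remaining, key=lambda j: value([j], X, mu) / c[j])
        while True:
            with_i = value(S + [i], X, mu)
            if with_i <= 0 or c[i] > (B / 2) * (with_i - value(S, X, mu)) / with_i:
                return S
            S.append(i)
            remaining.remove(i)
            if not remaining:
                return S
            i = max(remaining,
                    key=lambda j: (value(S + [j], X, mu) - value(S, X, mu)) / c[j])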
-
-\begin{lemma}
-The mechanism is monotone.
-\end{lemma}
-
-\begin{proof}
-    Assume by contradiction that there exists a user $i$ that has been
-    selected by the mechanism and that would not be selected had he reported
-    a cost $c_i'\leq c_i$ (all the other costs staying the same).
-
-    If $i\neq i^*$ and $i$ has been selected, then we are in the case where
-    $L(x^*) \geq C V(i^*)$ and $i$ was included in the result set by the
-    greedy part of the mechanism. By reporting a cost $c_i'\leq c_i$, using
-    the submodularity of $V$, we see that $i$ will satisfy the greedy
-    selection rule:
-    \begin{displaymath}
-        i = \argmax_{j\in\mathcal{N}\setminus S} \frac{V(S\cup\{j\})
-        - V(S)}{c_j}
-    \end{displaymath}
-    in an earlier (or the same) iteration of the greedy heuristic. Let us
-    denote by $S_i$ (resp. $S_i'$) the set to which $i$ is added when
-    reporting cost $c_i$ (resp. $c_i'$). We have $S_i'\subset S_i$. Moreover:
-    \begin{align*}
-        c_i' & \leq c_i \leq
-        \frac{B}{2}\frac{V(S_i\cup\{i\})-V(S_i)}{V(S_i\cup\{i\})}\\
-        & \leq \frac{B}{2}\frac{V(S_i'\cup\{i\})-V(S_i')}{V(S_i'\cup\{i\})}
-    \end{align*}
-    Hence $i$ will still be included in the result set.
-
-    If $i = i^*$, then $i$ is included iff $L(x^*) < C V(i^*)$. Reporting
-    $c_i'$ instead of $c_i$ changes neither $V(i^*)$ nor $L(x^*)$ (which is
-    computed over $\mathcal{N}\setminus\{i^*\}$). Thus $i$ is still included
-    when reporting a different cost.
-\end{proof}
-
-\begin{lemma}
-The mechanism is budget feasible.
-\end{lemma}
-
-\begin{lemma}
-    Let us denote by $S_M$ the set returned by the mechanism. Let us also
-    write:
-    \begin{displaymath}
-        C_{\textrm{max}} = \max\left(1+C,\frac{e}{e-1}\left( 3
-        + \frac{12e}{C\cdot C_\mu(e-1) -5e +1}\right)\right)
-    \end{displaymath}
-    Then:
-    \begin{displaymath}
-        OPT(V, \mathcal{N}, B) \leq C_\text{max}\cdot V(S_M)
-    \end{displaymath}
-\end{lemma}
-
-\begin{proof}
-    If the condition on line 3 of the algorithm holds, then:
-    \begin{displaymath}
-        V(i^*) \geq \frac{1}{C}L(x^*) \geq
-        \frac{1}{C}OPT(V,\mathcal{N}\setminus\{i^*\}, B)
-    \end{displaymath}
-    But, by submodularity:
-    \begin{displaymath}
-        OPT(V,\mathcal{N},B) \leq OPT(V,\mathcal{N}\setminus\{i^*\}, B)
-        + V(i^*)
-    \end{displaymath}
-    Hence:
-    \begin{displaymath}
-        V(i^*) \geq \frac{1}{C+1} OPT(V,\mathcal{N}, B)
-    \end{displaymath}
-
-    If the condition of the algorithm does not hold:
-    \begin{align*}
-        V(i^*) & \leq \frac{1}{C}L(x^*) \leq \frac{1}{C\cdot C_\mu}
-        \big(2 OPT(V,\mathcal{N}, B) + V(i^*)\big)\\
-        & \leq \frac{1}{C\cdot C_\mu}\left(\frac{2e}{e-1}\big(3 V(S_M)
-        + 2 V(i^*)\big) + V(i^*)\right)
-    \end{align*}
-    Thus:
-    \begin{displaymath}
-        V(i^*) \leq \frac{6e}{C\cdot C_\mu(e-1)- 5e + 1} V(S_M)
-    \end{displaymath}
-
-    Finally, using again that:
-    \begin{displaymath}
-        OPT(V,\mathcal{N},B) \leq \frac{e}{e-1}\big(3 V(S_M) + 2 V(i^*)\big)
-    \end{displaymath}
-    we get:
-    \begin{displaymath}
-        OPT(V, \mathcal{N}, B) \leq \frac{e}{e-1}\left( 3 + \frac{12e}{C\cdot
-        C_\mu(e-1) -5e +1}\right) V(S_M)
-    \end{displaymath}
-\end{proof}
-
-The optimal value for $C$ is $C^* = \arg\min_C C_{\textrm{max}}$, obtained
-by equating the two terms in the maximum. The resulting quadratic equation
-has two solutions; only one of them satisfies:
-\begin{displaymath}
-    C\cdot C_\mu(e-1) -5e +1 \geq 0
-\end{displaymath}
-which is needed in the proof of the previous lemma. Computing this solution,
-we can state the main result of this section.
-
-\begin{theorem}
-    The mechanism is individually rational, truthful and has an
-    approximation ratio of:
-    \begin{multline*}
-        1 + C^* = \frac{5e-1 + C_\mu(4e-1)}{2C_\mu(e-1)}\\
-        + \frac{\sqrt{C_\mu^2(1+2e)^2
-        + 2C_\mu(14e^2+5e+1) + (1-5e)^2}}{2C_\mu(e-1)}
-    \end{multline*}
-\end{theorem}
-\end{document}
diff --git a/stoc_paper.pdf b/stoc_paper.pdf
new file mode 100644
index 0000000..9ae80aa
--- /dev/null
+++ b/stoc_paper.pdf
Binary files differ
diff --git a/stoc_paper.tex b/stoc_paper.tex
new file mode 100644
index 0000000..cbb13af
--- /dev/null
+++ b/stoc_paper.tex
@@ -0,0 +1,49 @@
+\documentclass{acm_proc_article-sp}
+\usepackage[utf8]{inputenc}
+\usepackage{amsmath,amsfonts}
+\usepackage{algorithm}
+\usepackage{algpseudocode,bbm,color,verbatim}
+\input{definitions}
+\title{Budget Feasible Mechanisms for Experimental Design}
+
+%Remove permission block empty space
+\makeatletter
+\let\@copyrightspace\relax
+\makeatother
+
+\numberofauthors{3}
+\author{
+\alignauthor Thibaut Horel\\ %\titlenote{\texttt{thibaut.horel@ens.fr}}\\
+    \affaddr{Technicolor}\\
+    \email{thibaut.horel@ens.fr}
+\alignauthor Stratis Ioannidis\\ % \titlenote{\texttt{stratis.ioannidis@technicolor.com}}\\
+    \affaddr{Technicolor}\\
+    \email{stratis.ioannidis\\@technicolor.com}
+\alignauthor S. Muthukrishnan\\% \titlenote{\texttt{muthu@cs.rutgers.edu}}\\
+    \affaddr{Rutgers University, Microsoft Research}\\
+    \email{muthu@cs.rutgers.edu}
+}
+\begin{document}
+
+\maketitle
+\begin{abstract}
+\input{abstract}
+\end{abstract}
+\section{Introduction}
+\input{intro}
+
+\section{Preliminaries}\label{sec:peel}
+\input{problem}
+\section{Mechanism for \EDP{}}
+\input{main}
+\section{Extension to Other Problems}\label{sec:ext}
+\input{general}
+%\section{Conclusion}
+%\input{conclusion}
+\begin{small}
+\bibliographystyle{abbrv}
+\bibliography{notes}
+\end{small}
+\appendix
+\input{appendix}
+\end{document}