diff options
Diffstat (limited to 'notes2.tex')
| -rw-r--r-- | notes2.tex | 162 |
1 files changed, 0 insertions, 162 deletions
diff --git a/notes2.tex b/notes2.tex deleted file mode 100644 index 92eabb5..0000000 --- a/notes2.tex +++ /dev/null @@ -1,162 +0,0 @@ -\documentclass{article} -\usepackage[utf8]{inputenc} -\usepackage{amsmath,amsthm,amsfonts} -\usepackage{comment} -\newtheorem{lemma}{Lemma} -\newtheorem{fact}{Fact} -\newtheorem{example}{Example} -\newtheorem{prop}{Proposition} -\newcommand{\var}{\mathop{\mathrm{Var}}} -\newcommand{\condexp}[2]{\mathop{\mathbb{E}}\left[#1|#2\right]} -\newcommand{\expt}[1]{\mathop{\mathbb{E}}\left[#1\right]} -\newcommand{\norm}[1]{\lVert#1\rVert} -\newcommand{\tr}[1]{#1^*} -\newcommand{\ip}[2]{\langle #1, #2 \rangle} -\newcommand{\mse}{\mathop{\mathrm{MSE}}} -\newcommand{\trace}{\mathop{\mathrm{tr}}} -\begin{document} - -\section{Problem} - -\begin{itemize} -\item database $D = (X_i)_{1\leq i\leq n}$ -\item $(X_i)_{1\leq i\leq n}\in\big(\mathbf{R^d}\big)^n$ sampled in an i.i.d - fashion from the probability distribution $\mu$ -\end{itemize} - -We assume that there is a common variable $P$ which is the underlying -phenomenon that we are interesting in, and all the variables in the database, -are noisy realisations of this phenomenon: -\begin{displaymath} - X_i = P + N_i -\end{displaymath} -where the $N_i$ are pairwise independent and also independent from $H$. - - -We are interested in defining the $\emph{value}$ of a subset $S$ of the -database $D$. The approaches to such a definition can be classified in two main -families: -\begin{itemize} - \item \emph{Information theoretic:} We use the entropy of $P$ as a measure - of the uncertainty. Then the value of a subset $S$ is the decrease of - entropy induced by revealing it: - \begin{displaymath} - V(S) = H(P) - H(P\,|\,S) - \end{displaymath} - \item \emph{Statistical:} in this case you are trying to estimate the - phenomenon. The variance of the estimation is used as a measure of - the uncertainty. The value of a set $S$ can be defined as the average - reduction of the variance over all the points in the database once you - observe $S$: - \begin{displaymath} - V(S) = \frac{1}{|D|}\sum_{x\in D} \sigma_x^2 - \sigma_{x|A}^2 - \end{displaymath} -\end{itemize} - -Once you have defined a notion of value, there are several problems you can -address: -\begin{enumerate} - \item \emph{budget auctions:} the points in the database are associated - with users. Each user $i$ has a cost $c_i$ for revealing his data. You - are given a budget $B$ and you want to select a subset $S$ of users and - a payment scheme which is truthful. The value of the subset - $S$ should be as big as possible with the constraint that the payments - should stay within budget. - \item \emph{value sharing:} the goal is to find a value sharing method to - split the whole value of a subset $S$ among its participants. - \item \emph{k-best auctions:} this one is very similar to \emph{budget - auctions}, but instead of having a budget constraint, the constraint is - on the number of people in the subset $S$. -\end{enumerate} - -These problems have already been studied, and optimal or near-optimal solution -are already known when the value function is submodular. Fortunately, it can be -shown that the information theoretic value defined above is submodular. - -\section{Submodularity} - -In this section, we will consider that we are given a \emph{universe} set $U$. -A set function $f$ is a function defined on the power set of $U$, $\mathfrak{P}(U)$. - -A set function $f$ defined on $\mathfrak{P}(U)$ will be said \emph{increasing} if -it is increasing with regards to inclusion, that is: - -\begin{displaymath} - \forall\,S\subseteq T,\quad f(S)\leq f(T) -\end{displaymath} - -A \emph{decreasing} function on $\mathfrak{P}(U)$ is defined similarly. - -A set function $f$ defined on $\mathfrak{P}(U)$ is said to be -\emph{submodular} if it verifies the diminishing returns property, that is, -the marginal increments when adding a point to a set, is a set decreasing -function. More formally, for any point $x$ in $U$, we can define the marginal -increment of $f$ regarding $x$, it is the set function defined as: -\begin{displaymath} - \Delta_x f(S) = f(S\cup\{x\}) - f(S) -\end{displaymath} -Then, $f$ is \emph{submodular} iff. for all $x$ in $U$, $\Delta_x f$ is a set -decreasing function. - -Similarly, a \emph{supermodular} is a function whose marginal increments are set -increasing functions. -\begin{prop} - Let $R:\mathbf{R}\rightarrow \mathbf{R}$ be a decreasing concave function and - $f:\mathfrak{P}(U)\rightarrow\mathbf{R}$ be a decreasing submodular function, - then the composed function $R\circ f$ is increasing and supermodular. -\end{prop} - -\begin{proof} - The increasingness of $R\circ f$ follows immediately from the decreasingness - of $R$ and $f$. - - For the supermodularity, let $S$ and $T$ be two sets such that $S\subseteq - T$. By decreasingness of $f$, we have: - \begin{displaymath} - \forall\,V,\quad f(T)\leq f(S)\quad\mathrm{and}\quad f(T\cup V)\leq f(S\cup V) - \end{displaymath} - - Thus, by concavity of $R$: - \begin{displaymath}\label{eq:base} - \forall\,V,\quad\frac{R\big(f(S)\big)-R\big(f(S\cup V)\big)}{f(S)-f(S\cup V)} - \leq\frac{R\big(f(T)\big)-R\big(f(T\cup V)\big)}{f(T)-f(T\cup V)} - \end{displaymath} - - $f$ is decreasing, so multiplying this last inequality by - $f(S)-f(S\cup V)$ and $f(T)-f(T\cup V)$ yields: - \begin{multline} - \forall V,\quad\Big(R\big(f(S)\big)-R\big(f(S\cup V)\big)\Big)\big(f(T)-f(T\cup V)\big)\\ - \leq \Big(R\big(f(T)\big)-R\big(f(T\cup V)\big)\Big)\big(f(S)-f(S\cup V)\big) - \end{multline} - - $f$ is submodular, so: - \begin{displaymath} - f(T\cup V)-f(T)\leq f(S\cup V) - f(S) - \end{displaymath} - - $R\circ f$ is increasing, so: - \begin{displaymath} - R\big(f(S)\big)-R\big(f(S\cup V)\big)\leq 0 - \end{displaymath} - - By combining the two previous inequalities, we get: - \begin{multline*} - \forall V,\quad\Big(R\big(f(S)\big)-R\big(f(S\cup V)\big)\Big)\big(f(S)-f(S\cup V)\big)\\ - \leq \Big(R\big(f(S)\big)-R\big(f(S\cup V)\big)\Big)\big(f(T)-f(T\cup V)\big) - \end{multline*} - - Injecting this last inequality into \eqref{eq:base} gives: - \begin{multline*} - \forall V,\quad\Big(R\big(f(S)\big)-R\big(f(S\cup V)\big)\Big)\big(f(S)-f(S\cup V)\big)\\ - \leq \Big(R\big(f(T)\big)-R\big(f(T\cup V)\big)\Big)\big(f(S)-f(S\cup V)\big) - \end{multline*} - - Dividing left and right by $f(S)-f(S\cup V)$ yields: - \begin{displaymath} - \forall V,\quad R\big(f(S)\big)-R\big(f(S\cup V)\big) - \leq R\big(f(T)\big)-R\big(f(T\cup V)\big) - \end{displaymath} - which is exactly the supermodularity of $R\circ f$. -\end{proof} - -\end{document} |
