summaryrefslogtreecommitdiffstats
diff options
context:
space:
mode:
authorThibaut Horel <thibaut.horel@gmail.com>2012-10-28 14:14:11 -0700
committerThibaut Horel <thibaut.horel@gmail.com>2012-10-28 14:14:11 -0700
commitcc28557b334d2bc53e9c8243409550ea1c9f368f (patch)
tree2c0a153fdbde38cc7a261b493046448eb4791e83
parentcac5ada8f9c9de77944d061476f704fbe112d924 (diff)
downloadrecommendation-cc28557b334d2bc53e9c8243409550ea1c9f368f.tar.gz
Work on the main section (section 3). Incorporation of the proof from proof.tex
-rw-r--r--paper.tex524
1 files changed, 514 insertions, 10 deletions
diff --git a/paper.tex b/paper.tex
index 1ba804c..1484c3d 100644
--- a/paper.tex
+++ b/paper.tex
@@ -186,7 +186,7 @@ Combining \eqref{eq:h1} and \eqref{eq:h2} we get:
Finally, we can use Sylvester's determinant theorem to get the result.
\end{proof}
-It is also interesting to the marginal contribution of a user to a set: the
+It is also interesting to look at the marginal contribution of a user to a set: the
increase of value induced by adding a user to an already existing set of users.
We have the following lemma.
@@ -216,6 +216,10 @@ that matrix inversion is decreasing, we see that the marginal contribution of
a fixed user is a set decreasing function. This is the \emph{submodularity} of
the value function.
+TODO? Explain what are the points which are the most valuable : points which
+are aligned along directions where the spread over the already existing points
+is small.
+
\subsection{Auction}
TODO Explain the optimization problem, why it has to be formulated as an auction
@@ -232,16 +236,50 @@ should we introduce the notion of submodularity?
\section{Main result}
-TODO Explain:
+All the budget feasible mechanisms studied recently (TODO ref Singer Chen)
+rely on using a greedy heuristic extending the idea of the greedy heuristic for
+the knapsack problem. In knapsack, objects are selected based on their
+\emph{value-per-cost} ratio. The quantity which plays a similar role for
+general submodular functions is the \emph{marginal-contribution-per-cost}
+ratio: let us assume that you have already selected a set of points $S$, then
+the \emph{marginal-contribution-per-cost} ratio per cost of a new point $i$ is
+defined by:
+\begin{displaymath}
+ \frac{V(S\cup\{i\}) - V(S)}{c_i}
+\end{displaymath}
+
+The greedy heuristic then simply repeatedly selects the point whose
+marginal-contribution-per-cost ratio is the highest until it reaches the budget
+limit. Mechanism considerations aside, this is known to have an unbounded
+approximation ratio. However, lemma TODO ref in Chen or Singer shows that the
+maximum between the greedy heuristic and the point with maximum value (as
+a singleton set) provides a $\frac{5e}{e-1}$ approximation ratio.
+
+Unfortunately, TODO Singer 2011 points out that taking the maximum between the
+greedy heuristic and the most valuable point is not incentive compatible.
+Singer and Chen tackle this issue similarly: instead of comparing the most
+valuable point to the greedy solution, they compare it to a solution which is
+close enough to keep a constant approximation ratio:
\begin{itemize}
- \item the mechanism uses the greedy heuristic
- \item we know that the maximum of greedy and meatiest guy is a good
- approximation, but not incentive compatible
- \item instead compare the value of the meatiest guy to $L(x^*)$ (introduce
- $L(x^*)$ which can be easily computed and is not too far from the greedy
- value
+ \item Chen suggests using $OPT(V,\mathcal{N}\setminus\{i\}, B)$.
+ Unfortunately, in the general case, this cannot be computed exactly in
+ polynomial time.
+ \item Singer uses using the optimal value of a relaxed objective function
+ which can be proven to be close to the optimal of the original
+ objective function. The function used is tailored to the specific
+ problem of coverage.
\end{itemize}
+Here, we use a relaxation of the objective function which is tailored to the
+problem of ridge regression. We define:
+\begin{displaymath}
+ \forall\lambda\in[0,1]^{|\mathcal{N}|}\,\quad L_{\mathcal{N}}(\lambda) \defeq
+ \log\det\left(I_d + \mu\sum_{i\in\mathcal{N}} \lambda_i x_i x_i^*\right)
+\end{displaymath}
+
+We can now present the mechanism we use, which has a same flavor to Chen's and
+Singer's
+
\begin{algorithm}\label{mechanism}
\caption{Mechanism for ridge regression}
\begin{algorithmic}[1]
@@ -264,9 +302,16 @@ TODO Explain:
\end{algorithmic}
\end{algorithm}
+Notice, that the stopping condition in the while loop is more sophisticated
+than just ensuring that the sum of the costs does not exceed the budget. This
+is because the selected users will be payed more than their costs, and this
+stopping condition ensures budget feasibility when the users are paid their
+threshold payment.
+
+We can now state the main result of this section:
\begin{theorem}
- The mechanism is truthful, individually rational, budget feasible.
- Furthermore, choosing:
+ The mechanism in \ref{mechanism} is truthful, individually rational, budget
+ feasible. Furthermore, choosing:
\begin{multline*}
C = C^* = \frac{5e-1 + C_\mu(2e+1)}{2C_\mu(e-1)}\\
+ \frac{\sqrt{C_\mu^2(1+2e)^2
@@ -278,8 +323,467 @@ TODO Explain:
+ \frac{\sqrt{C_\mu^2(1+2e)^2
+ 2C_\mu(14e^2+5e+1) + (1-5e)^2}}{2C_\mu(e-1)}
\end{multline*}
+ where:
+ \begin{displaymath}
+ C_\mu = \frac{\log(1+\mu)}{2\mu}
+ \end{displaymath}
\end{theorem}
+The proof will consist of the claims of the theorem broken down into lemmas.
+
+Because the user strategy is parametrized by a single parameter, truthfulness
+is equivalent to monotonicity (TODO ref). The proof here is the same as in TODO
+and is given for the sake of completeness.
+
+\begin{lemma}
+The mechanism is monotone.
+\end{lemma}
+
+\begin{proof}
+ We assume by contradiction that there exists a user $i$ that has been
+ selected by the mechanism and that would not be selected had he reported
+ a cost $c_i'\leq c_i$ (all the other costs staying the same).
+
+ If $i\neq i^*$ and $i$ has been selected, then we are in the case where
+ $L(x^*) \geq C V(i^*)$ and $i$ was included in the result set by the greedy
+ part of the mechanism. By reporting a cost $c_i'\leq c_i$, using the
+ submodularity of $V$, we see that $i$ will satisfy the greedy selection
+ rule:
+ \begin{displaymath}
+ i = \argmax_{j\in\mathcal{N}\setminus S} \frac{V(S\cup\{j\})
+ - V(S)}{c_j}
+ \end{displaymath}
+ in an earlier iteration of the greedy heuristic. Let us denote by $S_i$
+ (resp. $S_i'$) the set to which $i$ is added when reporting cost $c_i$
+ (resp. $c_i'$). We have $S_i'\subset S_i$. Moreover:
+ \begin{align*}
+ c_i' & \leq c_i \leq
+ \frac{B}{2}\frac{V(S_i\cup\{i\})-V(S_i)}{V(S_i\cup\{i\})}\\
+ & \leq \frac{B}{2}\frac{V(S_i'\cup\{i\})-V(S_i')}{V(S_i'\cup\{i\})}
+ \end{align*}
+ Hence $i$ will still be included in the result set.
+
+ If $i = i^*$, $i$ is included iff $L(x^*) \leq C V(i^*)$. Reporting $c_i'$
+ instead of $c_i$ does not change the value $V(i^*)$ nor $L(x^*)$ (which is
+ computed over $\mathcal{N}\setminus\{i^*\}$). Thus $i$ is still included by
+ reporting a different cost.
+\end{proof}
+
+\begin{lemma}
+The mechanism is budget feasible.
+\end{lemma}
+
+The proof is the same as in Chen and is given here for the sake of
+completeness.
+\begin{proof}
+
+\end{proof}
+
+Using the characterization of the threshold payments from Singer TODO, we can
+prove individual rationality similarly to Chen TODO.
+\begin{lemma}
+The mechanism is individually rational
+\end{lemma}
+
+\begin{proof}
+
+\end{proof}
+
+The following lemma requires to have a careful look at the relaxation function
+we chose in the mechanism. The next section will be dedicated to studying this
+relaxation and will contain the proof of the following lemma:
+\begin{lemma}
+ We have:
+ \begin{displaymath}
+ OPT(L_\mathcal{N}, B) \leq \frac{1}{C_\mu}\big(2 OPT(V,\mathcal{N},B)
+ + \max_{i\in\mathcal{N}}V(i)\big)
+ \end{displaymath}
+\end{lemma}
+
+\begin{lemma}
+ Let us denote by $S_M$ the set returned by the mechanism. Let us also
+ write:
+ \begin{displaymath}
+ C_{\textrm{max}} = \max\left(1+C,\frac{e}{e-1}\left( 3 + \frac{12e}{C\cdot
+ C_\mu(e-1) -5e +1}\right)\right)
+ \end{displaymath}
+
+ Then:
+ \begin{displaymath}
+ OPT(V, \mathcal{N}, B) \leq
+ C_\text{max}\cdot V(S_M)
+ \end{displaymath}
+\end{lemma}
+
+\begin{proof}
+
+ If the condition on line 3 of the algorithm holds, then:
+ \begin{displaymath}
+ V(i^*) \geq \frac{1}{C}L(x^*) \geq
+ \frac{1}{C}OPT(V,\mathcal{N}\setminus\{i\}, B)
+ \end{displaymath}
+
+ But:
+ \begin{displaymath}
+ OPT(V,\mathcal{N},B) \leq OPT(V,\mathcal{N}\setminus\{i\}, B) + V(i^*)
+ \end{displaymath}
+
+ Hence:
+ \begin{displaymath}
+ V(i^*) \geq \frac{1}{C+1} OPT(V,\mathcal{N}, B)
+ \end{displaymath}
+
+ If the condition of the algorithm does not hold:
+ \begin{align*}
+ V(i^*) & \leq \frac{1}{C}L(x^*) \leq \frac{1}{C\cdot C_\mu}
+ \big(2 OPT(V,\mathcal{N}, B) + V(i^*)\big)\\
+ & \leq \frac{1}{C\cdot C_\mu}\left(\frac{2e}{e-1}\big(3 V(S_M)
+ + 2 V(i^*)\big)
+ + V(i^*)\right)
+ \end{align*}
+
+ Thus:
+ \begin{align*}
+ V(i^*) \leq \frac{6e}{C\cdot C_\mu(e-1)- 5e + 1} V(S_M)
+ \end{align*}
+
+ Finally, using again that:
+ \begin{displaymath}
+ OPT(V,\mathcal{N},B) \leq \frac{e}{e-1}\big(3 V(S_M) + 2 V(i^*)\big)
+ \end{displaymath}
+
+ We get:
+ \begin{displaymath}
+ OPT(V, \mathcal{N}, B) \leq \frac{e}{e-1}\left( 3 + \frac{12e}{C\cdot
+ C_\mu(e-1) -5e +1}\right) V(S_M)
+ \end{displaymath}
+\end{proof}
+
+The optimal value for $C$ is:
+\begin{displaymath}
+ C^* = \arg\min C_{\textrm{max}}
+\end{displaymath}
+
+This equation has two solutions. Only one of those is such that:
+\begin{displaymath}
+ C\cdot C_\mu(e-1) -5e +1 \geq 0
+\end{displaymath}
+which is needed in the proof of the previous lemma. Computing this solution,
+we can state the main result of this section.
+
+\subsection{Relaxations of the value function}
+
+To prove lemma TODO, we will use a general method called pipage rounding,
+introduced in TODO. Two consecutive relaxations are used: the one that we are
+interested in, whose optimization can be computed efficiently, and the
+multilinear extension which presents a \emph{cross-convexity} like behavior
+which allows for rounding of fractional solution without decreasing the value
+of the objective function and thus ensures a constant approximation of the
+value function. The difficulty resides in showing that the ratio of the two
+relaxations is bounded.
+
+We say that $R_\mathcal{N}:[0,1]^n\rightarrow\mathbf{R}$ is a relaxation of the
+value function $V$ over $\mathcal{N}$ if it coincides with $V$ at binary
+points. Formally, for any $S\subset\mathcal{N}$, let $\mathbf{1}_S$ denote the
+indicator vector of $S$. $R_\mathcal{N}$ is a relaxation of $V$ over
+$\mathcal{N}$ iff:
+\begin{displaymath}
+ \forall S\subset\mathcal{N},\; R_\mathcal{N}(\mathbf{1}_S) = V(S)
+\end{displaymath}
+
+We can extend the optimisation problem defined above to a relaxation by
+extending the cost function:
+\begin{displaymath}
+ \forall \lambda\in[0,1]^n,\; c(\lambda)
+ = \sum_{i\in\mathcal{N}}\lambda_ic_i
+\end{displaymath}
+The optimisation problem becomes:
+\begin{displaymath}
+ OPT(R_\mathcal{N}, B) =
+ \max_{\lambda\in[0,1]^n}\left\{R_\mathcal{N}(\lambda)\,|\, c(\lambda)\leq B\right\}
+\end{displaymath}
+The relaxations we will consider here rely on defining a probability
+distribution over subsets of $\mathcal{N}$.
+
+Let $\lambda\in[0,1]^n$, let us define:
+\begin{displaymath}
+ P_\mathcal{N}^\lambda(S) = \prod_{i\in S}\lambda_i
+ \prod_{i\in\mathcal{N}\setminus S}(1-\lambda_i)
+\end{displaymath}
+$P_\mathcal{N}^\lambda(S)$ is the probability of picking the set $S$ if we select
+a subset of $\mathcal{N}$ at random by deciding independently for each point to
+include it in the set with probability $\lambda_i$ (and to exclude it with
+probability $1-\lambda_i$).
+
+We will consider two relaxations of the value function $V$ over $\mathcal{N}$:
+\begin{itemize}
+ \item the \emph{multi-linear extension} of $V$:
+ \begin{align*}
+ F_\mathcal{N}(\lambda)
+ & = \mathbb{E}_{S\sim P_\mathcal{N}^\lambda}\big[\log\det A(S)\big]\\
+ & = \sum_{S\subset\mathcal{N}} P_\mathcal{N}^\lambda(S) V(S)\\
+ & = \sum_{S\subset\mathcal{N}} P_\mathcal{N}^\lambda(S) \log\det A(S)\\
+ \end{align*}
+ \item the \emph{concave relaxation} of $V$:
+ \begin{align*}
+ L_{\mathcal{N}}(\lambda)
+ & = \log\det \mathbb{E}_{S\sim P_\mathcal{N}^\lambda}\big[A(S)\big]\\
+ & = \log\det\left(\sum_{S\subset N}
+ P_\mathcal{N}^\lambda(S)A(S)\right)\\
+ & = \log\det\left(I_d + \mu\sum_{i\in\mathcal{N}}
+ \lambda_ix_ix_i^*\right)\\
+ & \defeq \log\det \tilde{A}(\lambda)
+ \end{align*}
+\end{itemize}
+
+\begin{lemma}
+ The \emph{concave relaxation} $L_\mathcal{N}$ is concave\footnote{Hence
+ this relaxation is well-named!}.
+\end{lemma}
+
+\begin{proof}
+ This follows from the concavity of the $\log\det$ function over symmetric
+ positive semi-definite matrices. More precisely, if $A$ and $B$ are two
+ symmetric positive semi-definite matrices, then:
+ \begin{multline*}
+ \forall\alpha\in [0, 1],\; \log\det\big(\alpha A + (1-\alpha) B\big)\\
+ \geq \alpha\log\det A + (1-\alpha)\log\det B
+ \end{multline*}
+\end{proof}
+
+\begin{lemma}[Rounding]\label{lemma:rounding}
+ For any feasible $\lambda\in[0,1]^n$, there exists a feasible
+ $\bar{\lambda}\in[0,1]^n$ such that at most one of its component is
+ fractional, that is, lies in $(0,1)$ and:
+ \begin{displaymath}
+ F_{\mathcal{N}}(\lambda)\leq F_{\mathcal{N}}(\bar{\lambda})
+ \end{displaymath}
+\end{lemma}
+
+\begin{proof}
+ We give a rounding procedure which given a feasible $\lambda$ with at least
+ two fractional components, returns some $\lambda'$ with one less fractional
+ component, feasible such that:
+ \begin{displaymath}
+ F_\mathcal{N}(\lambda) \leq F_\mathcal{N}(\lambda')
+ \end{displaymath}
+ Applying this procedure recursively yields the lemma's result.
+
+ Let us consider such a feasible $\lambda$. Let $i$ and $j$ be two
+ fractional components of $\lambda$ and let us define the following
+ function:
+ \begin{displaymath}
+ F_\lambda(\varepsilon) = F(\lambda_\varepsilon)
+ \quad\textrm{where} \quad
+ \lambda_\varepsilon = \lambda + \varepsilon\left(e_i-\frac{c_i}{c_j}e_j\right)
+ \end{displaymath}
+
+ It is easy to see that if $\lambda$ is feasible, then:
+ \begin{multline}\label{eq:convex-interval}
+ \forall\varepsilon\in\Big[\max\Big(-\lambda_i,(\lambda_j-1)\frac{c_j}{c_i}\Big), \min\Big(1-\lambda_i, \lambda_j
+ \frac{c_j}{c_i}\Big)\Big],\;\\
+ \lambda_\varepsilon\;\;\textrm{is feasible}
+ \end{multline}
+
+ Furthermore, the function $F_\lambda$ is convex, indeed:
+ \begin{align*}
+ F_\lambda(\varepsilon)
+ & = \mathbb{E}_{S'\sim P_{\mathcal{N}\setminus\{i,j\}}^\lambda(S')}\Big[
+ (\lambda_i+\varepsilon)\Big(\lambda_j-\varepsilon\frac{c_i}{c_j}\Big)V(S'\cup\{i,j\})\\
+ & + (\lambda_i+\varepsilon)\Big(1-\lambda_j+\varepsilon\frac{c_i}{c_j}\Big)V(S'\cup\{i\})\\
+ & + (1-\lambda_i-\varepsilon)\Big(\lambda_j-\varepsilon\frac{c_i}{c_j}\Big)V(S'\cup\{j\})\\
+ & + (1-\lambda_i-\varepsilon)\Big(1-\lambda_j+\varepsilon\frac{c_i}{c_j}\Big)V(S')\Big]\\
+ \end{align*}
+ Thus, $F_\lambda$ is a degree 2 polynomial whose dominant coefficient is:
+ \begin{multline*}
+ \frac{c_i}{c_j}\mathbb{E}_{S'\sim
+ P_{\mathcal{N}\setminus\{i,j\}}^\lambda(S')}\Big[
+ V(S'\cup\{i\})+V(S'\cup\{i\})\\
+ -V(S'\cup\{i,j\})-V(S')\Big]
+ \end{multline*}
+ which is positive by submodularity of $V$. Hence, the maximum of
+ $F_\lambda$ over the interval given in \eqref{eq:convex-interval} is
+ attained at one of its limit, at which either the $i$-th or $j$-th component of
+ $\lambda_\varepsilon$ becomes integral.
+\end{proof}
+
+\begin{lemma}\label{lemma:relaxation-ratio}
+ The following inequality holds:
+ \begin{displaymath}
+ \forall\lambda\in[0,1]^n,\;
+ \frac{\log\big(1+\mu\big)}{2\mu}
+ \,L_\mathcal{N}(\lambda)\leq
+ F_\mathcal{N}(\lambda)\leq L_{\mathcal{N}}(\lambda)
+ \end{displaymath}
+\end{lemma}
+
+\begin{proof}
+
+ We will prove that:
+ \begin{displaymath}
+ \frac{\log\big(1+\mu\big)}{2\mu}
+ \end{displaymath}
+ is a lower bound of the ratio $\partial_i F_\mathcal{N}(\lambda)/\partial_i
+ L_\mathcal{N}(\lambda)$.
+
+ This will be enough to conclude, by observing that:
+ \begin{displaymath}
+ \frac{F_\mathcal{N}(\lambda)}{L_\mathcal{N}(\lambda)}
+ \sim_{\lambda\rightarrow 0}
+ \frac{\sum_{i\in \mathcal{N}}\lambda_i\partial_i F_\mathcal{N}(0)}
+ {\sum_{i\in\mathcal{N}}\lambda_i\partial_i L_\mathcal{N}(0)}
+ \end{displaymath}
+ and that an interior critical point of the ratio
+ $F_\mathcal{N}(\lambda)/L_\mathcal{N}(\lambda)$ is defined by:
+ \begin{displaymath}
+ \frac{F_\mathcal{N}(\lambda)}{L_\mathcal{N}(\lambda)}
+ = \frac{\partial_i F_\mathcal{N}(\lambda)}{\partial_i
+ L_\mathcal{N}(\lambda)}
+ \end{displaymath}
+
+ Let us start by computing the derivatives of $F_\mathcal{N}$ and
+ $L_\mathcal{N}$ with respect to
+ the $i$-th component.
+
+ For $F$, it suffices to look at the derivative of
+ $P_\mathcal{N}^\lambda(S)$:
+ \begin{displaymath}
+ \partial_i P_\mathcal{N}^\lambda(S) = \left\{
+ \begin{aligned}
+ & P_{\mathcal{N}\setminus\{i\}}^\lambda(S\setminus\{i\})\;\textrm{if}\; i\in S \\
+ & - P_{\mathcal{N}\setminus\{i\}}^\lambda(S)\;\textrm{if}\;
+ i\in \mathcal{N}\setminus S \\
+ \end{aligned}\right.
+ \end{displaymath}
+
+ Hence:
+ \begin{multline*}
+ \partial_i F_\mathcal{N} =
+ \sum_{\substack{S\subset\mathcal{N}\\ i\in S}}
+ P_{\mathcal{N}\setminus\{i\}}^\lambda(S\setminus\{i\})V(S)\\
+ - \sum_{\substack{S\subset\mathcal{N}\\ i\in \mathcal{N}\setminus S}}
+ P_{\mathcal{N}\setminus\{i\}}^\lambda(S)V(S)\\
+ \end{multline*}
+
+ Now, using that every $S$ such that $i\in S$ can be uniquely written as
+ $S'\cup\{i\}$, we can write:
+ \begin{multline*}
+ \partial_i F_\mathcal{N} =
+ \sum_{\substack{S\subset\mathcal{N}\\ i\in\mathcal{N}\setminus S}}
+ P_{\mathcal{N}\setminus\{i\}}^\lambda(S)V(S\cup\{i\})\\
+ - \sum_{\substack{S\subset\mathcal{N}\\ i\in \mathcal{N}\setminus S}}
+ P_{\mathcal{N}\setminus\{i\}}^\lambda(S)V(S)\\
+ \end{multline*}
+
+ Finally, by using the expression for the marginal contribution of $i$ to
+ $S$:
+ \begin{displaymath}
+ \partial_i F_\mathcal{N}(\lambda) =
+ \sum_{\substack{S\subset\mathcal{N}\\ i\in\mathcal{N}\setminus S}}
+ P_{\mathcal{N}\setminus\{i\}}^\lambda(S)
+ \log\Big(1 + \mu x_i^*A(S)^{-1}x_i\Big)
+ \end{displaymath}
+
+ The computation of the derivative of $L_\mathcal{N}$ uses standard matrix
+ calculus and gives:
+ \begin{displaymath}
+ \partial_i L_\mathcal{N}(\lambda)
+ = \mu x_i^* \tilde{A}(\lambda)^{-1}x_i
+ \end{displaymath}
+
+ Using the following inequalities:
+ \begin{gather*}
+ \forall S\subset\mathcal{N}\setminus\{i\},\quad
+ P_{\mathcal{N}\setminus\{i\}}^\lambda(S)\geq
+ P_{\mathcal{N}\setminus\{i\}}^\lambda(S\cup\{i\})\\
+ \forall S\subset\mathcal{N},\quad P_{\mathcal{N}\setminus\{i\}}^\lambda(S)
+ \geq P_\mathcal{N}^\lambda(S)\\
+ \forall S\subset\mathcal{N},\quad A(S)^{-1} \geq A(S\cup\{i\})^{-1}\\
+ \end{gather*}
+ we get:
+ \begin{align*}
+ \partial_i F_\mathcal{N}(\lambda)
+ & \geq \sum_{\substack{S\subset\mathcal{N}\\ i\in\mathcal{N}\setminus S}}
+ P_\mathcal{N}^\lambda(S)
+ \log\Big(1 + \mu x_i^*A(S)^{-1}x_i\Big)\\
+ & \geq \frac{1}{2}
+ \sum_{\substack{S\subset\mathcal{N}\\ i\in\mathcal{N}\setminus S}}
+ P_\mathcal{N}^\lambda(S)
+ \log\Big(1 + \mu x_i^*A(S)^{-1}x_i\Big)\\
+ &\hspace{-3.5em}+\frac{1}{2}
+ \sum_{\substack{S\subset\mathcal{N}\\ i\in\mathcal{N}\setminus S}}
+ P_\mathcal{N}^\lambda(S\cup\{i\})
+ \log\Big(1 + \mu x_i^*A(S\cup\{i\})^{-1}x_i\Big)\\
+ &\geq \frac{1}{2}
+ \sum_{S\subset\mathcal{N}}
+ P_\mathcal{N}^\lambda(S)
+ \log\Big(1 + \mu x_i^*A(S)^{-1}x_i\Big)\\
+ \end{align*}
+
+ Using that $A(S)\geq I_d$ we get that:
+ \begin{displaymath}
+ \mu x_i^*A(S)^{-1}x_i \leq \mu
+ \end{displaymath}
+
+ Moreover:
+ \begin{displaymath}
+ \forall x\leq\mu,\; \log(1+x)\geq
+ \frac{\log\big(1+\mu\big)}{\mu} x
+ \end{displaymath}
+
+ Hence:
+ \begin{displaymath}
+ \partial_i F_\mathcal{N}(\lambda) \geq
+ \frac{\log\big(1+\mu\big)}{2\mu}
+ x_i^*\bigg(\sum_{S\subset\mathcal{N}}P_\mathcal{N}^\lambda(S)A(S)^{-1}\bigg)x_i
+ \end{displaymath}
+
+ Finally, using that the inverse is a matrix convex function over symmetric
+ positive definite matrices:
+ \begin{align*}
+ \partial_i F_\mathcal{N}(\lambda) &\geq
+ \frac{\log\big(1+\mu\big)}{2\mu}
+ x_i^*\bigg(\sum_{S\subset\mathcal{N}}P_\mathcal{N}^\lambda(S)A(S)\bigg)^{-1}x_i\\
+ & \geq \frac{\log\big(1+\mu\big)}{2\mu}
+ \partial_i L_\mathcal{N}(\lambda)
+ \end{align*}
+\end{proof}
+
+We can now prove lemma TODO from previous section.
+
+\begin{proof}
+ Let us consider a feasible point $\lambda^*\in[0,1]^n$ such that $L_\mathcal{N}(\lambda^*)
+ = OPT(L_\mathcal{N}, B)$. By applying lemma~\ref{lemma:relaxation-ratio}
+ and lemma~\ref{lemma:rounding} we get a feasible point $\bar{\lambda}$ with at most
+ one fractional component such that:
+ \begin{equation}\label{eq:e1}
+ L_\mathcal{N}(\lambda^*) \leq \frac{1}{C_\mu}
+ F_\mathcal{N}(\bar{\lambda})
+ \end{equation}
+
+ Let $\lambda_i$ denote the fractional component of $\bar{\lambda}$ and $S$
+ denote the set whose indicator vector is $\bar{\lambda} - \lambda_i e_i$.
+ Using the fact that $F_\mathcal{N}$ is linear with respect to the $i$-th
+ component and is a relaxation of the value function, we get:
+ \begin{displaymath}
+ F_\mathcal{N}(\bar{\lambda}) = V(S) +\lambda_i V(S\cup\{i\})
+ \end{displaymath}
+
+ Using the submodularity of $V$:
+ \begin{displaymath}
+ F_\mathcal{N}(\bar{\lambda}) \leq 2 V(S) + V(i)
+ \end{displaymath}
+
+ Note that since $\bar{\lambda}$ is feasible, $S$ is also feasible and
+ $V(S)\leq OPT(V,\mathcal{N}, B)$. Hence:
+ \begin{equation}\label{eq:e2}
+ F_\mathcal{N}(\bar{\lambda}) \leq 2 OPT(V,\mathcal{N}, B)
+ + \max_{i\in\mathcal{N}} V(i)
+ \end{equation}
+
+ Putting \eqref{eq:e1} and \eqref{eq:e2} together gives the results.
+\end{proof}
+
\section{General setup}
\section{Conclusion}