\section{Proofs of Statements in Section~\ref{sec:concave}} \subsection{Proof of Lemma~\ref{lemma:relaxation-ratio}}\label{proofofrelaxation-ratio} %\begin{proof}
The bound $F_{\mathcal{N}}(\lambda)\leq L_{\mathcal{N}}(\lambda)$ follows from the concavity of the $\log\det$ function and Jensen's inequality. To show the lower bound, we first prove that $\frac{1}{2}$ is a lower bound on the ratio $\partial_i F(\lambda)/\partial_i L(\lambda)$, where we use $\partial_i\, \cdot$ as a shorthand for $\frac{\partial}{\partial \lambda_i}$, the partial derivative with respect to the $i$-th variable. Let us start by computing the partial derivatives of $F$ and $L$ with respect to the $i$-th component. Observe that \begin{displaymath} \partial_i P_\mathcal{N}^\lambda(S) = \left\{ \begin{aligned} & P_{\mathcal{N}\setminus\{i\}}^\lambda(S\setminus\{i\})\;\textrm{if}\; i\in S, \\ & - P_{\mathcal{N}\setminus\{i\}}^\lambda(S)\;\textrm{if}\; i\in \mathcal{N}\setminus S. \\ \end{aligned}\right. \end{displaymath} Hence, \begin{displaymath} \partial_i F(\lambda) = \sum_{\substack{S\subseteq\mathcal{N}\\ i\in S}} P_{\mathcal{N}\setminus\{i\}}^\lambda(S\setminus\{i\})V(S) - \sum_{\substack{S\subseteq\mathcal{N}\\ i\in \mathcal{N}\setminus S}} P_{\mathcal{N}\setminus\{i\}}^\lambda(S)V(S). \end{displaymath} Now, using that every $S$ such that $i\in S$ can be uniquely written as $S'\cup\{i\}$ with $S'\subseteq\mathcal{N}\setminus\{i\}$, we can write: \begin{displaymath} \partial_i F(\lambda) = \sum_{\substack{S\subseteq\mathcal{N}\\ i\in\mathcal{N}\setminus S}} P_{\mathcal{N}\setminus\{i\}}^\lambda(S)\big(V(S\cup\{i\}) - V(S)\big). \end{displaymath} The marginal contribution of $i$ to $S$ can be written as \begin{align*} V(S\cup \{i\}) - V(S)& = \frac{1}{2}\log\det(I_d + \T{X_S}X_S + x_i\T{x_i}) - \frac{1}{2}\log\det(I_d + \T{X_S}X_S)\\ & = \frac{1}{2}\log\det\big(I_d + x_i\T{x_i}(I_d + \T{X_S}X_S)^{-1}\big) = \frac{1}{2}\log(1 + \T{x_i}A(S)^{-1}x_i), \end{align*} where $A(S) \defeq I_d+ \T{X_S}X_S$, and the last equality follows from Sylvester's determinant identity~\cite{sylvester}. % $ V(S\cup\{i\}) - V(S) = \frac{1}{2}\log\left(1 + \T{x_i} A(S)^{-1}x_i\right)$.
Using this, \begin{displaymath} \partial_i F(\lambda) = \frac{1}{2} \sum_{\substack{S\subseteq\mathcal{N}\\ i\in\mathcal{N}\setminus S}} P_{\mathcal{N}\setminus\{i\}}^\lambda(S) \log\Big(1 + \T{x_i}A(S)^{-1}x_i\Big). \end{displaymath} The computation of the derivative of $L$ uses standard matrix calculus: writing $\tilde{A}(\lambda) = I_d+\sum_{i\in \mathcal{N}}\lambda_ix_i\T{x_i}$, \begin{displaymath} \det \tilde{A}(\lambda + h\cdot e_i) = \det\big(\tilde{A}(\lambda) + hx_i\T{x_i}\big) =\det \tilde{A}(\lambda)\big(1+ h\T{x_i}\tilde{A}(\lambda)^{-1}x_i\big). \end{displaymath} Hence, \begin{displaymath} \log\det\tilde{A}(\lambda + h\cdot e_i)= \log\det\tilde{A}(\lambda) + h\T{x_i}\tilde{A}(\lambda)^{-1}x_i + o(h), \end{displaymath} which implies \begin{displaymath} \partial_i L(\lambda) =\frac{1}{2} \T{x_i}\tilde{A}(\lambda)^{-1}x_i. \end{displaymath} For two symmetric matrices $A$ and $B$, we write $A\succ B$ ($A\succeq B$) if $A-B$ is positive definite (positive semi-definite). This order allows us to define \emph{decreasing} as well as \emph{convex} matrix functions, by analogy with their real counterparts. With this definition, matrix inversion is decreasing and convex over symmetric positive definite matrices (see Example 3.48, p.~110 in \cite{boyd2004convex}).
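The two identities derived above are easy to verify numerically. The following sketch (a minimal verification assuming only \texttt{numpy}; all variable names are illustrative) checks the marginal-gain formula against a direct evaluation of $V$, and the closed form of $\partial_i L$ against a finite difference.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
n, d = 6, 4
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)  # enforce ||x_i||_2 <= 1

def V(S):
    # V(S) = (1/2) log det(I_d + X_S^T X_S)
    XS = X[sorted(S)]
    return 0.5 * np.linalg.slogdet(np.eye(d) + XS.T @ XS)[1]

def L(lam):
    # L(lam) = (1/2) log det(I_d + sum_i lam_i x_i x_i^T)
    return 0.5 * np.linalg.slogdet(np.eye(d) + (X.T * lam) @ X)[1]

S, i = {0, 2, 3}, 5
A = np.eye(d) + X[sorted(S)].T @ X[sorted(S)]
# Marginal gain via Sylvester's determinant identity:
gain = 0.5 * np.log(1 + X[i] @ np.linalg.solve(A, X[i]))
assert np.isclose(V(S | {i}) - V(S), gain)

# Closed form of the partial derivative of L, via finite differences:
lam, h = rng.uniform(size=n), 1e-6
fd = (L(lam + h * np.eye(n)[i]) - L(lam)) / h
closed = 0.5 * X[i] @ np.linalg.solve(np.eye(d) + (X.T * lam) @ X, X[i])
assert np.isclose(fd, closed, atol=1e-4)
\end{verbatim}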
In particular, \begin{gather*} \forall S\subseteq\mathcal{N},\quad A(S)^{-1} \succeq A(S\cup\{i\})^{-1} \end{gather*} as $A(S)\preceq A(S\cup\{i\})$. Observe that since $0\leq \lambda_i\leq 1$, $P_{\mathcal{N}\setminus\{i\}}^\lambda(S) \geq P_\mathcal{N}^\lambda(S)$ and $P_{\mathcal{N}\setminus\{i\}}^\lambda(S)\geq P_{\mathcal{N}}^\lambda(S\cup\{i\})$ for all $S\subseteq\mathcal{N}\setminus\{i\}$; indeed, $P_{\mathcal{N}}^\lambda(S) = (1-\lambda_i)P_{\mathcal{N}\setminus\{i\}}^\lambda(S)$ and $P_{\mathcal{N}}^\lambda(S\cup\{i\}) = \lambda_i P_{\mathcal{N}\setminus\{i\}}^\lambda(S)$. Splitting the expression for $\partial_i F$ into two halves and using these inequalities (together with $A(S)^{-1}\succeq A(S\cup\{i\})^{-1}$ for the second half), we get \begin{align*} \partial_i F(\lambda) & \geq \frac{1}{4} \sum_{\substack{S\subseteq\mathcal{N}\\ i\in\mathcal{N}\setminus S}} P_{\mathcal{N}}^\lambda(S) \log\Big(1 + \T{x_i}A(S)^{-1}x_i\Big)\\ &\hspace{-3.5em}+\frac{1}{4} \sum_{\substack{S\subseteq\mathcal{N}\\ i\in\mathcal{N}\setminus S}} P_{\mathcal{N}}^\lambda(S\cup\{i\}) \log\Big(1 + \T{x_i}A(S\cup\{i\})^{-1}x_i\Big)\\ &= \frac{1}{4} \sum_{S\subseteq\mathcal{N}} P_\mathcal{N}^\lambda(S) \log\Big(1 + \T{x_i}A(S)^{-1}x_i\Big), \end{align*} where the last step regroups the two sums over the subsets not containing and containing $i$, respectively. Using that $A(S)\succeq I_d$, we get that $\T{x_i}A(S)^{-1}x_i \leq \norm{x_i}_2^2 \leq 1$. Moreover, $\log(1+x)\geq x$ for all $0\leq x\leq 1$. Hence, \begin{displaymath} \partial_i F(\lambda) \geq \frac{1}{4} \T{x_i}\bigg(\sum_{S\subseteq\mathcal{N}}P_\mathcal{N}^\lambda(S)A(S)^{-1}\bigg)x_i. \end{displaymath} Finally, using that matrix inversion is convex over symmetric positive definite matrices, so that $\sum_{S}P_\mathcal{N}^\lambda(S)A(S)^{-1}\succeq\big(\sum_{S}P_\mathcal{N}^\lambda(S)A(S)\big)^{-1}$: \begin{displaymath} \partial_i F(\lambda) \geq \frac{1}{4} \T{x_i}\bigg(\sum_{S\subseteq\mathcal{N}}P_\mathcal{N}^\lambda(S)A(S)\bigg)^{-1}x_i = \frac{1}{4}\T{x_i}\tilde{A}(\lambda)^{-1}x_i = \frac{1}{2} \partial_i L(\lambda). \end{displaymath} Having bounded the ratio between the partial derivatives, we now bound the ratio $F(\lambda)/L(\lambda)$ from below. We distinguish three cases. First, if the infimum of the ratio is approached as $\lambda$ converges to zero (in, \emph{e.g.}, the $\ell_2$ norm), then by a first-order Taylor approximation, \begin{displaymath} \frac{F(\lambda)}{L(\lambda)} \sim_{\lambda\rightarrow 0} \frac{\sum_{i\in \mathcal{N}}\lambda_i\partial_i F(0)} {\sum_{i\in\mathcal{N}}\lambda_i\partial_i L(0)} \geq \frac{1}{2}, \end{displaymath} \emph{i.e.}, the ratio $\frac{F(\lambda)}{L(\lambda)}$ is bounded from below by $1/2$ for small enough $\lambda$. Second, suppose the minimum of the ratio $F(\lambda)/L(\lambda)$ is attained at a vertex of the hypercube $[0,1]^n$ different from $0$. Since $F$ and $L$ are relaxations of the value function $V$, they coincide with $V$ at the vertices, which are exactly the binary points. Hence, the minimum is equal to $1$ in this case; in particular, it is greater than $1/2$. Finally, if the minimum is attained at a point $\lambda^*$ with at least one coordinate belonging to $(0,1)$, let $i$ be one such coordinate and consider the function \begin{displaymath} G_i: x \mapsto \frac{F}{L}(\lambda_1^*,\ldots,\lambda_{i-1}^*, x, \lambda_{i+1}^*, \ldots, \lambda_n^*). \end{displaymath} This function attains its minimum at the interior point $\lambda^*_i\in(0,1)$, so its derivative vanishes there: \begin{displaymath} 0 = G_i'(\lambda^*_i) = \partial_i\left(\frac{F}{L}\right)(\lambda^*). \end{displaymath} But $\partial_i(F/L)(\lambda^*)=0$ implies that \begin{displaymath} \frac{F(\lambda^*)}{L(\lambda^*)} = \frac{\partial_i F(\lambda^*)}{\partial_i L(\lambda^*)}\geq \frac{1}{2}, \end{displaymath} using the lower bound on the ratio of the partial derivatives. This concludes the proof of the lemma. \qed %\end{proof}
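Both bounds of the lemma can be checked by brute force on small instances, since $F$ can be evaluated exactly by enumerating all $2^n$ subsets. The sketch below (assuming \texttt{numpy}; exponential in $n$, so only usable for small $n$) does so on random data.
\begin{verbatim}
import itertools
import numpy as np

rng = np.random.default_rng(1)
n, d = 5, 3
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)

def V(S):
    XS = X[list(S)]
    return 0.5 * np.linalg.slogdet(np.eye(d) + XS.T @ XS)[1]

def L(lam):
    return 0.5 * np.linalg.slogdet(np.eye(d) + (X.T * lam) @ X)[1]

def F(lam):
    # Multilinear extension, by exhaustive enumeration (2^n terms).
    subsets = itertools.chain.from_iterable(
        itertools.combinations(range(n), k) for k in range(n + 1))
    return sum(
        np.prod([lam[i] if i in S else 1 - lam[i] for i in range(n)]) * V(S)
        for S in subsets)

for _ in range(100):
    lam = rng.uniform(size=n)
    assert 0.5 * L(lam) <= F(lam) <= L(lam) + 1e-12
\end{verbatim}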
%We now prove that $F$ admits the following exchange property: let $\lambda$
%be a feasible element of $[0,1]^n$; it is possible to trade one fractional
%component of $\lambda$ for another until one of them becomes integral,
%obtaining a new element $\tilde{\lambda}$ which is both feasible and for
%which $F(\tilde{\lambda})\geq F(\lambda)$. Here, by feasibility of a point
%$\lambda$, we mean that it satisfies the budget constraint
%$\sum_{i=1}^n \lambda_i c_i \leq B$. This rounding property is referred to
%in the literature as \emph{cross-convexity} (see, \emph{e.g.},
%\cite{dughmi}), or $\varepsilon$-convexity by \citeN{pipage}.
%\begin{lemma}[Rounding]\label{lemma:rounding}
% For any feasible $\lambda\in[0,1]^{n}$, there exists a feasible
% $\bar{\lambda}\in[0,1]^{n}$ such that at most one of its components is
% fractional%, that is, lies in $(0,1)$
% and $F_{\mathcal{N}}(\lambda)\leq F_{\mathcal{N}}(\bar{\lambda})$.
%\end{lemma}
\subsection{Proof of Lemma~\ref{lemma:rounding}}\label{proofoflemmarounding} %\begin{proof}
We give a rounding procedure which, given a feasible $\lambda$ with at least two fractional components, returns some feasible $\lambda'$ with one fewer fractional component and such that $F(\lambda) \leq F(\lambda')$. Applying this procedure recursively yields the lemma's result. Let us consider such a feasible $\lambda$. Let $i$ and $j$ be two fractional components of $\lambda$ and let us define the following function: \begin{displaymath} F_\lambda(\varepsilon) = F(\lambda_\varepsilon) \quad\textrm{where} \quad \lambda_\varepsilon = \lambda + \varepsilon\left(e_i-\frac{c_i}{c_j}e_j\right). \end{displaymath} Since this perturbation leaves the budget unchanged ($\T{c}\lambda_\varepsilon = \T{c}\lambda$), if $\lambda$ is feasible, then: \begin{equation}\label{eq:convex-interval} \forall\varepsilon\in\Big[\max\Big(-\lambda_i,(\lambda_j-1)\frac{c_j}{c_i}\Big), \min\Big(1-\lambda_i, \lambda_j \frac{c_j}{c_i}\Big)\Big],\; \lambda_\varepsilon\;\;\textrm{is feasible}. \end{equation} Furthermore, the function $F_\lambda$ is convex; indeed: \begin{align*} F_\lambda(\varepsilon) & = \mathbb{E}_{S'\sim P_{\mathcal{N}\setminus\{i,j\}}^\lambda}\Big[ (\lambda_i+\varepsilon)\Big(\lambda_j-\varepsilon\frac{c_i}{c_j}\Big)V(S'\cup\{i,j\})\\ & + (\lambda_i+\varepsilon)\Big(1-\lambda_j+\varepsilon\frac{c_i}{c_j}\Big)V(S'\cup\{i\}) + (1-\lambda_i-\varepsilon)\Big(\lambda_j-\varepsilon\frac{c_i}{c_j}\Big)V(S'\cup\{j\})\\ & + (1-\lambda_i-\varepsilon)\Big(1-\lambda_j+\varepsilon\frac{c_i}{c_j}\Big)V(S')\Big] \end{align*} Thus, $F_\lambda$ is a degree-2 polynomial in $\varepsilon$ whose leading coefficient is: \begin{displaymath} \frac{c_i}{c_j}\mathbb{E}_{S'\sim P_{\mathcal{N}\setminus\{i,j\}}^\lambda}\Big[ V(S'\cup\{i\})+V(S'\cup\{j\}) -V(S'\cup\{i,j\})-V(S')\Big], \end{displaymath} which is non-negative by submodularity of $V$; hence $F_\lambda$ is convex. Therefore, the maximum of $F_\lambda$ over the interval given in \eqref{eq:convex-interval} is attained at one of its endpoints, at which either the $i$-th or the $j$-th component of $\lambda_\varepsilon$ becomes integral. Since $\varepsilon=0$ belongs to this interval, the value at that endpoint is at least $F_\lambda(0)=F(\lambda)$. \qed %\end{proof}
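In code, one step of this rounding reads as follows (a sketch assuming \texttt{numpy} and a callable \texttt{F}, \emph{e.g.}\ the brute-force multilinear extension above; all names are illustrative). By the convexity of $F_\lambda$, evaluating the two endpoints and keeping the better one never decreases $F$.
\begin{verbatim}
import numpy as np

def rounding_step(lam, c, F, i, j):
    """Trade mass between fractional coordinates i and j along the
    budget-preserving direction e_i - (c_i/c_j) e_j, moving to the
    endpoint of the feasible interval with the larger F-value."""
    lo = max(-lam[i], (lam[j] - 1) * c[j] / c[i])
    hi = min(1 - lam[i], lam[j] * c[j] / c[i])
    best = None
    for eps in (lo, hi):
        mu = np.array(lam, dtype=float)
        mu[i] += eps
        mu[j] -= eps * c[i] / c[j]
        mu = np.clip(mu, 0.0, 1.0)  # guard against round-off only
        if best is None or F(mu) > F(best):
            best = mu
    return best
\end{verbatim}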
\subsection{Proof of Proposition~\ref{prop:relaxation}}\label{proofofproprelaxation}
The lower bound on $L^*_c$ follows immediately from the fact that $L$ extends $V$ to $[0,1]^n$. For the upper bound, let us consider a feasible point $\lambda^*\in \dom_c$ such that $L(\lambda^*) = L^*_c$. By applying Lemma~\ref{lemma:relaxation-ratio} and Lemma~\ref{lemma:rounding}, we get a feasible point $\bar{\lambda}$ with at most one fractional component such that \begin{equation}\label{eq:e1} L(\lambda^*) \leq 2 F(\bar{\lambda}). \end{equation} Let $\lambda_i$ denote the fractional component of $\bar{\lambda}$ and $S$ denote the set whose indicator vector is $\bar{\lambda} - \lambda_i e_i$. By definition of the multi-linear extension $F$: \begin{displaymath} F(\bar{\lambda}) = (1-\lambda_i)V(S) +\lambda_i V(S\cup\{i\}). \end{displaymath} By submodularity of $V$, $V(S\cup\{i\})\leq V(S) + V(\{i\})$. Hence, \begin{displaymath} F(\bar{\lambda}) \leq V(S) + V(\{i\}). \end{displaymath} Note that since $\bar{\lambda}$ is feasible, $S$ is also feasible and $V(S)\leq OPT$. Hence, \begin{equation}\label{eq:e2} F(\bar{\lambda}) \leq OPT + \max_{i\in\mathcal{N}} V(\{i\}). \end{equation} Together, \eqref{eq:e1} and \eqref{eq:e2} imply the proposition.\qed
\section{Proof of Proposition~\ref{prop:monotonicity}}\label{proofofpropmonotonicity}
%The $\log\det$ function is concave and self-concordant (see
%\cite{boyd2004convex}); in this case, the analysis of the barrier method
%in \cite{boyd2004convex} (Section 11.5.5) can be summarized in the
%following lemma:
%\begin{lemma}\label{lemma:barrier}
%For any $\varepsilon>0$, the barrier method computes an
%$\varepsilon$-accurate approximation of $L^*_c$ in time
%$O(\mathrm{poly}(n,d,\log\log\varepsilon^{-1}))$.
%\end{lemma}
We proceed by showing that the optimal value of \eqref{eq:perturbed-primal} is close to the optimal value of \eqref{eq:primal} (Lemma~\ref{lemma:proximity}) while being well-behaved with respect to changes in the cost (Lemma~\ref{lemma:monotonicity}). Together, these lemmas imply Proposition~\ref{prop:monotonicity}. Note that the choice of $\alpha$ given in Algorithm~\ref{alg:monotone} implies that $\alpha<\frac{1}{n}$. This in turn implies that the feasible set $\mathcal{D}_{c, \alpha}$ of \eqref{eq:perturbed-primal} is non-empty: it contains the strictly feasible point $\lambda=(\frac{1}{n},\ldots,\frac{1}{n})$. \begin{lemma}\label{lemma:derivative-bounds} Let $\partial_i L(\lambda)$ denote the $i$-th partial derivative of $L$. Then, for all $i\in\{1,\ldots, n\}$: \begin{displaymath} \forall\lambda\in[0, 1]^n,\;\frac{b}{2^n} \leq \partial_i L(\lambda) \leq 1. \end{displaymath} \end{lemma} \begin{proof} Recall that we had defined: \begin{displaymath} \tilde{A}(\lambda)\defeq I_d + \sum_{i=1}^n \lambda_i x_i\T{x_i} \quad\mathrm{and}\quad A(S) \defeq I_d + \sum_{i\in S} x_i\T{x_i}. \end{displaymath} Let us also define $A_k\defeq A(\{1,\ldots,k\})$. We have $\partial_i L(\lambda) = \T{x_i}\tilde{A}(\lambda)^{-1}x_i$. Since $\tilde{A}(\lambda)\succeq I_d$, $\partial_i L(\lambda)\leq \T{x_i}x_i \leq 1$, which gives the upper bound of the lemma. For the lower bound, note that $\tilde{A}(\lambda) \preceq A_n$. Hence $\partial_iL(\lambda)\geq \T{x_i}A_n^{-1}x_i$. Using the Sherman-Morrison formula, for all $k\geq 1$: \begin{displaymath} \T{x_i}A_k^{-1} x_i = \T{x_i}A_{k-1}^{-1}x_i - \frac{(\T{x_i}A_{k-1}^{-1}x_k)^2}{1+\T{x_k}A_{k-1}^{-1}x_k}. \end{displaymath} By the Cauchy-Schwarz inequality: \begin{displaymath} (\T{x_i}A_{k-1}^{-1}x_k)^2 \leq \T{x_i}A_{k-1}^{-1}x_i\;\T{x_k}A_{k-1}^{-1}x_k. \end{displaymath} Hence: \begin{displaymath} \T{x_i}A_k^{-1} x_i \geq \T{x_i}A_{k-1}^{-1}x_i - \T{x_i}A_{k-1}^{-1}x_i\frac{\T{x_k}A_{k-1}^{-1}x_k}{1+\T{x_k}A_{k-1}^{-1}x_k}. \end{displaymath} But $\T{x_k}A_{k-1}^{-1}x_k\leq 1$ and $\frac{a}{1+a}\leq \frac{1}{2}$ if $0\leq a\leq 1$, so: \begin{displaymath} \T{x_i}A_{k}^{-1}x_i \geq \T{x_i}A_{k-1}^{-1}x_i - \frac{1}{2}\T{x_i}A_{k-1}^{-1}x_i\geq \frac{\T{x_i}A_{k-1}^{-1}x_i}{2}. \end{displaymath} By induction: \begin{displaymath} \T{x_i}A_n^{-1} x_i \geq \frac{\T{x_i}x_i}{2^n}. \end{displaymath} Using that $\T{x_i}{x_i}\geq b$ concludes the proof of the lower bound. \end{proof}
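The two ingredients of this proof, the Sherman--Morrison rank-one update and the resulting $2^{-n}$ lower bound, can both be checked numerically; a sketch, assuming \texttt{numpy} (illustrative names):
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)
n, d = 8, 5
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)

# Sherman-Morrison:
# (A + x x^T)^{-1} = A^{-1} - (A^{-1}x)(A^{-1}x)^T / (1 + x^T A^{-1} x)
A = np.eye(d) + X[:4].T @ X[:4]
x = X[4]
Ainv = np.linalg.inv(A)
update = Ainv - np.outer(Ainv @ x, Ainv @ x) / (1 + x @ Ainv @ x)
assert np.allclose(update, np.linalg.inv(A + np.outer(x, x)))

# Lower bound of the lemma: x_i^T A_n^{-1} x_i >= ||x_i||^2 / 2^n.
An = np.eye(d) + X.T @ X
for i in range(n):
    assert X[i] @ np.linalg.solve(An, X[i]) >= (X[i] @ X[i]) / 2**n
\end{verbatim}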
Let us introduce the Lagrangian of problem \eqref{eq:perturbed-primal}: \begin{displaymath} \mathcal{L}_{c, \alpha}(\lambda, \mu, \nu, \xi) \defeq L(\lambda) + \T{\mu}(\lambda-\alpha\mathbf{1}) + \T{\nu}(\mathbf{1}-\lambda) + \xi(B-\T{c}\lambda), \end{displaymath} so that: \begin{displaymath} L^*_{c,\alpha} = \min_{\mu, \nu, \xi\geq 0}\max_\lambda \mathcal{L}_{c, \alpha}(\lambda, \mu, \nu, \xi). \end{displaymath} Similarly, we define $\mathcal{L}_{c}\defeq\mathcal{L}_{c, 0}$ to be the Lagrangian of \eqref{eq:primal}. Let $\lambda^*$ be primal optimal for \eqref{eq:perturbed-primal}, and $(\mu^*, \nu^*, \xi^*)$ be dual optimal for the same problem. In addition to primal and dual feasibility, the KKT conditions give, for all $i\in\{1, \ldots, n\}$: \begin{gather*} \partial_i L(\lambda^*) + \mu_i^* - \nu_i^* - \xi^* c_i = 0\\ \mu_i^*(\lambda_i^* - \alpha) = 0\\ \nu_i^*(1 - \lambda_i^*) = 0 \end{gather*} \begin{lemma}\label{lemma:proximity} We have: \begin{displaymath} L^*_c - \alpha n^2\leq L^*_{c,\alpha} \leq L^*_c. \end{displaymath} In particular, $|L^*_c - L^*_{c,\alpha}| \leq \alpha n^2$. \end{lemma} \begin{proof} The map $\alpha\mapsto L^*_{c,\alpha}$ is non-increasing, as it is the maximum value of $L$ over a domain that shrinks as $\alpha$ grows; this gives the rightmost inequality. Let $\mu^*, \nu^*, \xi^*$ be dual optimal for $(P_{c, \alpha})$, that is: \begin{displaymath} L^*_{c,\alpha} = \max_\lambda \mathcal{L}_{c, \alpha}(\lambda, \mu^*, \nu^*, \xi^*). \end{displaymath} Note that $\mathcal{L}_{c, \alpha}(\lambda, \mu^*, \nu^*, \xi^*) = \mathcal{L}_{c}(\lambda, \mu^*, \nu^*, \xi^*) - \alpha\T{\mathbf{1}}\mu^*$, and that for any $\lambda$ feasible for problem \eqref{eq:primal}, $\mathcal{L}_{c}(\lambda, \mu^*, \nu^*, \xi^*) \geq L(\lambda)$. Hence, \begin{displaymath} L^*_{c,\alpha} \geq L(\lambda) - \alpha\T{\mathbf{1}}\mu^* \end{displaymath} for any $\lambda$ feasible for \eqref{eq:primal}. In particular, for $\lambda$ primal optimal for \eqref{eq:primal}: \begin{equation}\label{eq:local-1} L^*_{c,\alpha} \geq L^*_c - \alpha\T{\mathbf{1}}\mu^*. \end{equation} Let us denote by $M$ the support of $\mu^*$, that is $M\defeq \{i \mid \mu_i^* > 0\}$, and by $\lambda^*$ a primal optimal point for \eqref{eq:perturbed-primal}. From the KKT conditions we see that: \begin{displaymath} M \subseteq \{i \mid \lambda_i^* = \alpha\}. \end{displaymath} If $|M| = 0$, then $\T{\mathbf{1}}\mu^*=0$ and the lemma follows. We now assume that $|M|\geq 1$. In this case $\T{c}\lambda^* = B$: otherwise we could increase the coordinates of $\lambda^*$ in $M$, which would increase the value of the objective function (the partial derivatives of $L$ are positive by Lemma~\ref{lemma:derivative-bounds}) and contradict the optimality of $\lambda^*$. Note also that $|M|\leq n-1$: otherwise, since $\alpha< \frac{1}{n}$ and $c_i\leq B$ for all $i$, we would have $\T{c}\lambda^* < B$, which again contradicts the optimality of $\lambda^*$. Let us write: \begin{displaymath} B = \T{c}\lambda^* = \alpha\sum_{i\in M}c_i + \sum_{i\in \bar{M}}\lambda_i^*c_i \leq \alpha |M|B + (n-|M|)\max_{i\in \bar{M}} c_i. \end{displaymath} That is: \begin{equation}\label{local-2} \max_{i\in\bar{M}} c_i \geq \frac{B - B|M|\alpha}{n-|M|}> \frac{B}{n}, \end{equation} where the last inequality uses again that $\alpha<\frac{1}{n}$. From the KKT conditions, we see that for $i\in M$, $\nu_i^* = 0$ and: \begin{equation}\label{local-3} \mu_i^* = \xi^*c_i - \partial_i L(\lambda^*)\leq \xi^*c_i\leq \xi^*B, \end{equation} since $\partial_i L(\lambda^*)\geq 0$ and $c_i\leq B$. Furthermore, using the KKT conditions again, we have that: \begin{equation}\label{local-4} \xi^* \leq \inf_{i\in \bar{M}}\frac{\partial_i L(\lambda^*)}{c_i}\leq \inf_{i\in \bar{M}} \frac{1}{c_i} = \frac{1}{\max_{i\in\bar{M}} c_i}, \end{equation} where the second inequality uses the upper bound of Lemma~\ref{lemma:derivative-bounds}. Combining \eqref{local-2}, \eqref{local-3} and \eqref{local-4}, we get that: \begin{displaymath} \sum_{i\in M}\mu_i^* \leq |M|\xi^*B \leq n\xi^*B\leq \frac{nB}{\max_{i\in\bar{M}} c_i} \leq n^2. \end{displaymath} This implies that: \begin{displaymath} \T{\mathbf{1}}\mu^* = \sum_{i=1}^n \mu^*_i = \sum_{i\in M}\mu_i^*\leq n^2, \end{displaymath} which in addition to \eqref{eq:local-1} proves the lemma. \end{proof}
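Both this lemma and the next concern the perturbed problem $(P_{c,\alpha})$, which is a concave maximization problem that can also be solved directly with off-the-shelf tools. The following sketch (assuming \texttt{cvxpy} with a conic solver such as SCS is available; it is an illustration on random placeholder data, not the barrier method whose complexity is invoked below) computes $L^*_{c,\alpha}$.
\begin{verbatim}
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(3)
n, d = 6, 4
X = rng.normal(size=(n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)
c = rng.uniform(0.1, 1.0, size=n)
B, alpha = 2.0, 0.1 / n   # any alpha < 1/n keeps the problem feasible

lam = cp.Variable(n)
A_tilde = np.eye(d) + X.T @ cp.diag(lam) @ X   # affine in lam
problem = cp.Problem(cp.Maximize(0.5 * cp.log_det(A_tilde)),
                     [lam >= alpha, lam <= 1, c @ lam <= B])
problem.solve()
print(problem.value)   # numerical approximation of L*_{c, alpha}
\end{verbatim}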
\begin{lemma}\label{lemma:monotonicity} If $c' = (c_i', c_{-i})$ with $c_i'\leq c_i - \delta$, we have: \begin{displaymath} L^*_{c',\alpha} \geq L^*_{c,\alpha} + \frac{\alpha\delta b}{2^nB}. \end{displaymath} \end{lemma} \begin{proof} Let $\mu^*, \nu^*, \xi^*$ be dual optimal for $(P_{c', \alpha})$. Noting that: \begin{displaymath} \mathcal{L}_{c', \alpha}(\lambda, \mu^*, \nu^*, \xi^*) \geq \mathcal{L}_{c, \alpha}(\lambda, \mu^*, \nu^*, \xi^*) + \lambda_i\xi^*\delta, \end{displaymath} we get, similarly to Lemma~\ref{lemma:proximity}: \begin{displaymath} L^*_{c',\alpha} \geq L(\lambda) + \lambda_i\xi^*\delta \end{displaymath} for any $\lambda$ feasible for \eqref{eq:perturbed-primal}. In particular, for $\lambda^*$ primal optimal for \eqref{eq:perturbed-primal}: \begin{displaymath} L^*_{c',\alpha} \geq L^*_{c,\alpha} + \alpha\xi^*\delta, \end{displaymath} since $\lambda_i^*\geq \alpha$. Using the KKT conditions for $(P_{c', \alpha})$, we can write: \begin{displaymath} \xi^* = \inf_{i:\lambda^{'*}_i>\alpha} \frac{\T{x_i}\tilde{A}(\lambda^{'*})^{-1}x_i}{c_i'}, \end{displaymath} with $\lambda^{'*}$ optimal for $(P_{c', \alpha})$. Since $c_i'\leq B$, using Lemma~\ref{lemma:derivative-bounds}, we get that $\xi^*\geq \frac{b}{2^nB}$, which concludes the proof. \end{proof} %\subsection*{End of the proof of Proposition~\ref{prop:monotonicity}}
We are now ready to conclude the proof of Proposition~\ref{prop:monotonicity}. Let $\hat{L}^*_{c,\alpha}$ be the approximation computed by Algorithm~\ref{alg:monotone}. \begin{enumerate} \item Using Lemma~\ref{lemma:proximity}: \begin{displaymath} |\hat{L}^*_{c,\alpha} - L^*_c| \leq |\hat{L}^*_{c,\alpha} - L^*_{c,\alpha}| + |L^*_{c,\alpha} - L^*_c| \leq \frac{\alpha\delta}{B} + \alpha n^2 = \varepsilon, \end{displaymath} which proves the $\varepsilon$-accuracy.
\item For the $\delta$-decreasingness, let $c' = (c_i', c_{-i})$ with $c_i'\leq c_i-\delta$. Then: \begin{displaymath} \hat{L}^*_{c',\alpha} \geq L^*_{c',\alpha} - \frac{\alpha\delta b}{2^{n+1}B} \geq L^*_{c,\alpha} + \frac{\alpha\delta b}{2^{n+1}B} \geq \hat{L}^*_{c,\alpha}, \end{displaymath} where the first and last inequalities follow from the accuracy of the approximation, and the middle inequality follows from Lemma~\ref{lemma:monotonicity}. \item The accuracy to which the approximation $\hat{L}^*_{c,\alpha}$ is computed is: \begin{displaymath} A\defeq\frac{\varepsilon\delta b}{2^{n+1}(\delta + n^2B)}. \end{displaymath} Note that: \begin{displaymath} \log\log A^{-1} = O\bigg(\log\log\frac{B}{\varepsilon\delta b} + \log n\bigg). \end{displaymath} Using Lemma~\ref{lemma:barrier} concludes the proof of the running time.\qed \end{enumerate}
\section{Budget Feasible Reverse Auction Mechanisms}\label{app:budgetfeasible}
We review in this appendix the formal definition of budget feasible reverse auction mechanisms, as introduced by \citeN{singer-mechanisms}. We depart from the definitions in \cite{singer-mechanisms} only in considering $\delta$-truthful, rather than truthful, mechanisms. Given a budget $B$ and a value function $V:2^{\mathcal{N}}\to\reals_+$, a \emph{mechanism} $\mathcal{M} = (S,p)$ comprises (a) an \emph{allocation function}\footnote{Note that $S$ would be more aptly termed a \emph{selection} function, as this is a reverse auction, but we retain the term ``allocation'' to align with the familiar term from standard auctions.} $S:\reals_+^n \to 2^\mathcal{N}$ and (b) a \emph{payment function} $p:\reals_+^n\to \reals_+^n$. Let $s_i(c) = \id_{i\in S(c)}$ be the binary indicator of the event $i\in S(c)$. We seek mechanisms that have the following properties \cite{singer-mechanisms}; a sketch checking the per-outcome properties appears after the list. \begin{itemize} \item \emph{Normalization.} Unallocated experiments receive zero payments: $s_i(c)=0\text{ implies }p_i(c)=0.\label{normal}$ \item\emph{Individual Rationality.} Payments for allocated experiments cover their costs: $p_i(c)\geq c_i\cdot s_i(c).\label{ir}$ \item \emph{No Positive Transfers.} Payments are non-negative: $p_i(c)\geq 0\label{pt}$. \item \emph{$\delta$-Truthfulness.} Reporting one's true cost is an \emph{almost-dominant} \cite{schummer2004almost} strategy. Formally, let $c_{-i}$ be a vector of costs of all agents except $i$. Then, $p_i(c_i,c_{-i}) - s_i(c_i,c_{-i})\cdot c_i \geq p_i(c_i',c_{-i}) - s_i(c_i',c_{-i})\cdot c_i, $ for every $i \in \mathcal{N}$ and every two cost vectors $(c_i,c_{-i})$ and $(c_i',c_{-i})$ such that $|c_i-c_i'|>\delta.$ The mechanism is \emph{truthful} if $\delta=0$. \item \emph{Budget Feasibility.} The sum of the payments must not exceed the budget, \emph{i.e.}, $\sum_{i\in\mathcal{N}} p_i(c) \leq B.\label{budget}$ \item \emph{Approximation Ratio.} The value of the allocated set should not be too far from the optimal value in the full information case, as given by \eqref{eq:non-strategic}. Formally, there must exist some $\alpha\geq 1$ such that $OPT \leq \alpha V(S(c))$, where $OPT = \max_{S\subseteq\mathcal{N}} \left\{V(S) \;\mid \; \sum_{i\in S}c_i\leq B\right\}$.
% The approximation ratio captures the \emph{price of truthfulness},
% \emph{i.e.}, the relative value loss incurred by adding the truthfulness
% constraint.
%We stress here that the approximation ratio is defined with respect \emph{to
%the maximal value in the full information case}.
\item \emph{Computational Efficiency.} The allocation and payment functions should be computable in time polynomial in the problem parameters.
%\thibaut{Should we say something about the black-box model for $V$? Needed
%to say something in general, but not in our case where the value function
%can be computed in polynomial time}.
\end{itemize}
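The first, second, third and fifth of these properties refer to a single outcome of the mechanism and can be checked mechanically. A minimal sketch (plain Python, illustrative names; truthfulness and the approximation ratio concern the mechanism as a whole and cannot be verified from one outcome):
\begin{verbatim}
def check_outcome(selected, payments, costs, budget):
    """Check normalization, individual rationality, no positive
    transfers, and budget feasibility for one reported cost vector."""
    for s, p, c in zip(selected, payments, costs):
        assert p >= 0               # no positive transfers
        assert s or p == 0          # normalization
        assert not s or p >= c      # individual rationality
    assert sum(payments) <= budget  # budget feasibility
\end{verbatim}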
\section{Proof of Theorem~\ref{thm:main}}\label{sec:proofofmainthm}
We now present the proof of Theorem~\ref{thm:main}. We use the notation $OPT_{-i^*}$ to denote the optimal value of \EDP{} when the maximum-value element $i^*$ is excluded. We also use $OPT'_{-i^*}$ to denote the $\delta$-decreasing, $\varepsilon$-accurate approximation of $L^*_{c_{-i^*}}$, as computed in Algorithm~\ref{mechanism}. The properties of $\delta$-truthfulness and individual rationality follow from $\delta$-monotonicity and threshold payments. $\delta$-monotonicity and budget feasibility follow steps similar to the analysis of \citeN{chen}, adapted to account for $\delta$-monotonicity: \begin{lemma}\label{lemma:monotone} Our mechanism for \EDP{} is $\delta$-monotone and budget feasible. \end{lemma} \begin{proof} Consider an agent $i$ with cost $c_i$ that is selected by the mechanism, and suppose that she reports a cost $c_i'\leq c_i-\delta$ while all other costs stay the same. Suppose first that when $i$ reports $c_i$, $OPT'_{-i^*} \geq C V(i^*)$; then, as $s_i(c_i,c_{-i})=1$, $i\in S_G$. By reporting cost $c_i'$, $i$ may be selected at an earlier iteration of the greedy algorithm.
%using the submodularity of $V$, we see that $i$ will satisfy the greedy
%selection rule:
%\begin{displaymath}
% i = \argmax_{j\in\mathcal{N}\setminus S} \frac{V(S\cup\{j\})
% - V(S)}{c_j}
%\end{displaymath}
%in an earlier iteration of the greedy heuristic.
Denote by $S_i$ (resp.\ $S_i'$) the set to which $i$ is added when reporting cost $c_i$ (resp.\ $c_i'$). We have $S_i'\subseteq S_i$; in addition, $S_i'\subseteq S_G'$, the set selected by the greedy algorithm under $(c_i',c_{-i})$: if not, the greedy selection would terminate prior to selecting $i$ also when she reports $c_i$, a contradiction. Moreover, we have \begin{align*} c_i' & \leq c_i \leq \frac{B}{2}\frac{V(S_i\cup\{i\})-V(S_i)}{V(S_i\cup\{i\})} \leq \frac{B}{2}\frac{V(S_i'\cup\{i\})-V(S_i')}{V(S_i'\cup\{i\})} \end{align*} by the monotonicity and submodularity of $V$. Hence $i\in S_G'$. By $\delta$-decreasingness of $OPT'_{-i^*}$, under $c'_i\leq c_i-\delta$ the greedy set is still allocated and $s_i(c_i',c_{-i}) =1$. Suppose now that when $i$ reports $c_i$, $OPT'_{-i^*} < C V(i^*)$. Then $s_i(c_i,c_{-i})=1$ if and only if $i = i^*$. Reporting $c_{i^*}'\leq c_{i^*}$ changes neither $V(i^*)$ nor $OPT'_{-i^*}$, so the condition $OPT'_{-i^*} < C V(i^*)$ still holds and $s_{i^*}(c_{i^*}',c_{-i^*})=1$; the mechanism is thus $\delta$-monotone. To show budget feasibility, suppose first that $OPT'_{-i^*} < C V(i^*)$. Then the mechanism selects $i^*$. Since the bid of $i^*$ does not affect the above condition, the threshold payment of $i^*$ is $B$ and the mechanism is budget feasible. Suppose now that $OPT'_{-i^*} \geq C V(i^*)$. Denote by $S_G$ the set selected by the greedy algorithm, and for $i\in S_G$, denote by $S_i$ the subset of the solution set that was selected by the greedy algorithm just prior to the addition of $i$---both sets determined for the present cost vector $c$. Following the analysis of \citeN{chen}, for any submodular function $V$ and for all $i\in S_G$: \begin{equation}\label{eq:budget} \text{if}~c_i'\geq \frac{V(S_i\cup\{i\}) - V(S_i)}{V(S_G)} B~\text{then}~s_i(c_i',c_{-i})=0. \end{equation} In other words, if $i$ increases her reported cost to a value higher than $\frac{V(S_i\cup\{i\}) - V(S_i)}{V(S_G)}B$, she will cease to be in the selected set $S_G$. As a result, \eqref{eq:budget} implies that the threshold payment of agent $i$ is bounded by this quantity. Hence, summing over the agents in the order in which the greedy algorithm selects them, the total payment is bounded by the telescoping sum: \begin{displaymath} \sum_{i\in S_G} \frac{V(S_i\cup\{i\}) - V(S_i)}{V(S_G)} B = \frac{V(S_G)-V(\emptyset)}{V(S_G)} B=B.\qed \end{displaymath} \end{proof}
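For reference, a compact sketch of the greedy allocation rule analyzed above (plain Python with illustrative names; \texttt{V} is any monotone submodular set function taking a list of indices, and the threshold-payment computation is omitted):
\begin{verbatim}
def greedy_allocation(V, costs, B):
    """Repeatedly pick the agent with the best marginal-value-per-cost
    ratio, stopping once the proportional-share condition
    c_i <= (B/2) * (V(S+i) - V(S)) / V(S+i) fails."""
    remaining = set(range(len(costs)))
    S = []
    while remaining:
        i = max(remaining, key=lambda j: (V(S + [j]) - V(S)) / costs[j])
        gain = V(S + [i]) - V(S)
        if costs[i] > (B / 2) * gain / V(S + [i]):
            break
        S.append(i)
        remaining.remove(i)
    return S
\end{verbatim}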
The complexity of the mechanism is given by the following lemma. \begin{lemma}[Complexity]\label{lemma:complexity} For any $\varepsilon > 0$ and any $\delta>0$, the complexity of the mechanism is $O\big(\mathrm{poly}(n, d, \log\log\frac{B}{b\varepsilon\delta})\big)$. \end{lemma} \begin{proof} The value function $V$ in \eqref{modified} can be computed in time $O(\mathrm{poly}(n, d))$ and the mechanism only involves a linear number of queries to the function $V$. By Proposition~\ref{prop:monotonicity}, line 3 of Algorithm~\ref{mechanism} can be computed in time $O(\mathrm{poly}(n, d, \log\log \frac{B}{b\varepsilon\delta}))$. Hence the allocation function's complexity is as stated.
%Payments can be easily computed in time $O(\mathrm{poly}(n, d))$ as in prior work.
\junk{ Using Singer's characterization of the threshold payments \cite{singer-mechanisms}, one can verify that they can be computed in time $O(\mathrm{poly}(n, d))$. } \end{proof}
Finally, we prove the approximation ratio of the mechanism. We use the following lemma from \cite{chen}, which bounds $OPT$ in terms of the value of $S_G$, as computed in Algorithm \ref{mechanism}, and of $i^*$, the element of maximum value. \begin{lemma}[\cite{chen}]\label{lemma:greedy-bound} Let $S_G$ be the set computed in Algorithm \ref{mechanism} and let $i^*=\argmax_{i\in\mathcal{N}} V(\{i\})$. We have: \begin{displaymath} OPT \leq \frac{e}{e-1}\big( 3 V(S_G) + 2 V(i^*)\big). \end{displaymath} \end{lemma} Using Proposition~\ref{prop:relaxation} and Lemma~\ref{lemma:greedy-bound}, we can complete the proof of Theorem~\ref{thm:main} by showing that, for any $\varepsilon > 0$, if $OPT'_{-i^*}$, the approximate optimal value of $L$ when $i^*$ is excluded from $\mathcal{N}$, has been computed to a precision $\varepsilon$, then the set $S^*$ allocated by the mechanism is such that: \begin{equation} \label{approxbound} OPT \leq \frac{10e\!-\!3 + \sqrt{64e^2\!-\!24e\!+\!9}}{2(e\!-\!1)} V(S^*)\! + \! \varepsilon . \end{equation} To see this, let $L^*_{-i^*}$ be the true maximum value of $L$ subject to $\lambda_{i^*}=0$ and $\sum_{i\in \mathcal{N}\setminus\{i^*\}}c_i\lambda_i\leq B$. From line 3 of Algorithm~\ref{mechanism}, we have $L^*_{-i^*}-\varepsilon\leq OPT'_{-i^*} \leq L^*_{-i^*}+\varepsilon$. If the condition on line 4 of the algorithm holds, then \begin{displaymath} V(i^*) \geq \frac{1}{C}L^*_{-i^*}-\frac{\varepsilon}{C} \geq \frac{1}{C}OPT_{-i^*} -\frac{\varepsilon}{C}. \end{displaymath} Indeed, $L^*_{-i^*}\geq OPT_{-i^*}$ as $L$ is a fractional relaxation of $V$. Also, $OPT \leq OPT_{-i^*} + V(i^*)$, hence, \begin{equation}\label{eq:bound1} OPT\leq (1+C)V(i^*) + \varepsilon.
\end{equation} If the condition does not hold, by observing that $L^*_{-i^*}\leq L^*_c$ and applying Proposition~\ref{prop:relaxation}, we get \begin{displaymath} V(i^*)\leq \frac{1}{C}L^*_{-i^*} + \frac{\varepsilon}{C} \leq \frac{1}{C} \big(2 OPT + 2 V(i^*)\big) + \frac{\varepsilon}{C}. \end{displaymath} Applying Lemma~\ref{lemma:greedy-bound}, \begin{displaymath} V(i^*) \leq \frac{1}{C}\left(\frac{2e}{e-1}\big(3 V(S_G) + 2 V(i^*)\big) + 2 V(i^*)\right) + \frac{\varepsilon}{C}. \end{displaymath} Thus, if $C$ is such that $C(e-1) -6e +2 > 0$, \begin{align*} V(i^*) \leq \frac{6e}{C(e-1)- 6e + 2} V(S_G) + \frac{(e-1)\varepsilon}{C(e-1)- 6e + 2}. \end{align*} Finally, using Lemma~\ref{lemma:greedy-bound} again, we get \begin{equation}\label{eq:bound2} OPT \leq \frac{3e}{e-1}\left( 1 + \frac{4e}{C (e-1) -6e +2}\right) V(S_G) + \frac{2e\varepsilon}{C(e-1)- 6e + 2}. \end{equation} To minimize the coefficients of $V(i^*)$ and $V(S_G)$ in \eqref{eq:bound1} and \eqref{eq:bound2} respectively, we wish to choose $C$ minimizing \begin{displaymath} \max\left(1+C,\frac{3e}{e-1}\left( 1 + \frac{4e}{C (e-1) -6e +2} \right)\right). \end{displaymath} Equating the two terms yields a quadratic equation in $C$ with two roots, only one of which satisfies $C(e-1) -6e +2 > 0$, namely \begin{equation}\label{eq:constant} C = \frac{8e-1 + \sqrt{64e^2-24e + 9}}{2(e-1)}. \end{equation} For this choice of $C$, $\frac{2e\varepsilon}{C(e-1)- 6e + 2}\leq \varepsilon.$ Substituting the expression of $C$ into \eqref{eq:bound1} and \eqref{eq:bound2} gives the approximation ratio in \eqref{approxbound}, and concludes the proof of Theorem~\ref{thm:main}.\hspace*{\stretch{1}}\qed
\section{Proof of Theorem \ref{thm:lowerbound}}
Suppose, for contradiction, that such a mechanism exists. Consider two experiments in dimension $d=2$, with $x_1 = e_1$, $x_2=e_2$ and $c_1=c_2=B/2+\varepsilon$. Then one of the two experiments, say experiment 1, must be in the set selected by the mechanism, otherwise the approximation ratio is unbounded, a contradiction. If agent 1 lowers her reported cost to $B/2-\varepsilon$, by monotonicity she remains in the solution and, by threshold payments, she is paid at least $B/2+\varepsilon$. By budget feasibility and individual rationality, experiment 2 then cannot be included in the solution: hence, the selected set attains a value of $\frac{1}{2}\log 2$, while the optimal value is $\log 2$.\hspace*{\stretch{1}}\qed
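The optimization of the constant $C$ above is elementary but easy to get wrong; the following sketch (assuming \texttt{numpy} and \texttt{scipy} are available) recovers \eqref{eq:constant} numerically and prints the resulting approximation ratio.
\begin{verbatim}
import numpy as np
from scipy.optimize import brentq

e = np.e

def balance(C):
    # Difference of the two terms whose maximum we minimize.
    return (1 + C) - (3 * e / (e - 1)) * (1 + 4 * e / (C * (e - 1) - 6 * e + 2))

# Restrict to C(e-1) - 6e + 2 > 0, i.e. C > (6e - 2)/(e - 1) ~ 8.33:
C = brentq(balance, 9.0, 20.0)
closed_form = (8 * e - 1 + np.sqrt(64 * e**2 - 24 * e + 9)) / (2 * (e - 1))
assert np.isclose(C, closed_form)
print(1 + C)   # approximation ratio of the mechanism, ~12.98
\end{verbatim}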