| author | Thibaut Horel <thibaut.horel@gmail.com> | 2015-03-05 12:25:08 -0500 |
|---|---|---|
| committer | Thibaut Horel <thibaut.horel@gmail.com> | 2015-03-05 12:25:08 -0500 |
| commit | 579c0b270aa2330a2fa7f2372c0080d610d9f9e9 (patch) | |
| tree | 75142200f6f1c474c34bbeb119b817c6bf5034ea /paper/sections/appendix.tex | |
| parent | 7858d95caef92777b47cb1051e12d964dd5d5d4e (diff) | |
| download | fast-seeding-579c0b270aa2330a2fa7f2372c0080d610d9f9e9.tar.gz | |
Add appendix labels
Diffstat (limited to 'paper/sections/appendix.tex')
| -rw-r--r-- | paper/sections/appendix.tex | 6 |
1 files changed, 6 insertions, 0 deletions
diff --git a/paper/sections/appendix.tex b/paper/sections/appendix.tex
index aa0ef1d..c49e8cc 100644
--- a/paper/sections/appendix.tex
+++ b/paper/sections/appendix.tex
@@ -1,4 +1,5 @@
\section{Adaptivity proofs}
+\label{sec:ad-proofs}
\begin{proof}[of Proposition~\ref{prop:gap}]
We will first show that the optimal adaptive policy can be interpreted as a non-adaptive policy. Let $S$ be the optimal adaptive
@@ -75,12 +76,14 @@ $(S, \mathbf{q})$ is an $\alpha$-approximate non adaptive solution, hence $F(\ma
\end{proof}
\section{Algorithms Proofs}
+\label{sec:alg-proofs}
We first discuss the NP-hardness of the problem.
\noindent \textbf{\textsc{NP}-Hardness.}
In contrast to standard influence maximization, adaptive seeding is already \textsc{NP}-Hard even for the simplest cases. In the case when $f(S)=|S|$ and all probabilities equal one, the decision problem is whether given a budget $k$ and target value $\ell$ there exists a subset of $X$ of size $k-t$ which yields a solution with expected value of $\ell$ using $t$ nodes in $\mathcal{N}(X)$. This is equivalent to deciding whether there are $k-t$ nodes in $X$ that have $t$ neighbors in $\mathcal{N}(X)$. To see this is \textsc{NP}-hard, consider reducing from \textsc{Set-Cover} where there is one node $i$ for each input set $T_i$, $1\leq i\leq n$, with $\neigh{i}= T_i$ and integers $k,\ell$, and the output is ``yes'' if there is a family of $k$ sets in the input which cover at least $\ell$ elements, and ``no'' otherwise.
\subsection{LP-based approach}
+\label{sec:lp-proofs}
In the LP-based approach we rounded the solution using the pipage rounding method. We discuss this with greater detail here.
\noindent \textbf{Pipage Rounding.}
@@ -89,6 +92,7 @@ We make the following key observation: for any given $\textbf{q}$, one can remov
Our rounding procedure can therefore be described as follows: given a solution $(\boldsymbol\lambda,\textbf{q})$ we remove all nodes $u \in \mathcal{N}(X)$ for which $q_{u}=0$, which leaves us with a fractional solution to a (weighted) version of the \textsc{Max-K-Cover} problem where nodes in $X$ are the sets and the universe is the set of weighted nodes in $\mathcal{N}(X)$ that were not removed. We can therefore apply pipage rounding and lose only a factor of $(1-1/e)$ in quality of the solution.
\subsection{Combinatorial Algorithm}
+\label{sec:comb-proofs}
We include the missing proofs from the combinatorial algorithm section. The scalability and implementation in MapReduce are discussed in this section as well.
\begin{proof}[of Lemma~\ref{lemma:nd}]
@@ -249,11 +253,13 @@ which concludes the proof of the lemma.
\end{proof}
\subsection{Parallelization}
+\label{sec:para}
As discussed in the body of the paper, the algorithm can be parallelized across $k$ different machines, each one computing an approximation for a fixed budget $k-t$ in the first stage and $t$ in the second. A slightly more sophisticated approach is to consider only $\log n$ splits: $(1,k-1),(2,k-2),\ldots,(2^{\lfloor \log n \rfloor},1)$ and then select the best solution from this set. It is not hard to see that in comparison to the previous approach, this would reduce the approximation guarantee by a factor of at most 2: if the optimal solution is obtained by spending $t$ on the first stage and $k-t$ in the second stage, then since $t \leq 2\cdot2^{\lfloor \log t \rfloor}$ the solution computed for $(2^{\lfloor \log t \rfloor}, k - 2^{\lfloor \log t \rfloor})$ will have at least half that value.
More generally, for any $\epsilon>0$ one can parallelize over $\log_{1+\epsilon}n$ machines at the cost of losing a factor of $(1+\epsilon)$ in the approximation guarantee.
\subsection{Implementation in MapReduce}
+\label{sec:mr}
As noted in Section~\ref{sec:comb}, lines 4 to 7 of Algorithm~\ref{alg:comb} correspond to the greedy heuristic of \cite{nemhauser} applied to the
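
The \textsc{NP}-Hardness paragraph in the hunk above phrases the decision problem as a coverage question: is there a family of $k$ nodes of $X$ (input sets $T_i$) whose neighborhoods jointly cover at least $\ell$ elements of $\mathcal{N}(X)$? Purely as an illustration of that decision problem, and not part of the commit or the paper, here is a brute-force, exponential-time checker; the mapping `neigh` and the function name are hypothetical.

```python
from itertools import combinations

def covers_enough(neigh, budget, target):
    """Brute-force (exponential) check: is there a family of `budget` nodes of X
    whose neighborhoods in N(X) jointly contain at least `target` elements?
    Illustration only -- this is the coverage question from the reduction."""
    for family in combinations(neigh, budget):
        covered = set().union(*(neigh[i] for i in family))
        if len(covered) >= target:
            return True
    return False

# Toy instance: node i stands for the input set T_i, so neigh[i] = T_i.
neigh = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"c", "d", "e"}}
print(covers_enough(neigh, budget=2, target=4))  # True: T_1 and T_3 cover 5 elements
```

The enumeration over `combinations` is exactly the family-of-$k$-sets question used in the \textsc{Set-Cover} reduction, which is why only brute force is offered here.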
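
For the Parallelization subsection above, a minimal sketch (not from the commit or the paper, and possibly differing from the paper's exact enumeration of splits) of how the budget splits could be generated: with `eps = 1` the first-stage budgets are the powers of two behind the factor-2 argument, and a general `eps > 0` gives the $(1+\epsilon)$-factor variant. `solve_for_split` is a hypothetical per-split solver returning a `(value, solution)` pair; in a parallel deployment each call would run on its own machine.

```python
def split_budgets(k, eps=1.0):
    """First-stage budgets 1, (1+eps), (1+eps)^2, ... (rounded down,
    de-duplicated, eps > 0), each paired with the leftover second-stage budget.
    eps = 1.0 gives the power-of-two splits (1, k-1), (2, k-2), (4, k-4), ..."""
    budgets, b = [], 1.0
    while int(b) < k:
        t = int(b)
        if not budgets or t != budgets[-1]:
            budgets.append(t)
        b *= 1.0 + eps
    return [(t, k - t) for t in budgets]

def best_over_splits(k, solve_for_split, eps=1.0):
    """Call the per-split solver on every split (one machine per split in a
    parallel run) and keep the highest-value (value, solution) pair."""
    return max(
        (solve_for_split(t, rest) for t, rest in split_budgets(k, eps)),
        key=lambda pair: pair[0],
    )

# With k = 40 and eps = 1 the first-stage budgets are 1, 2, 4, 8, 16, 32:
print(split_budgets(40))  # [(1, 39), (2, 38), (4, 36), (8, 32), (16, 24), (32, 8)]
```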
