| author | Thibaut Horel <thibaut.horel@gmail.com> | 2014-11-22 21:08:41 -0500 |
|---|---|---|
| committer | Thibaut Horel <thibaut.horel@gmail.com> | 2014-11-22 21:08:41 -0500 |
| commit | 36eb1fee5492e57368846cbf4e107f1e4cb31589 (patch) | |
| tree | 6380028284779e10d01fb9ff51f3c561ae9ce57c /paper/sections/adaptivity.tex | |
| parent | 4f7d4804234f5515a4dded8b05d9568653b7ae3c (diff) | |
| download | fast-seeding-36eb1fee5492e57368846cbf4e107f1e4cb31589.tar.gz | |
WWW version
Diffstat (limited to 'paper/sections/adaptivity.tex')
| -rw-r--r-- | paper/sections/adaptivity.tex | 221 |
1 file changed, 221 insertions, 0 deletions
diff --git a/paper/sections/adaptivity.tex b/paper/sections/adaptivity.tex
new file mode 100644
index 0000000..d590408
--- /dev/null
+++ b/paper/sections/adaptivity.tex
@@ -0,0 +1,221 @@
The challenging aspect of the adaptive seeding problem expressed in
Equation~\ref{eq:problem} is its adaptivity: a seed set must be selected during
the first stage such that, in expectation, a high influence value can be
reached when adaptively selecting nodes in the second stage. A standard
approach in stochastic optimization for overcoming this challenge is to use
sampling to estimate the expected influence value reachable in the second
stage. However, as will be discussed in Section~\ref{sec:experiments}, this
approach quickly becomes infeasible even on graphs of modest size.

In this section we develop an approach which avoids sampling and allows
designing adaptive seeding algorithms that can be applied to large graphs. We
show that for additive influence functions one can instead optimize
a relaxation of the problem which we refer to as the \emph{non-adaptive}
version of the problem. After defining the non-adaptive version, we show in
Section~\ref{sec:gap} that the optimal value of the non-adaptive version is an
upper bound on the optimal value of the adaptive seeding problem. We then
argue in Section~\ref{sec:round} that any solution to the non-adaptive version
of the problem can be converted into an adaptive solution while losing an
arbitrarily small factor in the approximation ratio. Together, this implies
that one can design algorithms for the non-adaptive problem instead, as we do
in Section~\ref{sec:algorithms}.\newline

\noindent \textbf{Non-adaptive policies.} We say that a policy is
\emph{non-adaptive} if it selects a set of nodes $S \subseteq X$ to be seeded
in the first stage and a vector of probabilities $\mathbf{q}\in[0,1]^n$, such
that each neighbor $u$ of $S$ that realizes is included in the solution
independently with probability $q_u$. The budget constraint is now only
required to hold in expectation, \emph{i.e.}
$|S| + \mathbf{p}^T\mathbf{q} \leq k$. Formally, the optimization problem for
non-adaptive policies can be written as:
\begin{equation}\label{eq:relaxed}
  \begin{split}
    \max_{\substack{S\subseteq X\\\mathbf{q}\in[0,1]^n}} & \;
    \sum_{R\subseteq\neigh{X}} \Big(\prod_{u\in R} p_uq_u
    \prod_{u\in\neigh{X}\setminus R}(1-p_uq_u) \Big) f(R)\\
    \text{s.t. } & \; |S|+\mathbf{p}^T\mathbf{q} \leq k,\;
    q_u \leq \mathbf{1}\{u\in\neigh{S}\}
  \end{split}
\end{equation}
where we denote by $\mathbf{1}\{E\}$ the indicator variable of the event $E$.
Note that because of the condition $q_u\leq \mathbf{1}\{u\in\neigh{S}\}$, the
summand associated with $R$ in \eqref{eq:relaxed} vanishes whenever $R$
contains a node $u\in\neigh{X}\setminus\neigh{S}$. Hence, the summation is
effectively restricted to $R\subseteq\neigh{S}$, as in \eqref{eq:problem}.
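For additive influence functions such as \eqref{eq:voter}, the objective of
\eqref{eq:relaxed} admits a closed form by linearity of expectation: if
$f(R)=\sum_{u\in R} w_u$, the expected value equals $\sum_u p_uq_uw_u$, so the
exponential-size sum never has to be evaluated explicitly. The following
minimal sketch (not from the paper; the weights and probabilities are
hypothetical toy values) checks this closed form against brute-force
enumeration over realizations:

```python
from itertools import chain, combinations

# Toy instance: three neighbors of S with realization probabilities p,
# seeding probabilities q, and additive weights w (all hypothetical).
p = [0.5, 0.8, 0.3]
q = [1.0, 0.5, 0.0]
w = [2.0, 1.0, 4.0]
n = len(p)

def f(R):
    """Additive influence function: f(R) = sum of node weights."""
    return sum(w[u] for u in R)

def objective_bruteforce():
    """Sum over all realizations R, each weighted by
    prod_{u in R} p_u*q_u * prod_{u not in R} (1 - p_u*q_u)."""
    total = 0.0
    subsets = chain.from_iterable(combinations(range(n), r) for r in range(n + 1))
    for R in subsets:
        prob = 1.0
        for u in range(n):
            prob *= p[u] * q[u] if u in R else 1.0 - p[u] * q[u]
        total += prob * f(R)
    return total

def objective_closed_form():
    """By linearity: F(p (x) q) = sum_u p_u * q_u * w_u."""
    return sum(p[u] * q[u] * w[u] for u in range(n))

print(objective_bruteforce())   # approx. 1.4
print(objective_closed_form())  # 1.4
```

This collapse to marginals is precisely what makes the additive case
tractable without sampling.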
\subsection{Adaptivity Gap}\label{sec:gap}

We now justify the use of non-adaptive policies by showing that the optimal
non-adaptive policy yields a value at least as high as that of the optimal
adaptive policy. For brevity, given a probability vector $\pi\in[0,1]^n$ we
write:
\begin{equation}\label{eq:multi}
  F(\pi) \defeq
  \sum_{R\subseteq\neigh{X}}\left(\prod_{u\in R}\pi_u
  \prod_{u\in\neigh{X}\setminus R}(1-\pi_u)\right) f(R)
\end{equation}
and use $\mathbf{p}\otimes\mathbf{q}$ to denote the component-wise
multiplication of the vectors $\mathbf{p}$ and $\mathbf{q}$. Finally, we write
$\mathcal{F}_{A} \defeq \{S \subseteq X : |S|\leq k\}$ and $\mathcal{F}_{NA}
\defeq \{(S,\mathbf{q}) : |S|+\mathbf{p}^T\mathbf{q} \leq k,\; q_u \leq
\mathbf{1}\{u\in\neigh{S}\}\}$ to denote the feasible regions of the adaptive
and non-adaptive problems, respectively.

\begin{proposition}\label{prop:gap}
For additive functions given by \eqref{eq:voter}, the value of the optimal
adaptive policy is upper bounded by the value of the optimal non-adaptive
policy:
  \begin{displaymath}
    \begin{aligned}[t]
      &\max_{S\subseteq X} \sum_{R\subseteq\neigh{S}} p_R
      \max_{\substack{T\subseteq R\\|T|\leq k-|S|}}f(T)\\
      &\text{s.t. } S \in \mathcal{F}_{A}
    \end{aligned}
    \leq
    \begin{aligned}[t]
      &\max_{\substack{S\subseteq X\\\mathbf{q}\in[0,1]^n}}
      F(\mathbf{p}\otimes\mathbf{q})\\
      &\text{s.t. } (S,\mathbf{q}) \in \mathcal{F}_{NA}
    \end{aligned}
  \end{displaymath}
\end{proposition}

The proof of this proposition can be found in the full version of the
paper~\cite{full} and relies on the following fact: the optimal adaptive
policy can be written as a feasible non-adaptive policy, hence it provides
a lower bound on the value of the optimal non-adaptive policy.
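To give some intuition for this fact (the notation $T_R$ below is ours, not
the paper's: it denotes the set selected by the optimal adaptive policy
seeding $S$ on realization $R$), the encoding can be sketched as follows. Set
\begin{displaymath}
  q_u \defeq \Pr[u \in T_R \mid u \in R]
  \quad\text{so that}\quad
  \mathbf{p}^T\mathbf{q} = \sum_{u\in\neigh{S}} p_u q_u
  = \mathbb{E}_R\big[|T_R|\big] \leq k - |S|,
\end{displaymath}
which makes $(S,\mathbf{q})$ feasible for \eqref{eq:relaxed}. For additive
$f$, the objective depends only on the marginals $p_uq_u$, so
$F(\mathbf{p}\otimes\mathbf{q}) = \mathbb{E}_R\big[f(T_R)\big]$: the
non-adaptive value of $(S,\mathbf{q})$ matches the adaptive value of $S$.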
\subsection{From Non-Adaptive to Adaptive Solutions}\label{sec:round}

From the above proposition, we know that optimal non-adaptive solutions attain
higher values than adaptive ones. Given a non-adaptive solution
$(S,\mathbf{q})$, a natural scheme would be to use $S$ as an adaptive
solution. But since $(S, \mathbf{q})$ is a solution to the non-adaptive
problem, Proposition~\ref{prop:gap} does not provide any guarantee on how well
$S$ performs as an adaptive solution.

However, we show that from a non-adaptive solution $(S,\mathbf{q})$, we can
obtain a lower bound on the adaptive value of $S$, that is, the influence
attainable in expectation over all possible arrivals of neighbors of $S$.
Starting from $S$, in every realization of neighbors $R$, sample each node
$u \in R \cap \neigh{S}$ with probability $q_{u}$ to obtain a random set of
nodes $I_R \subseteq R \cap \neigh{S}$. Since $(S, \mathbf{q})$ is
a non-adaptive solution, selecting $I_R$ might exceed our budget: the only
guarantee we have is that $|S| + \mathbb{E}\big[|I_R|\big] \leq k$. As
a consequence, an adaptive solution starting from $S$ might not be able to
select $I_R$ in the second stage.
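To see concretely that a budget respected only in expectation still permits
non-negligible overflow, here is a small simulation sketch (a toy instance
with hypothetical numbers, not from the paper) estimating the probability
that $|I_R|$ exceeds the remaining second-stage budget $k - |S|$:

```python
import random

random.seed(0)

# Toy instance: remaining second-stage budget k - |S|, and for each
# neighbor u of S its realization probability p_u and seeding
# probability q_u (hypothetical values satisfying |S| + p^T q <= k).
remaining_budget = 2
p = [0.9, 0.9, 0.9, 0.9]
q = [0.5, 0.5, 0.5, 0.5]
assert sum(pu * qu for pu, qu in zip(p, q)) <= remaining_budget

trials = 100_000
overflow = 0
for _ in range(trials):
    # Neighbors realize independently; each realized neighbor is then
    # sampled into I_R independently with probability q_u.
    I_R = [u for u in range(len(p))
           if random.random() < p[u] and random.random() < q[u]]
    if len(I_R) > remaining_budget:
        overflow += 1

# E[|I_R|] = 1.8 <= 2, yet |I_R| ~ Binomial(4, 0.45) exceeds the
# budget with probability about 0.24.
print(overflow / trials)
```

This is exactly the gap that the contention resolution schemes discussed next
are designed to close.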
Fortunately, the probability of exceeding the budget is small: with high
probability, $I_R$ is feasible. This is exploited in \cite{vondrak} to design
randomized rounding methods with approximation guarantees, called
\emph{contention resolution schemes}. Theorem~1.3 of \cite{vondrak} gives
a contention resolution scheme which, from $\mathbf{q}$ and any realization
$R$, computes a \emph{feasible} set $\tilde{I}_R$ such that:
\begin{equation}
  \label{eq:cr}
  \mathbb{E}_R\big[f(\tilde{I}_R)\big] \geq (1-\varepsilon)\,
  F(\mathbf{p}\otimes\mathbf{q})
\end{equation}
In other words, starting from a non-adaptive solution $(S,\mathbf{q})$, there
is a way to construct a random \emph{feasible} subset in the second stage
which, in expectation, attains almost the same influence value as the
non-adaptive solution. Since the adaptive solution starting from $S$ selects
optimally from the realizations $R\subseteq\neigh{S}$,
$\mathbb{E}_R[f(\tilde{I}_R)]$ provides a lower bound on the adaptive value of
$S$, which we denote by $A(S)$.

More precisely, denoting by $\mathrm{OPT}_A$ the optimal value of the adaptive
problem~\eqref{eq:problem}, we have the following proposition, whose proof can
be found in the full version of this paper~\cite{full}.
\begin{proposition}\label{prop:cr}
  Let $(S,\mathbf{q})$ be an $\alpha$-approximate solution to the non-adaptive
  problem \eqref{eq:relaxed}. Then $A(S) \geq
  (1-\varepsilon)\,\alpha\,\mathrm{OPT}_A$.
\end{proposition}
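For intuition, the proposition follows by chaining the bounds established in
this section (details are deferred to~\cite{full}; the notation
$\mathrm{OPT}_{NA}$ for the optimal value of \eqref{eq:relaxed} is ours):
\begin{displaymath}
  A(S) \;\geq\; \mathbb{E}_R\big[f(\tilde{I}_R)\big]
  \;\overset{\eqref{eq:cr}}{\geq}\; (1-\varepsilon)\,
  F(\mathbf{p}\otimes\mathbf{q})
  \;\geq\; (1-\varepsilon)\,\alpha\,\mathrm{OPT}_{NA}
  \;\overset{\text{Prop.~\ref{prop:gap}}}{\geq}\;
  (1-\varepsilon)\,\alpha\,\mathrm{OPT}_A.
\end{displaymath}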
