Poster abstract: algorithms and experiments

author: Thibaut Horel <thibaut.horel@gmail.com> 2015-03-15 23:57:58 -0400
committer: Thibaut Horel <thibaut.horel@gmail.com> 2015-03-15 23:57:58 -0400
commit: 6ea47e2ef70f627f6ffa2dec997b1634e9c17b34 (patch)
tree: cbd0122cf6d0d51de4380a5fd8e6ea4ff23fea32
parent: 2b0c6dee9da471600a7b7127d9f55c7635458a99 (diff)
download: fast-seeding-6ea47e2ef70f627f6ffa2dec997b1634e9c17b34.tar.gz
1 files changed, 102 insertions, 15 deletions
diff --git a/poster_abstract/main.tex b/poster_abstract/main.tex
index e5688ff..7ed5caa 100644
--- a/poster_abstract/main.tex
+++ b/poster_abstract/main.tex
@@ -2,6 +2,10 @@
 \pdfpagewidth=8.5in
 \pdfpageheight=11in
 \usepackage[utf8x]{inputenc}
+\usepackage[english]{babel}
+\usepackage{microtype}
+\usepackage{caption}
+\usepackage{subcaption}
 
 \usepackage{amsmath, amsfonts, amssymb, bbm}
 \usepackage{verbatim}
@@ -72,7 +76,7 @@ Despite the various complexities in such optimization problems, we show that
 scalable adaptive seeding is achievable. To show the effectiveness of our
 methods we collected data from various verticals social network users follow,
 and applied our methods on it.  Our experiments show that adaptive seeding is
-scalable, and importantly, that it obtains dramatic improvements over standard
+scalable, and that it obtains dramatic improvements over standard
 approaches of information dissemination.
 \end{abstract}
 
@@ -82,7 +86,7 @@ Algorithms and Problems}
 
 \section{Introduction}
 
-Influence Maximization~\cite{KKT03, DR01} is the algorithmic challenge of
+Influence Maximization~\cite{DR01} is the algorithmic challenge of
 selecting a fixed number of individuals who can serve as early adopters of
 a new idea, product, or technology in a manner that will trigger a large
 cascade in the social network.  In many cases where influence maximization
@@ -100,8 +104,8 @@ core set is ineffective.
     \centering
     \includegraphics[scale=0.55]{images/dist.pdf}
     \vspace{-20pt}
-    \caption{CDF of the degree distribution of users who liked a post by Kiva
-    on Facebook and that of their friends.}
+    \caption{\small{CDF of the degree distribution of users who liked a post by Kiva
+    on Facebook and that of their friends.}}
     \label{fig:para}
     \vspace{-15pt}
 \end{figure}
@@ -114,8 +118,8 @@ budget on the influential friends who hopefully have arrived.  The idea behind
 this approach is to leverage a structural phenomenon in social networks known
 as the friendship paradox~\cite{feld1991}: even though individuals are not
 likely to have many friends, they likely have a friend that does (``your
-friends have more friends than you''). Figure~\ref{fig:para} gives an
-example of such an effect on Facebook.
+friends have more friends than you''). Figure~\ref{fig:para} illustrates this
+effect on Facebook.
 
 In this work, we present efficient algorithms for adaptive seeding achieving an
 optimal approximation ratio of $(1-1/e)$.  The guarantees of our algorithms
@@ -135,15 +139,15 @@ networks.
 
 \section{Framework}
 
-\noindent\textbf{Model.} Given a graph $G=(V,E)$, for $S\subseteq  V$ we denote by $\neigh{S}$ the
-neighborhood of $S$. The notion of influence in the graph is captured by
-a function $f:2^{|V|}\rightarrow \reals_+$ mapping a subset of nodes to
-a non-negative influence value. In this work, we focus on linear influence
-functions: $f(S) = \sum_{u\in S} w_u$ where $(w_u)_{u\in V}$ are non-negative
-weights capturing the influence of individual vertices. The input of the
-\emph{adaptive seeding} problem is a \emph{core set} of nodes $X\subseteq V$
-and for any node $u\in\neigh{X}$ a probability $p_u$ that $u$ realizes if one
-of its neighbor in $X$ is seeded. The goal is to solve the following problem:
+\noindent\textbf{Model.} Given a graph $G=(V,E)$, for $S\subseteq  V$ we denote
+by $\neigh{S}$ the neighborhood of $S$. The notion of influence in the graph is
+captured by a function $f:2^{|V|}\rightarrow \reals_+$ mapping a subset of
+nodes to a non-negative influence value. In this work, we focus on linear
+influence functions: $f(S) = \sum_{u\in S} w_u$ where $(w_u)_{u\in V}$ are
+non-negative weights capturing the influence of individual vertices. The input
+of the \emph{adaptive seeding} problem is a \emph{core set} of nodes
+$X\subseteq V$ and for any node $u\in\neigh{X}$ a probability $p_u$ that $u$
+realizes if one of its neighbor in $X$ is seeded. The goal is to solve:
 \begin{equation}\label{eq:problem}
     \begin{split}
         &\max_{S\subseteq X} \sum_{R\subseteq\neigh{S}} p_R
@@ -183,6 +187,89 @@ Non-adaptive policies are related to adaptive policies:
     \eqref{eq:problem}.
 \end{proposition}  
 
+\begin{figure}[t]
+    \centerline{ \includegraphics[width=0.4\textwidth]{images/comp2.pdf} }
+    \vspace{-10pt}
+    \caption{\small{Ratio of the performance of adaptive seeding to \textsf{IM}. Bars
+    represents the mean improvement across all verticals, and the ``error bars''
+represents the range of improvement across verticals.}}
+    \label{fig:compare}
+    \vspace{-15pt}
+\end{figure}
+
+\section{Algorithms}
+
+Proposition~\ref{prop:cr} allows us to focus on designing non-adaptive policies
+for \eqref{eq:relaxed} which is easier to solve than \eqref{eq:problem}.
+
+Our first algorithm is obtained by considering a relaxation of
+\eqref{eq:relaxed} where the binary choices of including vertices in $S$ are
+relaxed to fractional values. The solution must then be rounded using the
+Pipage Rounding framework.
+
+The second algorithm is combinatorial: first, we
+note that for additive influence functions and for fixed $S$, the maximization
+over $\mathbf{q}$ in \eqref{eq:relaxed} is a simple fractional knapsack problem
+which can be solved efficiently. Furthermore, the optimal value of this problem
+is a monotone submodular function of $S$. Our algorithm can thus be obtained by
+applying the celebrated greedy algorithm for monotone submodular maximization
+where we repeatedly solve fractional knapsack problems when greedily
+constructing the solution.
+
+Both algorithms achieve an optimal $(1-1/e)$ approximation ratio. The first
+algorithm is extremely efficient over instances where there is a large budget.
+The second algorithm can be easily parallelized and implemented in MapReduce,
+has good theoretical guarantees on its running time and does well on instances
+with smaller budgets.
+
+\section{Experiments}
+
+The main component of our experiments involved collecting publicly available
+data from Facebook.  Despite the extreme difficulty of collecting such data, we
+were able to collect large networks. For 10 several Facebook Pages, each
+associated with a commercial entity that uses the Facebook page to communicate
+with its followers, we selected a post and then collected data about the users
+who expressed interest (``liked'') the post and their friends.  The advantage
+of this data set is that it is highly representative of the scenario we study
+here. We focused on posts which were liked by about 1,000 users, which when we
+include their friends, leads to networks of about 100,000 users.
+
+
+\begin{figure}[t]
+    \begin{subfigure}[t]{0.23\textwidth}
+    \includegraphics[scale=0.48]{images/prob.pdf}
+    \vspace{-15pt}
+    \caption{}
+    \label{fig:prob}
+\end{subfigure}
+\hspace{1pt}
+\begin{subfigure}[t]{0.23\textwidth}
+    \includegraphics[scale=0.48]{images/hbo_likes.pdf}
+    \vspace{-15pt}
+    \caption{}
+    \label{fig:killer}
+     \end{subfigure}
+     \vspace{-10pt}
+ \caption{\small{(a) Performance of adaptive seeding for various propagation
+     probabilities. (b) Performance of \emph{adaptive seeding} when restricted
+ to the subgraph of users who \emph{liked} HBO (red line).}}
+ \vspace{-15pt}
+\end{figure}
+
+Figure~\ref{fig:compare} compares the performance of our approach to running
+the standard influence maximization (IM) approach to the core set.
+Figure~\ref{fig:prob} shows the impact of the probability of neighbors
+realizing, while Figure~\ref{fig:killer} shows the performance of adaptive
+seeding when restricted to users who previously expressed interest in the
+vertical and for which we could expect the probability of realizing to be close
+to one.  These suggest that adaptive seeding can have dramatic improvements
+over standard IM. \cite{full} contains additional experiments to analyze the
+impact of various parameters as well as evaluations on synthetic data.\newline
+
+\noindent\textbf{Acknowledgement.}
+This research is supported in part by a Google Research Grant and NSF grant
+CCF-1301976.
+
 \bibliographystyle{abbrv}
 \bibliography{main}
author	Thibaut Horel <thibaut.horel@gmail.com>	2015-03-15 23:57:58 -0400
committer	Thibaut Horel <thibaut.horel@gmail.com>	2015-03-15 23:57:58 -0400
commit	6ea47e2ef70f627f6ffa2dec997b1634e9c17b34 (patch)
tree	cbd0122cf6d0d51de4380a5fd8e6ea4ff23fea32
parent	2b0c6dee9da471600a7b7127d9f55c7635458a99 (diff)
download	fast-seeding-6ea47e2ef70f627f6ffa2dec997b1634e9c17b34.tar.gz