aboutsummaryrefslogtreecommitdiffstats
path: root/paper/sections/intro.tex
diff options
context:
space:
mode:
Diffstat (limited to 'paper/sections/intro.tex')
-rw-r--r--paper/sections/intro.tex55
1 files changed, 40 insertions, 15 deletions
diff --git a/paper/sections/intro.tex b/paper/sections/intro.tex
index f4f2bd5..bbbed11 100644
--- a/paper/sections/intro.tex
+++ b/paper/sections/intro.tex
@@ -8,11 +8,11 @@ Tackling the graph inference problem means constructing a polynomial-time algori
Throughout this paper, we focus on the three main discrete-time diffusion processes: the independent cascade model, the voter model, and the linear threshold model...
\end{comment}
-Graphs have been extensively studied in their ability to propagate
-
-Networks and social networks in particular have become an increasingly common
-subject of study in the recent years.
-
+Graphs have been extensively studied for their propagative abilities:
+connectivity, routing, gossip algorithms, etc.
+%One question is: can we
+%understand and predict a diffusion process from the graph? Conversely, can we
+%learn a graph from the diffusion process?
A diffusion process taking place over a graph provides valuable information
about the presence and weights of its edges. \emph{Influence cascades} are a
specific type of diffusion processes in which a particular infection behavior
@@ -68,19 +68,44 @@ Section~\ref{sec:experiments}.
\paragraph{Related Work}
-The study of edge prediction in graph has been an active field of research for over a decade \cite{Nowell08} \cite{Leskovec07} \cite{AdarA05}. \cite{GomezRodriguez:2010} introduced the submodular {\scshape netinf} algorithm, which approximates the likelihood of cascades represented as a continuous process. The algorithm was later improved in later work \cite{gomezbalduzzi:2011}, but is not known to have any recovery guarantees beside empirical validation on synthetic networks. \cite{Netrapalli:2012} studied the discrete-time version of the independent cascade model and obtained the first ${\cal O}(s^2 \log m)$ guarantee on general networks. The algorithm is based around the same likelihood function we suggest, without the $\ell1$-norm penalty. However, the analysis depended strongly on a restrictive {\it correlation decay} assumption, which strongly limits the number of new infections at every step. In this restricted setting, they show a complex lower bound of the number of cascades needed for perfect support recovery with constant probability of the order $\Omega(s \log (m/s))$. They also suggest a {\scshape greedy} algorithm, which matches ${\cal O}(s \log m)$ guarantee in the case of \emph{tree} graphs.
+The study of edge prediction in graphs has been an active field of research for
+over a decade \cite{Nowell08} \cite{Leskovec07} \cite{AdarA05}.
+\cite{GomezRodriguez:2010} introduced the {\scshape Netinf} algorithm, which
+approximates the likelihood of cascades represented as a continuous process.
+The algorithm was improved in later work \cite{gomezbalduzzi:2011}, but is not
+known to have any recovery guarantees beside empirical validation on synthetic
+networks. \cite{Netrapalli:2012} studied the discrete-time version of the
+independent cascade model and obtained the first ${\cal O}(s^2 \log m)$
+recovery guarantee on general networks. The algorithm is based around the same
+likelihood function we suggest, without the $\ell_1$-norm penalty. Their
+analysis depends on a {\it correlation decay} assumption, which limits the
+number of new infections at every step. In this setting, they show a lower
+bound of the number of cascades needed for support recovery with constant
+probability of the order $\Omega(s \log (m/s))$. They also suggest a {\scshape
+Greedy} algorithm, which matches the ${\cal O}(s \log m)$ guarantee in the case
+of tree graphs. The work of \cite{Abrahao:13} study the same continuous-model framework as
+\cite{GomezRodriguez:2010} and obtain a ${\cal O}(s^9 \log^2 s \log m)$ support
+recovery algorithm without the \emph{correlation decay} assumption.
+
+\begin{comment}
+They assume a single-source model, where only one source is selected at random, which is less realistic in practice since `patient 0' is rarely unknown to us.
+\end{comment}
-The work of \cite{Abrahao:13} study the same continuous-model framework as \cite{GomezRodriguez:2010} and obtain a ${\cal O}(s^9 \log^2 s \log m)$ support recovery algorithm without the \emph{correlation decay} assumption. Their work also studies the information leveraged by different `parts' of the cascade, showing that a surprisingly important amount of information can be gleaned from the first and last `infections' of the cascade. They assume a single-source model, where only one source is selected at random, which is less realistic in practice since `patient 0' is rarely unknown to us.
+Closest to this work is a recent paper by \cite{Daneshmand:2014}, wherein they
+consider a $\ell_1$-regularized objective function. They adapt standard results
+from sparse recovery to obtain a recovery bound of ${\cal O}(s^3 \log m)$ on
+their algorithm under an irrepresentability condition. Under stronger
+assumptions, they match the \cite{Netrapalli:2012} bound of ${\cal O}(s^2 \log
+m)$, by exploiting similar properties of the convex program's KKT conditions.
+In contrast, our work studies discrete-time diffusion processes including the
+Independent Cascade model under weaker assumptions. Furthermore, we analyze
+both the recovery of the graph's edges and the estimation of the model's
+parameters and achieve close to optimal bounds.
-Closest to this work is a recent paper by \cite{Daneshmand:2014}, they consider
-a $\ell_1$-norm penalty to their objective function, adapting standard results
-from sparse recovery to obtain a ${\cal O}(s^3 \log m)$ algorithm under an
-irrepresentability condition. With stronger assumptions, they match the
-\cite{Netrapalli:2012} bound of ${\cal O}(s^2 \log m)$, by exploiting
-a similar properties of the objective's KKT conditions. Their work has the merit of studying a generalization of the discrete-time independent cascade model to continuous functions. Similarly to \cite{Abrahao:13}, they place themselves in
+\begin{comment}
+Their work has the merit of studying a generalization of the discrete-time independent cascade model to continuous functions. Similarly to \cite{Abrahao:13}, they place themselves in
the restrictive single-source context.
-
-TODO: add related works on lower bounds.
+\end{comment}
\begin{comment}
\paragraph{Our contributions}