diff options
Diffstat (limited to 'paper/sections/intro.tex')
| -rw-r--r-- | paper/sections/intro.tex | 55 |
1 files changed, 40 insertions, 15 deletions
diff --git a/paper/sections/intro.tex b/paper/sections/intro.tex index f4f2bd5..bbbed11 100644 --- a/paper/sections/intro.tex +++ b/paper/sections/intro.tex @@ -8,11 +8,11 @@ Tackling the graph inference problem means constructing a polynomial-time algori Throughout this paper, we focus on the three main discrete-time diffusion processes: the independent cascade model, the voter model, and the linear threshold model... \end{comment} -Graphs have been extensively studied in their ability to propagate - -Networks and social networks in particular have become an increasingly common -subject of study in the recent years. - +Graphs have been extensively studied for their propagative abilities: +connectivity, routing, gossip algorithms, etc. +%One question is: can we +%understand and predict a diffusion process from the graph? Conversely, can we +%learn a graph from the diffusion process? A diffusion process taking place over a graph provides valuable information about the presence and weights of its edges. \emph{Influence cascades} are a specific type of diffusion processes in which a particular infection behavior @@ -68,19 +68,44 @@ Section~\ref{sec:experiments}. \paragraph{Related Work} -The study of edge prediction in graph has been an active field of research for over a decade \cite{Nowell08} \cite{Leskovec07} \cite{AdarA05}. \cite{GomezRodriguez:2010} introduced the submodular {\scshape netinf} algorithm, which approximates the likelihood of cascades represented as a continuous process. The algorithm was later improved in later work \cite{gomezbalduzzi:2011}, but is not known to have any recovery guarantees beside empirical validation on synthetic networks. \cite{Netrapalli:2012} studied the discrete-time version of the independent cascade model and obtained the first ${\cal O}(s^2 \log m)$ guarantee on general networks. The algorithm is based around the same likelihood function we suggest, without the $\ell1$-norm penalty. However, the analysis depended strongly on a restrictive {\it correlation decay} assumption, which strongly limits the number of new infections at every step. In this restricted setting, they show a complex lower bound of the number of cascades needed for perfect support recovery with constant probability of the order $\Omega(s \log (m/s))$. They also suggest a {\scshape greedy} algorithm, which matches ${\cal O}(s \log m)$ guarantee in the case of \emph{tree} graphs. +The study of edge prediction in graphs has been an active field of research for +over a decade \cite{Nowell08} \cite{Leskovec07} \cite{AdarA05}. +\cite{GomezRodriguez:2010} introduced the {\scshape Netinf} algorithm, which +approximates the likelihood of cascades represented as a continuous process. +The algorithm was improved in later work \cite{gomezbalduzzi:2011}, but is not +known to have any recovery guarantees beside empirical validation on synthetic +networks. \cite{Netrapalli:2012} studied the discrete-time version of the +independent cascade model and obtained the first ${\cal O}(s^2 \log m)$ +recovery guarantee on general networks. The algorithm is based around the same +likelihood function we suggest, without the $\ell_1$-norm penalty. Their +analysis depends on a {\it correlation decay} assumption, which limits the +number of new infections at every step. In this setting, they show a lower +bound of the number of cascades needed for support recovery with constant +probability of the order $\Omega(s \log (m/s))$. They also suggest a {\scshape +Greedy} algorithm, which matches the ${\cal O}(s \log m)$ guarantee in the case +of tree graphs. The work of \cite{Abrahao:13} study the same continuous-model framework as +\cite{GomezRodriguez:2010} and obtain a ${\cal O}(s^9 \log^2 s \log m)$ support +recovery algorithm without the \emph{correlation decay} assumption. + +\begin{comment} +They assume a single-source model, where only one source is selected at random, which is less realistic in practice since `patient 0' is rarely unknown to us. +\end{comment} -The work of \cite{Abrahao:13} study the same continuous-model framework as \cite{GomezRodriguez:2010} and obtain a ${\cal O}(s^9 \log^2 s \log m)$ support recovery algorithm without the \emph{correlation decay} assumption. Their work also studies the information leveraged by different `parts' of the cascade, showing that a surprisingly important amount of information can be gleaned from the first and last `infections' of the cascade. They assume a single-source model, where only one source is selected at random, which is less realistic in practice since `patient 0' is rarely unknown to us. +Closest to this work is a recent paper by \cite{Daneshmand:2014}, wherein they +consider a $\ell_1$-regularized objective function. They adapt standard results +from sparse recovery to obtain a recovery bound of ${\cal O}(s^3 \log m)$ on +their algorithm under an irrepresentability condition. Under stronger +assumptions, they match the \cite{Netrapalli:2012} bound of ${\cal O}(s^2 \log +m)$, by exploiting similar properties of the convex program's KKT conditions. +In contrast, our work studies discrete-time diffusion processes including the +Independent Cascade model under weaker assumptions. Furthermore, we analyze +both the recovery of the graph's edges and the estimation of the model's +parameters and achieve close to optimal bounds. -Closest to this work is a recent paper by \cite{Daneshmand:2014}, they consider -a $\ell_1$-norm penalty to their objective function, adapting standard results -from sparse recovery to obtain a ${\cal O}(s^3 \log m)$ algorithm under an -irrepresentability condition. With stronger assumptions, they match the -\cite{Netrapalli:2012} bound of ${\cal O}(s^2 \log m)$, by exploiting -a similar properties of the objective's KKT conditions. Their work has the merit of studying a generalization of the discrete-time independent cascade model to continuous functions. Similarly to \cite{Abrahao:13}, they place themselves in +\begin{comment} +Their work has the merit of studying a generalization of the discrete-time independent cascade model to continuous functions. Similarly to \cite{Abrahao:13}, they place themselves in the restrictive single-source context. - -TODO: add related works on lower bounds. +\end{comment} \begin{comment} \paragraph{Our contributions} |
