diff options
Diffstat (limited to 'paper/sections/intro.tex')
| -rw-r--r-- | paper/sections/intro.tex | 8 |
1 files changed, 5 insertions, 3 deletions
diff --git a/paper/sections/intro.tex b/paper/sections/intro.tex index aa65b5d..d2db035 100644 --- a/paper/sections/intro.tex +++ b/paper/sections/intro.tex @@ -1,8 +1,10 @@ -Many real-world phenomena can be modeled as complex diffusion processes on networks: from behavior adoption, sharing of internet memes, citations of articles, and the spread of infectious diseases. Oftentimes, the exact network is unknown to us: we observe only the behavior of the nodes through time, without knowledge of who `influenced' whom. The spread of a particular behavior through a network is known as an {\it Influence Cascade}. In this context, the {\it Graph Inference} problem is to recover the edges of the graph from the observation of few influence cascades. +A recent line of research has focused on applying advances in sparse recovery to graph analysis. A graph can be interpreted as a signal that one seeks to `compress' or `sketch' and then `recovered'. However, we could also consider the situation where the graph is unknown to us, and we dispose of few measurements to recover the signal. Which kind of measurements do we dispose of in practice? -Recent research efforts have focused on constructing an effective algorithm which recovers a large majority of edges correctly from very few cascades. It has shown that the graph inference problem can be solved, under various assumptions, in ${\cal O}(poly(\Delta) \log m)$ number of observed cascades, where $\Delta$ is the maximum degree oand $m$ the number of nodes in the graph. In other words, the dependence of the number of cascades required to reconstruct the graph is (almost miraculously) logarithmic in the number of nodes of the graph. +A diffusion process on a graph, which depends on the edges and edge weights in the graph, provides valuable information. By observing the sequence of nodes which become `infected' over time without knowledge of who has `influenced' whom, can we recover the graph on which this process takes place? The spread of a particular behavior through a network is known as an {\it Influence Cascade}. In this context, the {\it Graph Inference} problem is to recover the edges of the graph from the observation of few influence cascades. We propose to show how sparse recovery can be applied to solve this recently introduced graph inference problem. -Since the first few papers on link prediction in networks, the research community has made good progress in defining the Graph Inference problem more clearly and suggesting effective algorithms. We show that these results can be improved to ${\cal O}(\Delta \log m)$, which is tight to a certain extent. We also that the edge weights themselves can be estimated under the same assumptions. +Recent research efforts in solving the graph inference problem have focused on constructing an effective algorithm which recovers a large majority of edges correctly from very few cascades. It has been shown that the graph inference problem can be solved in ${\cal O}(poly(s) \log m)$ number of observed cascades, where $s$ is the maximum degree and $m$ the number of nodes in the graph. In other words, the dependence of the number of cascades required to reconstruct the graph is (almost miraculously) logarithmic in the number of nodes of the graph. However, results in the sparse recovery literature lead us to believe that ${\cal O}(s \log m/s$ should be sufficient to recover the graph. In fact, we prove this lower bound in a very general sense. We also suggest a ${\cal O}(\Delta \log m)$-algorithm, which is almost tight. We also that the edge weights themselves can be estimated under the same assumptions. + +Since the first few papers on link prediction in networks, the research community has made good progress in defining the Graph Inference problem more clearly and suggesting effective algorithms. Throughout this paper, we focus on the two main diffusion processes, outlined in the seminal work \cite{Kempe:03}: the independent cascade model (IC) and the linear threshold model. |
