| author | jeanpouget-abadie <jean.pougetabadie@gmail.com> | 2015-01-31 12:08:02 -0500 |
|---|---|---|
| committer | jeanpouget-abadie <jean.pougetabadie@gmail.com> | 2015-01-31 12:08:02 -0500 |
| commit | 6e8c086ad5e54b68eb5653f1c9d02682b7813aa1 (patch) | |
| tree | 3bfbbec4a1a4d9f77da1eb572557a58311b380f7 | |
| parent | 2ea212ea11dd8cde1c6fc380c1ba136276a22a43 (diff) | |
| download | cascades-6e8c086ad5e54b68eb5653f1c9d02682b7813aa1.tar.gz | |
changing structure of model section
| -rw-r--r-- | paper/sections/assumptions.tex | 19 |
| -rw-r--r-- | paper/sections/intro.tex | 2 |
| -rw-r--r-- | paper/sections/model.tex | 54 |
3 files changed, 38 insertions, 37 deletions
diff --git a/paper/sections/assumptions.tex b/paper/sections/assumptions.tex
index cff1051..58ea977 100644
--- a/paper/sections/assumptions.tex
+++ b/paper/sections/assumptions.tex
@@ -1,4 +1,4 @@
-In this section, we compare the main assumption of \cite{Daneshmand:2014}, commonly known as the {\it irrepresentability condition}, to the restricted eigenvalue condition. We argue that the restricted eigenvalue is weaker and more likely to hold in practical situations.
+In this section, we compare the main assumption of \cite{Daneshmand:2014}, commonly known as the {\it irrepresentability condition}, to the restricted eigenvalue condition. We argue that the restricted eigenvalue condition is more likely to hold in practical situations.
 
 \subsection{The Irrepresentability Condition}
 
@@ -14,11 +14,16 @@ The {\it (S,s)-irrepresentability} condition is defined as:
 \end{equation}
 \end{definition}
 
-If our objective is to recover the support of the graph exactly, the irrepresentability condition has been shown to be essentially necessary \cite{Zhao:2006}. However, several recent papers \cite{vandegeer:2011}, \cite{Zou:2006}, argue this condition is unrealistic in situations where there is a correlation between variables.
-
-Consider for example the following matrix:
-
-Yet, assuming we only wish to recover all edges above a certain threshold, bounding the $\ell2$-error allows us to recover all edges with weights above a certain minimum threshold under an intuitively weaker {\bf(RE)} condition. In practical scenarios, such as in social networks, where one seeks to recover significant edges, this is a reasonable assumption.
+If our objective is to recover the support of the graph exactly, the irrepresentability condition has been shown to be essentially necessary \cite{Zhao:2006}. However, several recent papers \cite{vandegeer:2011}, \cite{Zou:2006} argue that this condition is unrealistic in situations where the variables are correlated. Consider the following simplified example from \cite{vandegeer:2011}:
+\begin{equation}
+\left(
+\begin{array}{cc}
+I_s & \rho J \\
+\rho J & I_s \\
+\end{array}
+\right)
+\end{equation}
+where $I_s$ is the $s \times s$ identity matrix, $J$ is the all-ones matrix and $\rho \in \mathbb{R}^+$. It is easy to see that $\nu_{\text{irrepresentable}}(S) = \rho s$ and $\lambda_{\min}(Q) \geq 1 - \rho$, such that for any $\frac{1}{s} < \rho < 1$, the restricted eigenvalue condition holds trivially but the (S,s)-irrepresentability condition does not.
 
 As mentioned previously, it is intuitive that the irrepresentability condition is stronger than our suggested {\bf(RE)} assumption. In fact, by slightly adapting a result from \cite{vandegeer:2009}, we can show that a `strong' irrepresentability condition directly {\it implies} the {\bf(RE)} condition for $\ell2$-recovery:
@@ -30,6 +35,8 @@ If the irrepresentability condition holds with $\epsilon > \frac{2}{3}$, then th
 
 \subsection{The Restricted Eigenvalue Condition}
 
+Yet, assuming we only wish to recover all edges above a certain threshold, bounding the $\ell2$-error allows us to recover all edges with weights above a certain minimum threshold under an intuitively weaker {\bf(RE)} condition. In practical scenarios, such as in social networks, where one seeks to recover significant edges, this is a reasonable assumption.
+
 Expressing the restricted eigenvalue assumption for correlated measurements as parameters of the graph and the cascade diffusion process is non-trivial.
 The restricted eigenvalue property is however well behaved in the following sense:
 \begin{lemma}
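The two conditions compared in this hunk can be inspected numerically. Below is a minimal Python/numpy sketch; since the paper's exact definitions are elided in the diff, the irrepresentability quantity uses the common Zhao--Yu-style formalization, and the restricted-eigenvalue function is only a crude support-block proxy — both are assumptions for illustration, not the paper's definitions, and the toy matrix is made up.

# Sketch: numerically inspect the two conditions on a Gram matrix Q with
# support set S.  `irrepresentability` computes the Zhao--Yu-style
# quantity max_{j not in S} ||Q[j,S] Q[S,S]^{-1}||_1 (condition: < 1);
# `support_block_min_eig` is only a crude proxy for the (RE) constant.
# Both formalizations are assumptions: the paper's exact definitions
# are elided in this hunk.
import numpy as np

def irrepresentability(Q, S):
    S = list(S)
    Sc = [j for j in range(Q.shape[0]) if j not in S]
    QSS_inv = np.linalg.inv(Q[np.ix_(S, S)])
    return max(np.abs(Q[j, S] @ QSS_inv).sum() for j in Sc)

def support_block_min_eig(Q, S):
    S = list(S)
    return np.linalg.eigvalsh(Q[np.ix_(S, S)]).min()

# Toy equicorrelated design: Q = (1 - rho) I + rho * (all-ones)
m, s, rho = 8, 4, 0.6
Q = (1 - rho) * np.eye(m) + rho * np.ones((m, m))
S = range(s)
print(irrepresentability(Q, S), support_block_min_eig(Q, S))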
diff --git a/paper/sections/intro.tex b/paper/sections/intro.tex
index f7a187c..a0933a5 100644
--- a/paper/sections/intro.tex
+++ b/paper/sections/intro.tex
@@ -2,7 +2,7 @@ A recent line of research has focused on applying advances in sparse recovery to
 A diffusion process taking place on a graph can provide valuable information about the existence of edges and their edge weights. By observing the sequence of nodes which become `infected' over time without knowledge of who has `infected' whom, can we recover the graph on which the process takes place? The spread of a particular behavior through a network is known as an {\it Influence Cascade}. In this context, the {\it Graph Inference} problem is the recovery of the graph's edges from the observation of few influence cascades. We propose to show how sparse recovery can be applied to solve this recently introduced graph inference problem.
 
-Tackling the graph inference problem means constructing a polynomial-time algorithm which recovers a large majority of edges correctly from very few observations or {\it cascades}. Prior work shown that the graph inference problem can be solved in ${\cal O}(poly(s) \log m)$ number of observed cascades, where $s$ is the maximum degree and $m$ is the number of nodes in the graph. Almost miraculously, the number of cascades required to reconstruct the graph is logarithmic in the number of nodes of the graph. Results in the sparse recovery literature lead us to believe that $\Omega(s \log m/s)$ cascade observations should be sufficient to recover the graph. In fact, we prove this lower bound in a very general sense. We show an almost tight upper-bound ${\cal O}(s \log m)$ by applying standard sparse recovery techniques and assumptions to the graph inference problem. We show that the edge weights themselves can also be recovered under the same assumptions.
+Tackling the graph inference problem means constructing a polynomial-time algorithm which, with high probability, recovers a large majority of edges correctly from very few observations or {\it cascades}. Prior work has shown that the graph inference problem can be solved with ${\cal O}(poly(s) \log m)$ observed cascades, where $s$ is the maximum degree and $m$ is the number of nodes in the graph. Almost miraculously, the number of cascades required to reconstruct the graph is logarithmic in the number of nodes of the graph. Results in the sparse recovery literature lead us to believe that $\Omega(s \log m/s)$ cascade observations should be necessary to recover the graph. In fact, we prove this lower bound in a very general sense: any non-adaptive graph inference algorithm which succeeds in recovering the graph with constant probability requires $\Omega(s \log m/s)$ observations. We show an almost tight upper bound by applying standard sparse recovery techniques and assumptions: ${\cal O}(s \log m)$ cascades are sufficient to recover the graph. We show that the edge weights themselves can also be recovered under the same assumptions.
 
 Throughout this paper, we focus on the three main discrete-time diffusion processes: the independent cascade model, the voter model, and the linear threshold model...
diff --git a/paper/sections/model.tex b/paper/sections/model.tex
index 5a0a96f..a645fcf 100644
--- a/paper/sections/model.tex
+++ b/paper/sections/model.tex
@@ -1,58 +1,52 @@
-We consider a graph $G= (V, E)$ and we define $m\defeq |V|$. A \emph{cascade
-model} is a discrete-time homogeneous Markov process over $\{0,1\}^m$. Denoting
-by $X^t$ the state of the process at time step $t-1$, we interpret $X^t_j = 1$
-for some node $j\in V$ as ``node $j$ is infected at time step $t-1$''. An
-\emph{influence cascade} is simply a realisation of the random process
-described by a cascade model.
+We consider a graph ${\cal G}= (V, E, \Theta)$, where $\Theta$ is a set of parameters of the graph. Let $m\defeq |V|$. A \emph{cascade model} is a Markov process over the finite state space $\{0, 1, \dots\}^V$. An \emph{influence cascade} is a realisation of the random process described by a cascade model.
 
-In what follows, we describe two widely used cascade models, the Independent
-Cascade model (IC) and the Linear Threshold model (LT). In these models, the
-transition probabilities of the Markov process depend on unknown parameters
-describing how much the nodes in $G$ influence each other.
+In practice, we restrict ourselves to discrete-time homogeneous cascade models. $\Theta$ usually represents the edge weights of ${\cal G}$. Finally, the state of each node can often be reduced to two classes: able to infect neighboring nodes at the next round, or unable to infect its neighbors.
 
-Recovering these parameters from observed influence cascades is the central
-question of the present work and is important for the following two reasons:
+Recovering the intrinsic graph parameters $\Theta$ from observed influence cascades is the central question of the present work and is important for the following two reasons:
 \begin{enumerate}
 \item knowing the influence parameters allows for future prediction of influence cascades.
 \item when the set of edges $E$ is unknown, it can be recovered from the influence parameters.
 \end{enumerate}
-We note that recovering the edges in $E$ from observed influence casccades is
+We note that recovering the edges in $E$ from observed influence cascades is
 a well-identified problem known as the \emph{Graph Inference} problem. However,
 recovering the influence parameters is no less important and has been
 seemingly overlooked so far. The machinery developed in this work addresses
 both problems.
 
-\subsection{Independent Cascade Model}
+\subsection{Generalized Linear Models}
+
+Denoting by $X^t$ the state of the cascade at time step $t-1$, we interpret $X^t_j = 1$ for some node $j\in V$ as ``node $j$ can infect its neighbors at the next time step''. By the Markov property of cascades, $$\mathbb{P}(X^t \,|\, X^0, X^1, \dots, X^{t-1}) = \mathbb{P}(X^t \,|\, X^{t-1})$$
+We consider the following \emph{generalized linear model} (GLM) framework for cascades.
+
+\begin{definition}
+Let .... A \emph{generalized linear cascade} is:
+\end{definition}
+
+\subsection{Examples}
+\label{subsec:examples}
 
-In the independent cascade model, nodes can be either susceptible, active or
-infected. All nodes start either as susceptible or active. At each time step
-$t$, for each edge $(i,j)$ where $j$ is susceptible and $i$ is active, $i$
-attempts to infect $j$ with probability $p_{i,j}\in[0,1]$. If $i$ succeeds, $j$ will
-become active at time step $t+1$. Regardless of $i$'s success, node $i$ will be
-infected at time $t+1$: nodes stay active for only one time step. The cascade
-continues until no active nodes remain.
+\subsubsection{Independent Cascade Model}
 
-If we denote by $X^t$ the indicator variable of the set of active nodes at time
-step $t-1$, then if $j$ is susceptible at time step $t-1$, we have:
+In the independent cascade model, nodes can be either susceptible, active/infected or inactive. All nodes start either as susceptible or active. At each time step $t$, for each edge $(i,j)$ where $j$ is susceptible and $i$ is active, $i$ attempts to infect $j$ with probability $p_{i,j}\in[0,1]$. If $i$ succeeds, $j$ will become active at time step $t+1$. Regardless of $i$'s success, node $i$ will be inactive at time $t+1$: nodes stay active for only one time step. The cascade continues until no active nodes remain.
+
+If we denote by $X^t$ the indicator variable of the set of active nodes at time step $t-1$, then if $j$ is susceptible at time step $t-1$, we have:
 \begin{displaymath}
 \P\big[X^{t+1}_j = 1\,|\, X^{t}\big] = 1 - \prod_{i = 1}^m (1 - p_{i,j})^{X^t_i}.
 \end{displaymath}
-Defining $\Theta_{i,j} \defeq \log(1-p_{i,j})$, this can be rewritten
-as:
+Defining $\Theta_{i,j} \defeq \log(1-p_{i,j})$, this can be rewritten as:
 \begin{equation}\label{eq:ic} \tag{IC}
 \P\big[X^{t+1}_j = 1\,|\, X^{t}\big] = 1 - \prod_{i = 1}^m e^{\Theta_{i,j}X^t_i} = 1 - e^{\inprod{\theta_j}{X^t}}
-\end{equation}
-where we defined $\theta_j\defeq (\Theta_{1,j},\ldots,\Theta_{m,j})$.
+\end{equation} where we defined $\theta_j\defeq (\Theta_{1,j},\ldots,\Theta_{m,j})$.
+
+\subsubsection{The Voter Model}
 
-\subsection{The Voter Model}
-\subsection{Generalized Linear Models}
 
 \subsection{Maximum Likelihood Estimation}
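The rewriting in \eqref{eq:ic} holds because $\exp(\sum_i \Theta_{i,j} X^t_i) = \prod_i (1-p_{i,j})^{X^t_i}$ for binary $X^t_i$. A minimal Python/numpy sketch of this identity and of one simulated IC transition (the probabilities and activity vector below are made-up toy values):

# Sanity check of the (IC) rewriting: for a 0/1 activity vector x,
#   1 - prod_i (1 - p[i])**x[i]  ==  1 - exp(<theta_j, x>)
# with theta_j[i] = log(1 - p[i]).  All values below are made-up toys.
import numpy as np

rng = np.random.default_rng(0)
m = 6
p_j = rng.uniform(0.05, 0.6, size=m)       # p_{i,j}: prob. that i infects j
theta_j = np.log1p(-p_j)                   # Theta_{i,j} = log(1 - p_{i,j})
x = rng.integers(0, 2, size=m)             # X^t: indicator of active nodes

prob_product = 1 - np.prod((1 - p_j) ** x)
prob_glm = 1 - np.exp(theta_j @ x)
assert np.isclose(prob_product, prob_glm)  # the two forms coincide

# One simulated IC transition for a susceptible node j:
j_becomes_active = rng.random() < prob_glm
print(prob_glm, j_becomes_active)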
@@ -94,7 +88,7 @@ $f(\inprod{\theta_j}{x^t})$ for some function $f$. Hence, if we denote by $n_j$
 the first step at which $j$ becomes active, we can rewrite the MLE
 program \eqref{eq:pre-mle}:
 \begin{multline}
-  \hat{\theta}_j \in \argmax_{\theta}
+  \hat{\theta}_j \in \argmax_{\theta} \frac{1}{n_j}
   \sum_{t=1}^{n_j-1}\log\big(1-f(\inprod{\theta_j}{x^t})\big)\\
   + \log f(\inprod{\theta_j}{x^{n_j}}) + \lambda\|\theta_j\|
 \end{multline}
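The regularized MLE in the final hunk can be prototyped directly for the \eqref{eq:ic} case, where $f(z) = 1 - e^z$ and hence $\log(1 - f(z)) = z$. A minimal Python/scipy sketch; the sign convention (minimizing negative log-likelihood plus $\lambda\|\theta\|_1$), the $\theta \le 0$ constraint, and all toy data below are assumptions for illustration, since the diff does not fully specify them:

# Sketch of the l1-regularized MLE above for one node j in the (IC)
# model, where f(z) = 1 - exp(z) and hence log(1 - f(z)) = z.
# Assumptions (not fully specified in the diff): we *minimize* the
# negative log-likelihood plus lambda * ||theta||_1, and constrain
# theta <= 0 since theta_i = log(1 - p_i) is nonpositive.
import numpy as np
from scipy.optimize import minimize

def objective(theta, X_pre, x_last, lam):
    # X_pre: rows x^1 .. x^{n_j - 1}, steps where j was still susceptible;
    # x_last: x^{n_j}, the step at which j becomes active.
    n_j = X_pre.shape[0] + 1
    loglik = (X_pre @ theta).sum()                       # sum_t log(1 - f(<theta, x^t>)) = sum_t <theta, x^t>
    loglik += np.log(-np.expm1(x_last @ theta) + 1e-12)  # log f(<theta, x^{n_j}>), guarded against log(0)
    return -loglik / n_j + lam * np.abs(theta).sum()

# Made-up toy cascade observations for m candidate parents of node j.
rng = np.random.default_rng(1)
m, n_j = 10, 30
X_pre = rng.integers(0, 2, size=(n_j - 1, m)).astype(float)
x_last = rng.integers(0, 2, size=m).astype(float)
x_last[0] = 1.0                                    # ensure at least one active parent

res = minimize(objective, x0=-0.1 * np.ones(m), args=(X_pre, x_last, 0.1),
               bounds=[(None, 0.0)] * m, method="L-BFGS-B")
theta_hat = res.x
p_hat = 1 - np.exp(theta_hat)                      # recovered edge probabilities p_{i,j}
print(np.round(p_hat, 3))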
