diff options
Diffstat (limited to 'paper/sections/model.tex')
| -rw-r--r-- | paper/sections/model.tex | 42 |
1 files changed, 24 insertions, 18 deletions
diff --git a/paper/sections/model.tex b/paper/sections/model.tex index 5934590..5e55c7a 100644 --- a/paper/sections/model.tex +++ b/paper/sections/model.tex @@ -7,7 +7,21 @@ We consider a graph ${\cal G}= (V, E, \Theta)$, where $\Theta$ is a $|V|\times | In other words, a cascade model describes a diffusion process, whereby a set of contagious nodes ``influence'' other nodes in the graph to adopt or not their behavior. An \emph{influence cascade} is a realisation of this random process: the successive states of the nodes in graph ${\cal G}$. Note that both the ``single source'' assumption made in \cite{Daneshmand:2014} and \cite{Abrahao:13} as well as the ``uniformly chosen source set'' assumption made in \cite{Netrapalli:2012} verify condition 3. -In the context of Graph Inference, \cite{Netrapalli:2012} focuses specifically on the well-known discrete-time independent cascade model recalled below, which \cite{Abrahao:13} and \cite{Daneshmand:2014} generalize to continuous time. Whilst we restrict ourselves to discrete-time processes, we consider both the independent cascade model and the linear voter model, and show they can be solved by a very similar maximum-likelihood program. The linear threshold model is a special case and is discussed in Section~\ref{sec:linear_threshold}. +In the context of Graph Inference, \cite{Netrapalli:2012} focuses specifically on the well-known discrete-time independent cascade model recalled below, which \cite{Abrahao:13} and \cite{Daneshmand:2014} generalize to continuous time. Whilst we restrict ourselves to discrete-time processes, we consider both the independent cascade model and the linear voter model. Despite their obvious differences, both models make the graph inference problem very similar to the standard generalized linear model estimation problem. In fact, we can define a class of diffusion processes which for which this true: the \emph{Generalized Linear Cascade Models}. The linear threshold model is a special case and is discussed in Section~\ref{sec:linear_threshold}. + +\subsection{Generalized Linear Cascade Models} +\label{sec:GLC} + +Let \emph{susceptible} characterize any state which can become contagious at the next time step with a non-zero probability. We draw inspiration from generalized linear models to introduce Generalized Linear Cascades: + +\begin{definition} +\label{def:glcm} +Let $X^t$ be the indicator variable of ``contagious nodes'' at time step $t$. A \emph{generalized linear cascade model} is a cascade model such that for each susceptible node $j$ in state $s$ at time step $t$, the probability of $j$ becoming ``contagious'' at time step $t+1$ conditionally on $X^t$ is a Bernoulli of parameter $f_s(\Theta_j \cdot X^t)$: +$$\mathbb{P}(X^{t+1}_j = 1|X^t \text{and } e^t_j = s) = f_s(\Theta_j \cdot X^t)$$ +where $e^t_j$ is the state of node $j$ at time step $t$ and $f_s : \mathbb{R} \rightarrow [0,1]$ +\end{definition} + +In other words, each generalized linear cascade provides a series for each node $j \in V$ a series of measurements $(X^t, X^t_j)_{t \in {\cal T_j}}$ issued from a generalized linear model. Note also that $\E[X^{t+1}_i\,|\,X^t] = f(\inprod{\theta_i}{X^t})$. As such, $f$ can be interpreted as the inverse link function of our generalized linear cascade model. % Denoting by $X^t$ the state of the cascade at time step $t$, we interpret % $X^t_j = 1$ for node $j\in V$ as ``node $j$ is contagious'', \emph{i.e}, ``$j$ @@ -37,7 +51,9 @@ In the context of Graph Inference, \cite{Netrapalli:2012} focuses specifically o % $f$ can be interpreted as the inverse link function of our generalized linear % cascade model. -\subsection{Independent Cascade Model} +\subsection{examples} + +\subsubsection{Independent Cascade Model} In the independent cascade model, nodes can be either susceptible, contagious or immune. At $t=0$, all source nodes are ``contagious'' and all remaining nodes are @@ -46,8 +62,7 @@ susceptible and $i$ is contagious, $i$ attempts to infect $j$ with probability $p_{i,j}\in[0,1]$, the infection attempts are independent of each other. If $i$ succeeds, $j$ will become contagious at time step $t+1$. Regardless of $i$'s success, node $i$ will be immune at time $t+1$. In other words, nodes stay -contagious for only one time step. The cascade process terminates when no contagious -nodes remain. +contagious for only one time step. The cascade process terminates when no contagious nodes remain. If we denote by $X^t$ the indicator variable of the set of contagious nodes at time step $t$, then if $j$ is susceptible at time step $t+1$, we have: @@ -63,7 +78,9 @@ Defining $\Theta_{i,j} \defeq \log(1-p_{i,j})$, this can be rewritten as: = 1 - e^{\inprod{\theta_j}{X^t}} \end{equation} -\subsection{The Linear Voter Model} +Therefore, the independent cascade model is a Generalized Linear Cascade model with link function $f : z \mapsto 1 - e^z$ + +\subsubsection{The Linear Voter Model} In the Linear Voter Model, nodes can be either \emph{red} or \emph{blue}. Both states verify the properties of a contagious state, but for ease of presentation, we will suppose that the contagious nodes are the \emph{blue}. The parameters of the graph are normalized such that $\forall i, \ \sum_j \Theta_{i,j} = 1$. Each round, every node $j$ independently chooses @@ -76,6 +93,8 @@ step $t$, then we have: \tag{V} \end{equation} +Therefore, the independent cascade model is a Generalized Linear Cascade model with link function $f: z \mapsto z$ + % \subsection{The Linear Threshold Model} % In the Linear Threshold Model, each node $j\in V$ has a threshold $t_j$ from the interval $[0,1]$ and for each node, the sum of incoming weights is less than $1$: $\forall j\in V$, $\sum_{i=1}^m \Theta_{i,j} \leq 1$. @@ -98,19 +117,6 @@ step $t$, then we have: % \end{equation} % where ``sign'' is the function $\mathbbm{1}_{\cdot > 0}$. -\subsection{Generalized Linear Cascade Models} -\label{sec:GLC} - -Despite their obvious differences, both models share a common similarity: conditioned on node $j$ being susceptible at time step $t$, the probability that node $j$ becomes infected at time step $t+1$ is a bernoulli of parameter $f(\Theta_j \cdot X^t)$, where $f:\mathbb{R} \rightarrow [0,1]$. In the case of the voter model, $f_{\text{V}}$ is the identify function, and in the case of the independent cascade model $f_{\text{ICC}} : z \rightarrow 1 - e^z$. We draw inspiration from generalized linear models to introduce \emph{Generalized Linear Cascades}: - -\begin{definition} -\label{def:glcm} -A \emph{generalized linear cascade} is a cascade model characterized by the following equation: -\begin{displaymath} -\forall j \in V, \; \P[X^{t+1}_j=x\,|\, X^t] = f(\Theta_j \cdot X^t) -\end{displaymath} -where $f:\mathbb{R}\to[0,1]$ -\end{definition} \subsection{Maximum Likelihood Estimation} |
