diff options
| author | Thibaut Horel <thibaut.horel@gmail.com> | 2015-02-01 19:03:59 -0500 |
|---|---|---|
| committer | Thibaut Horel <thibaut.horel@gmail.com> | 2015-02-01 19:03:59 -0500 |
| commit | bfe2069858dac0e23caa4e171949e887302ba2d7 (patch) | |
| tree | 9b7ff694ea08d153813e597f4b73e31c09d87f9a /paper | |
| parent | 2762395e506e466a97abd9a8a247fea05a168cc9 (diff) | |
| download | cascades-bfe2069858dac0e23caa4e171949e887302ba2d7.tar.gz | |
Polising intro of model section
Diffstat (limited to 'paper')
| -rw-r--r-- | paper/sections/model.tex | 55 |
1 files changed, 34 insertions, 21 deletions
diff --git a/paper/sections/model.tex b/paper/sections/model.tex index 4cef131..0f7eff7 100644 --- a/paper/sections/model.tex +++ b/paper/sections/model.tex @@ -1,31 +1,28 @@ -We consider a graph ${\cal G}= (V, E, \Theta)$, with $\Theta$ a set of parameters to the graph. Let $m\defeq |V|$. A \emph{cascade model} is a finite state space Markov process over $\{0, 1, \dots \}^V$. An \emph{influence cascade} is a realisation of the random process described by a cascade model. +We consider a graph ${\cal G}= (V, E, \Theta)$, where $\Theta$ is $|V|\times +|V|$ matrix of paremeters describing the edge weights of $\mathcal{G}$. +A \emph{cascade model} is a Markov process over the finite state space $\{0, 1, +\dots, K\}^V$. An \emph{influence cascade} is a realisation of the random +process described by a cascade model. -In practice, we restrict ourselves to discrete-time homogeneous cascade models. $\Theta$ usually represents the edge weights of ${\cal G}$. Each influence cascade begins at $t=0$ with a designated source `state' (e.g ``active''). All nodes in the source state at $t=0$ are called \emph{source nodes}. A standard theoretical assumption we can make is that each node at $t=0$ has a constant probability $p_{\text{init}}$ of being in the source state. This is more realistic and less restrictive than ``single source'' assumption made in \cite{Daneshmand:2014} and \cite{Abrahao:13}. +In practice, we restrict ourselves to discrete-time homogeneous cascade models. +At $t=0$, each node in the graph has a constant probability $p_{init}$ of being +in the ``active state''. The active nodes at $t=0$ are called the source nodes. +Our probabilistic model for the source nodes is more realistic and less +restrictive than the ``single source'' assumption made in \cite{Daneshmand:2014} +and \cite{Abrahao:13}. -Recovering the intrinsic graph parameters $\Theta$ from observed influence cascades is the central question of the present work and is important for the following two reasons: -\begin{enumerate} - \item knowing the influence parameters allows for future prediction of - influence cascades. - \item when the set of edges $E$ is unknown, it can be recovered from the - influence parameters. -\end{enumerate} -We note that recovering the edges in $E$ from observed influence cascades is -a well-identified problem known as the \emph{Graph Inference} problem. However, -recovering the influence parameters is no less important and has been seemingly -overlooked so far. The machinery developped in this work addresses both -problems for a large class of well-known cascade models. \subsection{Generalized Linear Cascade Models} Denoting by $X^t$ the state of the cascade at time step $t-1$, we interpret -$X^t_j = 1$ for some node $j\in V$ as ``node $j$ exhibits the source nodes' -behavior at time step $t+1$''. We draw inspiration from \emph{generalized -linear models} (GLM) to define a generalized linear cascade. +$X^t_j = 1$ for some node $j\in V$ as ``node $j$ is active'', \emph{i.e}, ``$j$ +exhibits the source nodes' behavior at time step $t+1$''. We draw inspiration +from \emph{generalized linear models} (GLM) to define a generalized linear +cascade. -\begin{definition} - Let us denote by $\{\mathcal{F}_t, t\in\ints\}$ the natural filtration - induced by $\{X_t, t\in\ints\}$. A \emph{generalized linear cascade} is - characterized by the following equation: +\begin{definition} Let us denote by $\{\mathcal{F}_t, t\in\ints\}$ the natural + filtration induced by $\{X_t, t\in\ints\}$. A \emph{generalized linear + cascade} is characterized by the following equation: \begin{displaymath} \P[X^{t+t}=x\,|\, \mathcal{F}_t] = \prod_{i\in V} f(\inprod{\theta_i}{X^{t}})^{x_i} @@ -39,6 +36,10 @@ satisfies the Markov property: \begin{displaymath} \P[X^{t+1}=x\,|\mathcal{F}_t] = \P[X^{t+1}=x\,|\, X^t] \end{displaymath} +Note also that $\E[X^{t+1}_i\,|\,X^t] = f(\inprod{\theta_i}{X^t})$. As such, +$f$ can be interpreted as the inverse link function of our generalized linear +cascade model. + \subsection{Examples} \label{subsec:examples} @@ -73,6 +74,18 @@ In the Voter Model, nodes can be either red or blue, where ``blue'' is the sourc \subsection{Maximum Likelihood Estimation} +Recovering the intrinsic graph parameters $\Theta$ from observed influence +cascades is the central question of the present work and is important for the +following two reasons: \begin{enumerate} \item knowing the influence parameters + allows for future prediction of influence cascades. \item when the + set of edges $E$ is unknown, it can be recovered from the influence +parameters. \end{enumerate} We note that recovering the edges in $E$ from +observed influence cascades is a well-identified problem known as the +\emph{Graph Inference} problem. However, recovering the influence parameters is +no less important and has been seemingly overlooked so far. The machinery +developped in this work addresses both problems for a large class of well-known +cascade models. + In both models, the unknown influence parameters are described by an $m\times m$ matrix $\Theta$. Furthermore, the information of the edges of $G$ is fully contained in $\Theta$: |
