more cosmetic changes

author: jeanpouget-abadie <jean.pougetabadie@gmail.com> 2015-05-12 19:46:27 +0200
committer: jeanpouget-abadie <jean.pougetabadie@gmail.com> 2015-05-12 19:46:27 +0200
commit: e794a29f1d777e65f778aaf2ff9c764ce5155f3f (patch)
tree: 1e8584a4285a5b44066e610ec106ee165630a779 /paper/sections/model.tex
parent: 4110319283b3b5b667d215bd44f23f8cd1a7cf46 (diff)
download: cascades-e794a29f1d777e65f778aaf2ff9c764ce5155f3f.tar.gz
1 files changed, 31 insertions, 32 deletions
diff --git a/paper/sections/model.tex b/paper/sections/model.tex
index acec974..b704b9e 100644
--- a/paper/sections/model.tex
+++ b/paper/sections/model.tex
@@ -107,9 +107,9 @@ In the independent cascade model, nodes can be either susceptible, contagious
 or immune. At $t=0$, all source nodes are ``contagious'' and all remaining
 nodes are ``susceptible''. At each time step $t$, for each edge $(i,j)$ where
 $j$ is susceptible and $i$ is contagious, $i$ attempts to infect $j$ with
-probability $p_{i,j}\in(0,1]$; the infection attempts are mutually independent.
+probability $p_{i,j}\in[0,1]$; the infection attempts are mutually independent.
 If $i$ succeeds, $j$ will become contagious at time step $t+1$. Regardless of
-$i$'s success, node $i$ will be immune at time $t+1$. In other words, nodes
+$i$'s success, node $i$ will be immune at time $t+1$, such that nodes
 stay contagious for only one time step. The cascade process terminates when no
 contagious nodes remain.
 
@@ -130,9 +130,10 @@ Defining $\Theta_{i,j} \defeq \log(1-p_{i,j})$, this can be rewritten as:
 Therefore, the independent cascade model is a Generalized Linear Cascade model
 with inverse link function $f : z \mapsto 1 - e^z$.
 
-In Section~\ref{sec:results}, we show bounds on the parameter of the graph.
-Fortunately, a bound on $\|\hat\theta - \theta^*\|_2$ directly implies a bound
-on $\|\hat p - p^*\|_2$ as shown by the following lemma:
+In Section~\ref{sec:results}, we show bounds on the \emph{transformed}
+parameters $\Theta$ of the graph.  Fortunately, a bound on $\|\hat\theta -
+\theta^*\|_2$ directly implies a bound on $\|\hat p - p^*\|_2$ as shown by the
+following lemma:
 
 \begin{lemma}
     $\|\hat{\theta} - \theta^* \|_2 \geq \|\hat{p} - p^*\|_2$.
@@ -146,15 +147,14 @@ $|\log (1 - p) - \log (1-p')| \geq \max(1 - \frac{1-p}{1-p'},
 \subsubsection{The Linear Voter Model}
 
 In the Linear Voter Model, nodes can be either \emph{red} or \emph{blue}.
-Without loss of generality, we can suppose that the
-\emph{blue} nodes are contagious. The parameters of the graph are normalized
-such that $\forall i, \ \sum_j \Theta_{i,j} = 1$ and we assume that
-$\Theta_{i,i}$ is always non-zero, meaning that all nodes have self-loops.
-Each round, every node $j$ independently chooses
-one of its neighbors with probability $\Theta_{i,j}$ and adopts their color. The
-cascades stops at a fixed horizon time $T$ or if all nodes are of the same color.
-If we denote by $X^t$ the indicator variable of the set of blue nodes at time
-step $t$, then we have:
+Without loss of generality, we can suppose that the \emph{blue} nodes are
+contagious. The parameters of the graph are normalized such that $\forall i, \
+\sum_j \Theta_{i,j} = 1$ and we assume that $\Theta_{i,i}$ is always non-zero,
+meaning that all nodes have self-loops.  Each round, every node $j$
+independently chooses one of its neighbors with probability $\Theta_{i,j}$ and
+adopts their color. The cascades stops at a fixed horizon time $T$ or if all
+nodes are of the same color.  If we denote by $X^t$ the indicator variable of
+the set of blue nodes at time step $t$, then we have:
 \begin{equation}
 \mathbb{P}\left[X^{t+1}_j = 1 | X^t \right] = \sum_{i=1}^m \Theta_{i,j} X_i^t =
 \inprod{\Theta_j}{X^t}
@@ -176,11 +176,12 @@ the discrete-time independent cascade model, $X^k_j = 1 \implies X^{k+1}_j =
 1$.
 
 Let $\exp(p)$ be an exponentially-distributed random variable of parameter $p$
-and let $p_{i,j}$ be the rate of transmission along directed edge $(i,j)$ in the
-CICE model.  By the memoryless property of the exponential, if $X^k_j \neq 1$:
+and let $\Theta_{i,j}$ be the rate of transmission along directed edge $(i,j)$
+in the CICE model.  By the memoryless property of the exponential, if $X^k_j
+\neq 1$:
 \begin{align*}
   \mathbb{P}(X^{k+1}_j = 1 | X^k) & = \mathbb{P}(\min_{i \in {\cal N}(j)}
-  \exp(p_{i,j}) \leq \epsilon) \\
+  \exp(\Theta_{i,j}) \leq \epsilon) \\
   & = \mathbb{P}(\exp( \sum_{i=1}^m \Theta_{i,j} X^t_i) \leq \epsilon) \\
   & = 1 - e^{- \epsilon \Theta_j \cdot X^t}
 \end{align*}
@@ -193,15 +194,14 @@ Generalized Linear Cascade model with inverse link function $f:z\mapsto
 
 \subsubsection{Logistic Cascades}
 \label{sec:logistic_cascades}
-Consider the specific case of ``logistic cascades'' (where $f$ is the logistic
-function). This model captures the idea that there is a threshold around which
-each additional neighbor's contribution becomes significant: logistic
-cascades can be thought of approximating the more-commonly studied Linear
-Threshold model~\cite{Kempe:03}. As we will see later in the
-analysis, the Hessian of our optimization program simplifies in the case of a
-logistic inverse link function to $\nabla^2\mathcal{L}(\theta^*) =
-\frac{1}{|\mathcal{T}|}XX^T$ where $X$ is the observation matrix $[x^1 \ldots
-x^\mathcal{|T|}]$.
+``Logistic cascades'' (where the inverse link function $f$ is the logistic
+function) capture the idea that there is a threshold around which each
+additional neighbor's contribution becomes significant: they constitue a
+differentiable approximation to the more-commonly studied Linear Threshold
+model~\cite{Kempe:03}. As we will see later in the analysis, the Hessian of our
+optimization program simplifies in the case of a logistic inverse link function
+to $\nabla^2\mathcal{L}(\theta^*) = \frac{1}{|\mathcal{T}|}XX^T$ where $X$ is
+the observation matrix $[x^1 \ldots x^\mathcal{|T|}]$.
 
 % \subsection{The Linear Threshold Model}
 
@@ -234,18 +234,17 @@ x^\mathcal{|T|}]$.
   \caption{Illustration of the sparse-recovery approach. Our objective is to
   recover the unknown weight vector $\theta_j$ for each node $j$. We observe a
 Bernoulli realization of the $f$ transform of the matrix-vector product, where
-the measurement matrix encodes which nodes are ``contagious''}
+the measurement matrix encodes which nodes are ``contagious'' at each time step}
 \end{figure}
 
 
 Inferring the model parameter $\Theta$ from observed influence cascades is the
 central question of the present work. Recovering the edges in $E$ from observed
-influence cascades is a well-identified problem known as the \emph{Graph
+influence cascades is a well-identified problem known as the \emph{Network
 Inference} problem. However, recovering the influence parameters is no less
-important and has been seemingly overlooked so far. In this work we focus on
-recovering $\Theta$, noting that the set of edges $E$ can then be recovered
-through the following equivalence: $(i,j)\in E\Leftrightarrow  \Theta_{i,j}
-\neq 0$
+important. In this work we focus on recovering $\Theta$, noting that the set of
+edges $E$ can then be recovered through the following equivalence: $(i,j)\in
+E\Leftrightarrow  \Theta_{i,j} \neq 0$
 
 Given observations $(x^1,\ldots,x^n)$ of a cascade model, we can recover
 $\Theta$ via Maximum Likelihood Estimation (MLE). Denoting by $\mathcal{L}$ the
author	jeanpouget-abadie <jean.pougetabadie@gmail.com>	2015-05-12 19:46:27 +0200
committer	jeanpouget-abadie <jean.pougetabadie@gmail.com>	2015-05-12 19:46:27 +0200
commit	e794a29f1d777e65f778aaf2ff9c764ce5155f3f (patch)
tree	1e8584a4285a5b44066e610ec106ee165630a779 /paper/sections/model.tex
parent	4110319283b3b5b667d215bd44f23f8cd1a7cf46 (diff)
download	cascades-e794a29f1d777e65f778aaf2ff9c764ce5155f3f.tar.gz