aboutsummaryrefslogtreecommitdiffstats
path: root/paper/sections/model.tex
diff options
context:
space:
mode:
authorThibaut Horel <thibaut.horel@gmail.com>2015-05-19 00:20:40 +0200
committerThibaut Horel <thibaut.horel@gmail.com>2015-05-19 00:20:40 +0200
commit26bfc9b9e69facabe838fc35a96263e5c487c8b3 (patch)
tree2c9889a206ce2a55335814cf16b16dd978c3d521 /paper/sections/model.tex
parent282349d87a75e930bb4d2ac7bc389106d0519f0b (diff)
downloadcascades-26bfc9b9e69facabe838fc35a96263e5c487c8b3.tar.gz
More compression
Diffstat (limited to 'paper/sections/model.tex')
-rw-r--r--paper/sections/model.tex43
1 files changed, 20 insertions, 23 deletions
diff --git a/paper/sections/model.tex b/paper/sections/model.tex
index 532ee5e..5d3a232 100644
--- a/paper/sections/model.tex
+++ b/paper/sections/model.tex
@@ -130,14 +130,12 @@ Defining $\Theta_{i,j} \defeq \log(\frac{1}{1-p_{i,j}})$, this can be rewritten
\end{align*}
Therefore, the independent cascade model is a Generalized Linear Cascade model
-with inverse link function $f : z \mapsto 1 - e^{-z}$.
-
-Note that to write the Independent Cascade Model as a Generalized Linear
-Cascade Model, we had to introduce the change of variable $\Theta_{i,j}
-= \log(\frac{1}{1-p_{i,j}})$. The recovery results in Section~\ref{sec:results}
-pertain to the $\Theta_j$ parameters. Fortunately, the following lemma shows that
-the recovery error on $\Theta_j$ is an upper bound on the error on the original
-$p_j$ parameters.
+with inverse link function $f : z \mapsto 1 - e^{-z}$. Note that to write the
+Independent Cascade Model as a Generalized Linear Cascade Model, we had to
+introduce the change of variable $\Theta_{i,j} = \log(\frac{1}{1-p_{i,j}})$.
+The recovery results in Section~\ref{sec:results} pertain to the $\Theta_j$
+parameters. Fortunately, the following lemma shows that the recovery error on
+$\Theta_j$ is an upper bound on the error on the original $p_j$ parameters.
\begin{lemma}
\label{lem:transform}
@@ -181,12 +179,12 @@ Let $\text{Exp}(p)$ be an exponentially-distributed random variable of parameter
and let $\Theta_{i,j}$ be the rate of transmission along directed edge $(i,j)$
in the CICE model. By the memoryless property of the exponential, if $X^k_j
\neq 1$:
-\begin{align*}
- \mathbb{P}(X^{k+1}_j = 1 | X^k) & = \mathbb{P}(\min_{i \in {\cal N}(j)}
+\begin{multline*}
+ \mathbb{P}(X^{k+1}_j = 1 | X^k) = \mathbb{P}(\min_{i \in {\cal N}(j)}
\text{Exp}(\Theta_{i,j}) \leq \epsilon) \\
- & = \mathbb{P}(\text{Exp}( \sum_{i=1}^m \Theta_{i,j} X^t_i) \leq \epsilon) \\
- & = 1 - e^{- \epsilon \Theta_j \cdot X^t}
-\end{align*}
+ = \mathbb{P}(\text{Exp}( \sum_{i=1}^m \Theta_{i,j} X^t_i) \leq \epsilon)
+ = 1 - e^{- \epsilon \Theta_j \cdot X^t}
+\end{multline*}
Therefore, the $\epsilon$-discretized CICE-induced process is a
Generalized Linear Cascade model with inverse link function $f:z\mapsto
1-e^{-\epsilon\cdot z}$.
@@ -194,10 +192,8 @@ Generalized Linear Cascade model with inverse link function $f:z\mapsto
\subsubsection{Logistic Cascades}
\label{sec:logistic_cascades}
``Logistic cascades'' is the specific case where the inverse link function is
-given by the logistic function:
-\begin{displaymath}
- f(z) = \frac{1}{1+e^{-z + t}}
-\end{displaymath}
+given by the logistic function
+$f(z) = 1/(1+e^{-z + t})$.
Intuitively, this captures the idea that there is a threshold $t$ such that
when the sum of the parameters of the infected parents of a node is larger than
the threshold, the probability of getting infected is close to one. This is
@@ -277,11 +273,11 @@ and:
\end{multline*}
In the case of the voter model, the measurements include all time steps until
-we reach the time horizon $T$ or the graph coalesces to a single state. In the
-case of the independent cascade model, the measurements include all time steps
-until node $i$ becomes contagious, after which its behavior is deterministic.
-Contrary to prior work, our results depend on the number of
-measurements and not the number of cascades.
+we reach the time horizon $T$ or the graph coalesces to a single state. For the
+independent cascade model, the measurements include all time steps until node
+$i$ becomes contagious, after which its behavior is deterministic. Contrary to
+prior work, our results depend on the number of measurements and not the number
+of cascades.
\paragraph{Regularity assumptions}
@@ -294,12 +290,13 @@ Model and Voter Model.
Furthermore, the data-dependent bounds in Section~\ref{sec:main_theorem} will
require the following regularity assumption on the inverse link function $f$:
+there exists $\alpha\in(0,1)$ such that
\begin{equation}
\tag{LF}
\max \big\{ | (\log f)'(z_x) |, |(\log (1-f))'(z_x) | \big\}
\leq \frac{1}{\alpha}
\end{equation}
-for some $\alpha\in(0, 1)$ and for all $z_x\defeq\inprod{\theta^*}{x}$ such that
+for all $z_x\defeq\inprod{\theta^*}{x}$ such that
$f(z_x)\notin\{0,1\}$.
In the voter model, $\frac{f'(z)}{f(z)} = \frac{1}{z}$ and