aboutsummaryrefslogtreecommitdiffstats
path: root/paper/sections/model.tex
diff options
context:
space:
mode:
Diffstat (limited to 'paper/sections/model.tex')
-rw-r--r--paper/sections/model.tex66
1 files changed, 50 insertions, 16 deletions
diff --git a/paper/sections/model.tex b/paper/sections/model.tex
index 01c7860..26e1a8d 100644
--- a/paper/sections/model.tex
+++ b/paper/sections/model.tex
@@ -121,22 +121,24 @@ time step $t$, then if $j$ is susceptible at time step $t+1$, we have:
\P\big[X^{t+1}_j = 1\,|\, X^{t}\big]
= 1 - \prod_{i = 1}^m {(1 - p_{i,j})}^{X^t_i}.
\end{displaymath}
-Defining $\Theta_{i,j} \defeq \log(1-p_{i,j})$, this can be rewritten as:
+Defining $\Theta_{i,j} \defeq \log(\frac{1}{1-p_{i,j}})$, this can be rewritten as:
\begin{equation}\label{eq:ic}
\tag{IC}
- \P\big[X^{t+1}_j = 1\,|\, X^{t}\big]
- = 1 - \prod_{i = 1}^m e^{\Theta_{i,j}X^t_i}
- = 1 - e^{\inprod{\Theta_j}{X^t}}
+ \begin{split}
+ \P\big[X^{t+1}_j &= 1\,|\, X^{t}\big]
+ = 1 - \prod_{i = 1}^m e^{-\Theta_{i,j}X^t_i}\\
+ &= 1 - e^{-\inprod{\Theta_j}{X^t}}
+\end{split}
\end{equation}
Therefore, the independent cascade model is a Generalized Linear Cascade model
-with inverse link function $f : z \mapsto 1 - e^z$.
+with inverse link function $f : z \mapsto 1 - e^{-z}$.
Note that to write the Independent Cascade Model as a Generalized Linear
Cascade Model, we had to introduce the change of variable $\Theta_{i,j}
-= \log(1-p_{i,j})$. The recovery results in Section~\ref{sec:results} hold on
-the $\Theta_j$ parameters. Fortunately, the following lemma shows that the
-recovery error on $\Theta_j$ is an upper bound on the error on the original
+= \log(\frac{1}{1-p_{i,j}})$. The recovery results in Section~\ref{sec:results}
+hold on the $\Theta_j$ parameters. Fortunately, the following lemma shows that
+the recovery error on $\Theta_j$ is an upper bound on the error on the original
$p_j$ parameters.
\begin{lemma}
@@ -144,7 +146,7 @@ $p_j$ parameters.
\end{lemma}
\begin{proof}
Using the inequality $\forall x>0, \; \log x \geq 1 - \frac{1}{x}$, we have
-$|\log (1 - p) - \log (1-p')| \geq \max(1 - \frac{1-p}{1-p'},
+$|\log (\frac{1}{1 - p}) - \log (\frac{1}{1-p'})| \geq \max(1 - \frac{1-p}{1-p'},
1 - \frac{1-p'}{1-p}) \geq \max( p-p', p'-p)$.
\end{proof}
@@ -176,8 +178,8 @@ independent cascade model with exponential transmission function (CICE)
of~\cite{GomezRodriguez:2010, Abrahao:13, Daneshmand:2014}. Assume that the
temporal resolution of the discretization is $\varepsilon$, \emph{i.e.} all
nodes whose (continuous) infection time is within the interval $[k\varepsilon,
- (k+1)\varepsilon)$ are considered infected at (discrete) time step $t$.
- pLet $X^k$ be the indicator vector of the set of nodes `infected' before or
+ (k+1)\varepsilon)$ are considered infected at (discrete) time step $t$. Let
+ $X^k$ be the indicator vector of the set of nodes `infected' before or
during the $k^{th}$ time interval. Note that contrary to the discrete-time
independent cascade model, $X^k_j = 1 \implies X^{k+1}_j = 1$, that is,
there is no immune state and nodes remain contagious forever.
@@ -288,11 +290,43 @@ until node $i$ becomes contagious, after which its behavior is deterministic.
Contrary to prior work, our results depend on the number of
measurements and not the number of cascades.
-A sufficient condition for program~\eqref{eq:pre-mle} to be a convex program is
-that $\mathcal{L}_i$ is a concave function. This will be the case if, for
-example, $f$ and $(1-f)$ are both log-concave functions. Remember that
-a twice-differentiable function $f$ is log concave iff. $f''f \leq f'^2$. It is
+\paragraph{Regularity assumptions}
+
+We would like program~\eqref{eq:pre-mle} to be a convex program in order to
+solve it efficiently. A sufficient condition is to
+assume that $\mathcal{L}_i$ is a concave function. This will be the case if,
+for example, $f$ and $(1-f)$ are both log-concave functions. Remember that
+a twice-differentiable function $f$ is log-concave iff. $f''f \leq f'^2$. It is
easy to verify this property for $f$ and $(1-f)$ in the Independent Cascade
Model and Voter Model.
-{\color{red} TODO:~talk about the different constraints}
+Furthermore, the data-dependent bounds in Section~\ref{sec:main_theorem} will
+require the following regularity assumption on the inverse link function $f$:
+\begin{equation}
+ \tag{LF}
+ \max \big\{ | (\log f)'(z_x) |, |(\log (1-f))'(z_x) | \big\}
+ \leq \frac{1}{\alpha}
+\end{equation}
+for some $\alpha\in(0, 1)$ and for all $z_x\defeq\inprod{\theta^*}{x}$ such that
+$f(z_x)\notin\{0,1\}$.
+
+In the voter model, $\frac{f'(z)}{f(z)} = \frac{1}{z}$ and
+$\frac{f'(z)}{(1-f)(z)} =
+\frac{1}{1-z}$. Hence (LF) will hold as soon as $\alpha\leq \Theta_{i,j}\leq
+1-\alpha$ for all $(i,j)\in E$ which is always satisfied for some $\alpha$ for
+non-isolated nodes. In the Independent Cascade Model, $\frac{f'(z)}{f(z)} =
+\frac{1}{e^{z}-1}$ and $\frac{f'(z)}{(1-f)(z)} = 1$. Hence (LF) holds as soon
+as $p_{i,j}\geq \alpha$ for all $(i,j)\in E$ which is always satisfied for some
+$\alpha\in(0,1)$.
+
+For the data-independent bound of Proposition~\ref{prop:fi}, we will require the
+following additional regularity assumption:
+\begin{equation}
+ \tag{LF2}
+ \max \big\{ | (\log f)''(z_x) |, |(\log (1-f))''(z_x) | \big\}
+ \leq \frac{1}{\alpha}
+\end{equation}
+for some $\alpha\in(0, 1)$ and for all $z_x\defeq\inprod{\theta^*}{x}$ such
+that $f(z_x)\notin\{0,1\}$. It is again easy to see that this condition is
+verified for the Independent Cascade Model and the Voter model for the same
+$\alpha\in(0,1)$.