diff options
Diffstat (limited to 'paper/sections/model.tex')
| -rw-r--r-- | paper/sections/model.tex | 66 |
1 files changed, 50 insertions, 16 deletions
diff --git a/paper/sections/model.tex b/paper/sections/model.tex index 01c7860..26e1a8d 100644 --- a/paper/sections/model.tex +++ b/paper/sections/model.tex @@ -121,22 +121,24 @@ time step $t$, then if $j$ is susceptible at time step $t+1$, we have: \P\big[X^{t+1}_j = 1\,|\, X^{t}\big] = 1 - \prod_{i = 1}^m {(1 - p_{i,j})}^{X^t_i}. \end{displaymath} -Defining $\Theta_{i,j} \defeq \log(1-p_{i,j})$, this can be rewritten as: +Defining $\Theta_{i,j} \defeq \log(\frac{1}{1-p_{i,j}})$, this can be rewritten as: \begin{equation}\label{eq:ic} \tag{IC} - \P\big[X^{t+1}_j = 1\,|\, X^{t}\big] - = 1 - \prod_{i = 1}^m e^{\Theta_{i,j}X^t_i} - = 1 - e^{\inprod{\Theta_j}{X^t}} + \begin{split} + \P\big[X^{t+1}_j &= 1\,|\, X^{t}\big] + = 1 - \prod_{i = 1}^m e^{-\Theta_{i,j}X^t_i}\\ + &= 1 - e^{-\inprod{\Theta_j}{X^t}} +\end{split} \end{equation} Therefore, the independent cascade model is a Generalized Linear Cascade model -with inverse link function $f : z \mapsto 1 - e^z$. +with inverse link function $f : z \mapsto 1 - e^{-z}$. Note that to write the Independent Cascade Model as a Generalized Linear Cascade Model, we had to introduce the change of variable $\Theta_{i,j} -= \log(1-p_{i,j})$. The recovery results in Section~\ref{sec:results} hold on -the $\Theta_j$ parameters. Fortunately, the following lemma shows that the -recovery error on $\Theta_j$ is an upper bound on the error on the original += \log(\frac{1}{1-p_{i,j}})$. The recovery results in Section~\ref{sec:results} +hold on the $\Theta_j$ parameters. Fortunately, the following lemma shows that +the recovery error on $\Theta_j$ is an upper bound on the error on the original $p_j$ parameters. \begin{lemma} @@ -144,7 +146,7 @@ $p_j$ parameters. \end{lemma} \begin{proof} Using the inequality $\forall x>0, \; \log x \geq 1 - \frac{1}{x}$, we have -$|\log (1 - p) - \log (1-p')| \geq \max(1 - \frac{1-p}{1-p'}, +$|\log (\frac{1}{1 - p}) - \log (\frac{1}{1-p'})| \geq \max(1 - \frac{1-p}{1-p'}, 1 - \frac{1-p'}{1-p}) \geq \max( p-p', p'-p)$. \end{proof} @@ -176,8 +178,8 @@ independent cascade model with exponential transmission function (CICE) of~\cite{GomezRodriguez:2010, Abrahao:13, Daneshmand:2014}. Assume that the temporal resolution of the discretization is $\varepsilon$, \emph{i.e.} all nodes whose (continuous) infection time is within the interval $[k\varepsilon, - (k+1)\varepsilon)$ are considered infected at (discrete) time step $t$. - pLet $X^k$ be the indicator vector of the set of nodes `infected' before or + (k+1)\varepsilon)$ are considered infected at (discrete) time step $t$. Let + $X^k$ be the indicator vector of the set of nodes `infected' before or during the $k^{th}$ time interval. Note that contrary to the discrete-time independent cascade model, $X^k_j = 1 \implies X^{k+1}_j = 1$, that is, there is no immune state and nodes remain contagious forever. @@ -288,11 +290,43 @@ until node $i$ becomes contagious, after which its behavior is deterministic. Contrary to prior work, our results depend on the number of measurements and not the number of cascades. -A sufficient condition for program~\eqref{eq:pre-mle} to be a convex program is -that $\mathcal{L}_i$ is a concave function. This will be the case if, for -example, $f$ and $(1-f)$ are both log-concave functions. Remember that -a twice-differentiable function $f$ is log concave iff. $f''f \leq f'^2$. It is +\paragraph{Regularity assumptions} + +We would like program~\eqref{eq:pre-mle} to be a convex program in order to +solve it efficiently. A sufficient condition is to +assume that $\mathcal{L}_i$ is a concave function. This will be the case if, +for example, $f$ and $(1-f)$ are both log-concave functions. Remember that +a twice-differentiable function $f$ is log-concave iff. $f''f \leq f'^2$. It is easy to verify this property for $f$ and $(1-f)$ in the Independent Cascade Model and Voter Model. -{\color{red} TODO:~talk about the different constraints} +Furthermore, the data-dependent bounds in Section~\ref{sec:main_theorem} will +require the following regularity assumption on the inverse link function $f$: +\begin{equation} + \tag{LF} + \max \big\{ | (\log f)'(z_x) |, |(\log (1-f))'(z_x) | \big\} + \leq \frac{1}{\alpha} +\end{equation} +for some $\alpha\in(0, 1)$ and for all $z_x\defeq\inprod{\theta^*}{x}$ such that +$f(z_x)\notin\{0,1\}$. + +In the voter model, $\frac{f'(z)}{f(z)} = \frac{1}{z}$ and +$\frac{f'(z)}{(1-f)(z)} = +\frac{1}{1-z}$. Hence (LF) will hold as soon as $\alpha\leq \Theta_{i,j}\leq +1-\alpha$ for all $(i,j)\in E$ which is always satisfied for some $\alpha$ for +non-isolated nodes. In the Independent Cascade Model, $\frac{f'(z)}{f(z)} = +\frac{1}{e^{z}-1}$ and $\frac{f'(z)}{(1-f)(z)} = 1$. Hence (LF) holds as soon +as $p_{i,j}\geq \alpha$ for all $(i,j)\in E$ which is always satisfied for some +$\alpha\in(0,1)$. + +For the data-independent bound of Proposition~\ref{prop:fi}, we will require the +following additional regularity assumption: +\begin{equation} + \tag{LF2} + \max \big\{ | (\log f)''(z_x) |, |(\log (1-f))''(z_x) | \big\} + \leq \frac{1}{\alpha} +\end{equation} +for some $\alpha\in(0, 1)$ and for all $z_x\defeq\inprod{\theta^*}{x}$ such +that $f(z_x)\notin\{0,1\}$. It is again easy to see that this condition is +verified for the Independent Cascade Model and the Voter model for the same +$\alpha\in(0,1)$. |
