aboutsummaryrefslogtreecommitdiffstats
path: root/paper/sections/model.tex
diff options
context:
space:
mode:
Diffstat (limited to 'paper/sections/model.tex')
-rw-r--r--paper/sections/model.tex65
1 files changed, 36 insertions, 29 deletions
diff --git a/paper/sections/model.tex b/paper/sections/model.tex
index 43de785..b478e94 100644
--- a/paper/sections/model.tex
+++ b/paper/sections/model.tex
@@ -2,7 +2,7 @@ We consider a graph ${\cal G}= (V, E, \Theta)$, where $\Theta$ is a $|V|\times
|V|$ matrix of parameters describing the edge weights of $\mathcal{G}$.
Intuitively, $\Theta_{i,j}$ captures the ``influence'' of node $i$ on node $j$.
Let $m\defeq |V|$. For each node $j$, let $\theta_{j}$ be the $j^{th}$ column
-vector of $\Theta$. A {\color{red} discrete-time} \emph{Cascade model} is a
+vector of $\Theta$. A discrete-time \emph{Cascade model} is a
Markov process over a finite state space ${\{0, 1, \dots, K-1\}}^V$ with the
following properties:
\begin{enumerate}
@@ -25,16 +25,19 @@ contagious nodes ``influence'' other nodes in the graph to become contagious. An
the successive states of the nodes in graph ${\cal G}$. Note that both the
``single source'' assumption made in~\cite{Daneshmand:2014} and
\cite{Abrahao:13} as well as the ``uniformly chosen source set'' assumption made
-in~\cite{Netrapalli:2012} verify condition 3. {\color{red} why is it less
-restrictive? explain}
+in~\cite{Netrapalli:2012} verify condition 3. Also note that the multiple-source
+node assumption does not reduce to the single-source assumption: even
+non-overlapping cascades started from two nodes that are far apart cannot in
+practice be attributed to either process without knowledge of the newtork
+topology.
-In the context of Graph Inference,~\cite{Netrapalli:2012} focus
+In the context of Network Inference,~\cite{Netrapalli:2012} focus
on the well-known discrete-time independent cascade model recalled below, which
\cite{Abrahao:13} and~\cite{Daneshmand:2014} generalize to continuous time. We
extend the independent cascade model in a different direction by considering
a more general class of transition probabilities while staying in the
discrete-time setting. We observe that despite their obvious differences, both
-the independent cascade and the voter models make the graph inference problem
+the independent cascade and the voter models make the network inference problem
similar to the standard generalized linear model inference problem. In fact, we
define a class of diffusion processes for which this is true: the
\emph{Generalized Linear Cascade Models}. The linear threshold model is
@@ -121,18 +124,15 @@ Defining $\Theta_{i,j} \defeq \log(1-p_{i,j})$, this can be rewritten as:
\tag{IC}
\P\big[X^{t+1}_j = 1\,|\, X^{t}\big]
= 1 - \prod_{i = 1}^m e^{\Theta_{i,j}X^t_i}
- = 1 - e^{\inprod{\theta_j}{X^t}}
+ = 1 - e^{\inprod{\Theta_j}{X^t}}
\end{equation}
Therefore, the independent cascade model is a Generalized Linear Cascade model
with inverse link function $f : z \mapsto 1 - e^z$.
-Note that interpreting it in the case of the Independent Cascade Model requires
-one more step. Indeed, to cast it as a generalized linear cascade model, we had
-to perform the transformation $\Theta_{i,j} = \log(1-p_{i,j})$, where $p_{i,j}$
-are the infection probabilities. Fortunately, if we estimate $p_j$ through
-$\hat{p}_{j} = \hat{\theta}_j$, a bound on $\|\hat\theta - \theta^*\|_2$
-directly implies a bound on $\|\hat p - p^*\|_2$:
+In Section~\ref{sec:results}, we show bounds on the parameter of the graph.
+Fortunately, a bound on $\|\hat\theta - \theta^*\|_2$ directly implies a bound
+on $\|\hat p - p^*\|_2$ as shown by the following lemma:
\begin{lemma}
$\|\hat{\theta} - \theta^* \|_2 \geq \|\hat{p} - p^*\|_2$.
@@ -166,39 +166,43 @@ with inverse link function $f: z \mapsto z$.
\subsubsection{Discretization of Continous Model}
-Anoter motivation for the Generalized Linear Cascade model is that it captures
+Another motivation for the Generalized Linear Cascade model is that it captures
the time-discretized formulation of the well-studied continuous-time
independent cascade model with exponential transmission function (CICE)
of~\cite{GomezRodriguez:2010, Abrahao:13, Daneshmand:2014}. Suppose that time
is binned into intervals of length $\epsilon$ and let $X^k$ be the set of nodes
`infected' before or during the $k^{th}$ time interval. Note that contrary to
-the discrete-time independent cascade model, $X^k_i = 1 \implies X^{k+1}_i =
-1$. Let $\exp(p)$ be an exponentially-distributed random variable of parameter
-$p$. By the memoryless property of the exponential, if $X^k_j \neq 1$:
+the discrete-time independent cascade model, $X^k_j = 1 \implies X^{k+1}_j =
+1$.
+
+Let $\exp(p)$ be an exponentially-distributed random variable of parameter $p$
+and let $p_{i,j}$ be the rate of transmission along directed edge $(i,j)$ in the
+CICE model. By the memoryless property of the exponential, if $X^k_j \neq 1$:
\begin{align*}
\mathbb{P}(X^{k+1}_j = 1 | X^k) & = \mathbb{P}(\min_{i \in {\cal N}(j)}
\exp(p_{i,j}) \leq \epsilon) \\
& = \mathbb{P}(\exp( \sum_{i=1}^m \Theta_{i,j} X^t_i) \leq \epsilon) \\
- & = 1 - e^{- \epsilon \cdot \theta_j \cdot X^t}
+ & = 1 - e^{- \epsilon \Theta_j \cdot X^t}
\end{align*}
Note that here we implicitly let $\Theta_{i,j} = 0$ if $(i,j)$ is not a
directed edge of the graph. Note that this formulation is also consistent when
$X^k_j = 1$ by considering the dummy variables $\forall i, \Theta_{i,i} =
-+\infty$. Therefore, the $\epsilon-$binned-process of the continuous-time
-model with exponential transmission function is a Generalized Linear Cascade
-model with inverse link function $f:z\mapsto 1-e^{-\epsilon\cdot z}$.
++\infty$. Therefore, the $\epsilon$-time-binned CICE-induced process is a
+Generalized Linear Cascade model with inverse link function $f:z\mapsto
+1-e^{-\epsilon\cdot z}$.
\subsubsection{Logistic Cascades}
Consider the specific case of ``logistic cascades'' (where $f$ is the logistic
-function). This model captures the idea that there is a threshold after which,
-each additional neighbor's influence contribution, becomes significant. In this
-way, logistic cascades can be thought of approximating the more-commonly
-studied Linear Threshold model TODO: CITE. As we will see later in the
-analysis, the Hessian of our optimization program simplifies to
-$\nabla^2\mathcal{L}(\theta^*) = \frac{1}{|\mathcal{T}|}XX^T$ where $X$ is the
-observation matrix $[x^1 \ldots x^\mathcal{|T|}]$. The (RE)-condition is then
-equivalent to the assumption made in the Lasso analysis of \cite{bickel:2009}.
+function). This model captures the idea that there is a threshold around which
+each additional neighbor's contribution becomes significant: logistic
+cascades can be thought of approximating the more-commonly studied Linear
+Threshold model~\cite{Kempe:03}. As we will see later in the
+analysis, the Hessian of our optimization program simplifies in the case of a
+logistic inverse link function to $\nabla^2\mathcal{L}(\theta^*) =
+\frac{1}{|\mathcal{T}|}XX^T$ where $X$ is the observation matrix $[x^1 \ldots
+x^\mathcal{|T|}]$. The (RE)-condition we introduce subsequently is then
+equivalent to the assumption made in the Lasso analysis of~\cite{bickel:2009}.
@@ -230,7 +234,10 @@ equivalent to the assumption made in the Lasso analysis of \cite{bickel:2009}.
\begin{figure}
\includegraphics[scale=.4]{figures/drawing.pdf}
- \caption{Illustration of the sparse-recovery approach}
+ \caption{Illustration of the sparse-recovery approach: the measurement matrix
+ is known, as is a Bernoulli realization of the matrix-vector product via the
+non-linear transformation induced by $f$. Our objective is to recover the
+unknown weight vector $\theta_i$ for each node $i$.}
\end{figure}