Diffstat (limited to 'paper/sections/results.tex')
-rw-r--r--  paper/sections/results.tex  12
1 file changed, 5 insertions, 7 deletions
diff --git a/paper/sections/results.tex b/paper/sections/results.tex
index f62e6f2..ed6ebe6 100644
--- a/paper/sections/results.tex
+++ b/paper/sections/results.tex
@@ -9,6 +9,7 @@ As mentioned previously, our objective is twofold:
 It is easy to see that solving the second objective allows us to solve the first. If $\|\hat \Theta - \Theta \|_2$ is sufficiently small, we can recover all large coordinates of $\Theta$ by keeping only the coordinates above a certain threshold.

 \subsection{Main Theorem}
+\label{sec:main_theorem}

 For the remainder of this section, we consider a single node $i$. For ease of presentation, the index $j$ is made implicit: $\theta_i \leftarrow \Theta_{ij}$ and $\theta \leftarrow (\Theta_{i\cdot})$. We begin our analysis with the following lemma:
 \begin{lemma}
@@ -52,6 +53,7 @@ Suppose $\theta$ is exactly s-sparse. By choosing $\delta = 0$, if $n>\frac{36}{
 Note that if we want \emph{perfect recovery} of all edges, we must estimate $p_{\min}$ within $\epsilon'$ and set $\eta \defeq p_{\min} - \epsilon'$.

 \subsection{Relaxing the Sparsity Constraint}
+\label{sec:relaxing_sparsity}

 In practice, exact sparsity is rarely verified. For social networks in particular, it is more realistic to assume that each node has few strong `parents' and many `weak' parents. Rather than obtaining an impossibility result in this situation, we show that we pay a small price for relaxing the sparsity constraint. If we let $\theta^*_{\lfloor s \rfloor}$ be the best s-sparse approximation to $\theta^*$, defined as
 $$\theta^*_{\lfloor s \rfloor} \defeq \arg\min_{\|\theta\|_0 \leq s} \|\theta - \theta^*\|_1,$$
@@ -82,11 +84,9 @@ then similarly: ${\cal S}^*_{\eta + \epsilon} \subset \hat {\cal S}_\eta \subset
 \end{corollary}

-\subsection{Examples}
+\subsection{The Independent Cascade Model}

-\subsubsection{Independent Cascade Model}
-
-We begin our analysis with the following simple lemma:
+Recall that to cast the Independent Cascade model as a Generalized Linear Cascade, we performed the change of variables $\forall i,j, \ \Theta_{i,j} = \log(1 - p_{i,j})$. The previous results therefore hold \emph{a priori} only for the ``effective'' parameter $\Theta$. The following lemma shows that they also hold for the original infection probabilities $p_{i,j}$:

 \begin{lemma}
 \label{lem:theta_p_upperbound}
 $\|\theta - \theta'\|_2 \geq \|p - p'\|_2$ and $\|\theta_{\lfloor s \rfloor}\|_1 \geq \|p_{\lfloor s \rfloor}\|_1$
@@ -95,6 +95,4 @@
 Using the inequality $\forall x>0, \; \log x \geq 1 - \frac{1}{x}$, we have, for each coordinate, $|\log (1 - p) - \log (1-p')| \geq \max\left(1 - \frac{1-p}{1-p'}, 1 - \frac{1-p'}{1-p}\right) \geq \max(p-p', p'-p) = |p - p'|$, which gives the first claim. The second result follows from the monotonicity of $\log$ and the inequality $\forall x \in (0,1), \ |\log (1 -x)| \geq x$.
 \end{proof}

-In other words, finding an upper-bound for the estimation error of the `effective' parameters $\theta_{i,j} \defeq \log(1-p_{i,j})$ provides immediately an upper-bound for the estimation error of the true parameters $(p_{i,j})_{i,j}$.
-
-\subsubsection{The Voter Model}
+The results of Sections~\ref{sec:main_theorem} and~\ref{sec:relaxing_sparsity} therefore hold for the original transition probabilities $p$.
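
The thresholding step referenced in the first hunk ("keeping only the coordinates above a certain threshold") and the perfect-recovery rule $\eta \defeq p_{\min} - \epsilon'$ from the second hunk reduce to a one-line filter on the estimated row of $\Theta$. A minimal Python sketch, with all names and numbers invented for illustration:

    import numpy as np

    def recover_support(theta_hat, eta):
        """Recovered edge set for node i: coordinates of the
        estimate whose magnitude clears the threshold eta."""
        return {j for j, v in enumerate(theta_hat) if abs(v) >= eta}

    # Invented values: a 5-coordinate estimate of the row Theta_i,
    # with p_min estimated within epsilon', so eta = p_min - epsilon'.
    theta_hat = np.array([0.00, 0.42, 0.03, 0.55, 0.01])
    p_min, eps_prime = 0.30, 0.05
    print(recover_support(theta_hat, eta=p_min - eps_prime))  # -> {1, 3}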
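
Similarly, the best s-sparse approximation defined in the "Relaxing the Sparsity Constraint" hunk is attained by keeping the $s$ largest-magnitude coordinates of $\theta^*$ and zeroing out the rest; a sketch under the same invented setup:

    import numpy as np

    def best_s_sparse(theta, s):
        """l1-best s-sparse approximation: zero out all but the
        s largest-magnitude coordinates of theta."""
        out = np.zeros_like(theta)
        keep = np.argsort(np.abs(theta))[-s:]  # indices of the s largest |theta_j|
        out[keep] = theta[keep]
        return out

    theta_star = np.array([0.9, -0.05, 0.6, 0.02, -0.8])  # invented weights
    print(best_s_sparse(theta_star, s=2))  # -> [ 0.9  0.   0.   0.  -0.8]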
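
Finally, the Independent Cascade hunk rests on the change of variables $\Theta_{i,j} = \log(1 - p_{i,j})$ and on the coordinate-wise inequality $|\log(1-p) - \log(1-p')| \geq |p - p'|$ from Lemma \ref{lem:theta_p_upperbound}. A quick numerical check of that inequality (probabilities invented):

    import math

    p, p_prime = 0.30, 0.10  # invented infection probabilities for one edge
    theta, theta_prime = math.log(1 - p), math.log(1 - p_prime)

    lhs = abs(theta - theta_prime)  # error on the effective parameter
    rhs = abs(p - p_prime)          # error on the original probability
    assert lhs >= rhs               # the theta-error dominates the p-error
    print(f"|dtheta| = {lhs:.4f} >= |dp| = {rhs:.4f}")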
