In this section, we consider the stable sparse recovery setting of
Section~\ref{sec:relaxing}.  Our goal is to obtain an information-theoretic
lower bound on the number of measurements necessary to approximately recover
the parameter $\theta$ of a cascade model from observed cascades. Similar lower
bounds were obtained for sparse linear inverse problems in \cite{pw11, pw12,
bipw11}.

\begin{theorem}
    \label{thm:lb}
    Consider a cascade model of the form XXX and a recovery algorithm
    $\mathcal{A}$ which takes as input $n$ random cascade measurements and
    outputs $\hat{\theta}$ such that with probability $1-\delta$, for some
    $\delta<\frac{1}{2}$ (over the measurements):
    \begin{displaymath}
        \|\hat{\theta}-\theta^*\|_2\leq
        C\min_{\|\theta\|_0\leq s}\|\theta-\theta^*\|_2
    \end{displaymath}
    where $\theta^*$ is the true parameter of the cascade model. Then $n
    = \Omega(s\log\frac{m}{s}/\log C)$.
\end{theorem}
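
As a sanity check on the form of the bound: for a constant approximation
factor $C = O(1)$ (bounded away from $1$), the theorem reads $n
= \Omega(s\log\frac{m}{s})$, which matches the known lower bound for sparse
linear recovery with constant approximation factor \cite{bipw11}.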

This theorem should be contrasted with Corollary~\ref{cor:relaxing}: up to an additive
$s\log s$ term, the number of measurements required by our algorithm is
tight.

The proof of Theorem~\ref{thm:lb} follows an approach similar to \cite{pw12}.
Let us consider an algorithm $\mathcal{A}$ as in the theorem. Intuitively,
$\mathcal{A}$ allows Alice to communicate $\Omega(s\log\frac{m}{s})$ bits of
information to Bob over a Gaussian channel. By the Shannon--Hartley theorem,
the amount of information that can be transmitted over this channel is
$O(n\log C)$. These two bounds together give the theorem.

Formally, let $\mathcal{F}\subset \{S\subset [1..m]\,|\, |S|=s\}$ be a family
of $s$-sparse supports such that:
\begin{itemize}
    \item $|S\Delta S'|\geq s$ for all $S\neq S'\in\mathcal{F}$,
    \item $\P_{S}[i\in S]=\frac{s}{m}$ for all $i\in [1..m]$, where $S$ is
        chosen uniformly at random from $\mathcal{F}$,
    \item $\log|\mathcal{F}| = \Omega(s\log\frac{m}{s})$.
\end{itemize}
Such a family can be obtained by considering a linear code (see details in
\cite{pw11}). Let $X = \big\{t\in\{-1,0,1\}^m\,|\, \mathrm{supp}(t)\in\mathcal{F}\big\}$.
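
For intuition, here is a sketch of how such a family arises (the precise
construction, which also ensures the uniform-marginal property, is in
\cite{pw11}): partition $[1..m]$ into $s$ blocks of size $\frac{m}{s}$ and let
each $S\in\mathcal{F}$ select exactly one coordinate per block, the selections
being given by the codewords of a code of length $s$, alphabet size
$\frac{m}{s}$, and minimum distance $\frac{s}{2}$. Two distinct codewords
disagree on at least $\frac{s}{2}$ blocks, so $|S\Delta S'|\geq s$, and
a random-coding argument yields
\begin{displaymath}
    |\mathcal{F}| \geq \Big(\frac{m}{s}\Big)^{s/2},
    \qquad\text{hence}\qquad
    \log|\mathcal{F}| \geq \frac{s}{2}\log\frac{m}{s}.
\end{displaymath}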

Consider the following communication game between Alice and Bob.
\begin{itemize}
    \item Alice chooses $S$ uniformly at random from $\mathcal{F}$ and
        $t$ uniformly at random from $X$ subject to $\mathrm{supp}(t) = S$.
    \item Let $w\sim\mathcal{N}(0, \alpha \frac{s}{m}I_m)$ and $\theta
        = t + w$. Since for all $\theta$, $\mathcal{A}$ recovers $\theta$ with
        probability $1-\delta$ over the measurements, there exists a fixed set
        of measurements $x_1,\ldots, x_n$ such that with probability
        $1-\delta$ over $\theta$, $\mathcal{A}$ recovers $\theta$. Alice sends
        those measurements to Bob.
    \item Bob uses $\mathcal{A}$ to recover $\hat{\theta}$ from the
        measurements, then computes $\hat{t}$, the best $\ell_2$ approximation
        of $\hat{\theta}$ in $X$.
\end{itemize}
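
To see why this game implements a Gaussian channel of bounded capacity,
consider the signal-to-noise ratio implicit in Alice's message (a rough
calculation, suppressing constants): since $t\in\{-1,0,1\}^m$ with
$|\mathrm{supp}(t)|=s$ and $w\sim\mathcal{N}(0,\alpha\frac{s}{m}I_m)$,
\begin{displaymath}
    \frac{\|t\|_2^2}{\mathbb{E}\big[\|w\|_2^2\big]}
    = \frac{s}{m\cdot\alpha\frac{s}{m}}
    = \frac{1}{\alpha},
\end{displaymath}
which is $O(C)$ when $\alpha = \Omega(\frac{1}{C})$. The Shannon--Hartley
bound of $\frac{1}{2}\log(1+\mathrm{SNR})$ bits per channel use is what drives
Lemma~\ref{lemma:upperinf} below.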

We have the following two lemmas, from \cite{pw11} and \cite{pw12} respectively.

\begin{lemma}
    \label{lemma:upperinf}
    Let $S'=\mathrm{supp}(\hat{t})$. If $\alpha = \Omega(\frac{1}{C})$, then
    $I(S, S') = O(n\log C)$.
\end{lemma}

\begin{lemma}
    \label{lemma:lowerinf}
    If $\alpha = \Omega(\frac{1}{C})$, then $I(S, S') = \Omega(s\log\frac{m}{s})$.
\end{lemma}

Lemmas~\ref{lemma:lowerinf} and~\ref{lemma:upperinf} together give
Theorem~\ref{thm:lb}.
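Spelled out, the two lemmas sandwich the mutual information, and solving for
$n$ gives the theorem:
\begin{displaymath}
    \Omega\Big(s\log\frac{m}{s}\Big) = I(S,S') = O(n\log C)
    \quad\Longrightarrow\quad
    n = \Omega\bigg(\frac{s\log\frac{m}{s}}{\log C}\bigg).
\end{displaymath}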