paper/sections/assumptions.tex


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51

In this section, we discuss the main assumption of Theorem~\ref{thm:neghaban} namely the restricted eigenvalue condition. We begin by comparing to the irrepresentability condition considered in \cite{Daneshmand:2014}.

\subsection{The Irrepresentability Condition}

\cite{Daneshmand:2014} rely on an `incoherence' condition on the hessian of the likelihood function. It is in fact easy to see that their condition is equivalent to the more commonly called {\it (S,s)-irrepresentability} condition:

\begin{definition}
Following similar notation, let $Q^* \defeq \nabla^2 f(\theta^*)$. Let $S$ and $S^c$ be the set of indices of all the parents and non-parents respectively and $Q_{S,S}$, $Q_{S^c,S}$, $Q_{S, S^c}$, and $Q_{S^c, S^c}$ the induced sub-matrices. Consider the following constant:
\begin{equation}
\nu_{\text{irrepresentable}}(S) \defeq \max_{\tau \in \mathbb{R}^p \ :\ \| \tau \|_{\infty} \leq 1} \|Q_{S^c, S} Q_{S, S}^{-1} \tau\|_{\infty}
\end{equation}
The (S,s)-irrepresentability holds if $\nu_{\text{irrepresentable}}(S) < 1 - \epsilon$ for $\epsilon > 0$
\end{definition}

This condition has been shown to be essentially necessary for variable selection \cite{Zhao:2006}. Several recent papers \cite{vandegeer:2011}, \cite{Zou:2006}, argue this condition is unrealistic in situations where there is a correlation between variables. Consider the following simplified example from \cite{vandegeer:2011}:
\begin{equation}
\nonumber
\left( 
\begin{array}{cccc}
I_s & \rho J \\
\rho J & I_s \\
\end{array}
\right)
\end{equation}
where $I_s$ is the $s \times s$ identity matrix, $J$ is the all-ones matrix and $\rho \in \mathbb{R}^+$. It is easy to see that $\nu_{\text{irrepresentable}}(S) = \rho s$ and $\lambda_{\min}(Q) \geq 1 - \rho$, such that for any $\rho > \frac{1}{s}$ and $\rho < 1$, the restricted eigenvalue holds trivially but the (S,s)-irrepresentability does not hold.

It is intuitive that the irrepresentability condition is stronger than the {\bf(RE)} assumption. In fact, a slightly modified result from \cite{vandegeer:2009}shows that a `strong' irrepresentability condition directly {\it implies} the {\bf(RE)} condition for $\ell2$-recovery:

\begin{proposition}
\label{prop:irrepresentability}
If the irrepresentability condition holds with $\epsilon > \frac{2}{3}$, then the restricted eigenvalue condition holds with constant  $\gamma_n \geq \frac{ (1 - 3(1 -\epsilon))^2 \lambda_{\min}^2}{4s}n$, where $\lambda_{\min} > 0$ is the smallest eigenvalue of $Q^*_{S,S}$, on which the results of \cite{Daneshmand:2014} also depend.
\end{proposition}


\subsection{The Restricted Eigenvalue Condition}

In practical scenarios, such as in social networks, recovering only the `significant' edges is a reasonable assumption. This can be done under the less restrictive eigenvalue assumption. Expressing $\gamma$ as a function of the cascade model parameters process is non-trivial. The restricted eigenvalue property is however well behaved in the following sense:

\begin{lemma}
\label{lem:expected_hessian}
Expected hessian analysis!
\end{lemma}

This result is easily proved by adapting slightly a result from \cite{vandegeer:2009} XXX. Similarly to the analysis conducted in \cite{Daneshmand:2014}, we can show that if the eigenvalue can be showed to hold for the `expected' hessian, it can be showed to hold for the hessian itself. It is easy to see that:

\begin{proposition}
\label{prop:expected_hessian}
If result holds for the expected hessian, then it holds for the hessian!
\end{proposition}

It is most likely possible to remove this extra s factor. See sub-gaussian paper by ... but the calculations are more involved.