diff options
Diffstat (limited to 'paper/sections/assumptions.tex')
| -rw-r--r-- | paper/sections/assumptions.tex | 46 |
1 files changed, 0 insertions, 46 deletions
diff --git a/paper/sections/assumptions.tex b/paper/sections/assumptions.tex index 33d8e69..e69de29 100644 --- a/paper/sections/assumptions.tex +++ b/paper/sections/assumptions.tex @@ -1,46 +0,0 @@ -In this section, we discuss the main assumption of Theorem~\ref{thm:neghaban} namely the restricted eigenvalue condition. We then compare to the irrepresentability condition considered in \cite{Daneshmand:2014}. - -\subsection{The Restricted Eigenvalue Condition} - -The restricted eigenvalue condition, introduced in \cite{bickel:2009}, is one of the weakest sufficient condition on the design matrix for successful sparse recovery \cite{vandegeer:2009}. Several recent papers show that large classes of correlated designs obey the restricted eigenvalue property with high probability \cite{raskutti:10} \cite{rudelson:13}. - -Expressing the minimum restricted eigenvalue $\gamma$ as a function of the cascade model parameters is highly non-trivial. Yet, the restricted eigenvalue property is however well behaved in the following sense: under reasonable assumptions, if the population matrix of the hessian $\mathbb{E} \left[\nabla^2 {\cal L}(\theta) \right]$, corresponding to the \emph{Fisher Information Matrix} of the Cascade Model as a function of $\Theta$, verifies the restricted eigenvalue property, then the finite sample hessian also verifies the restricted eigenvalue property with overwhelming probability. It is straightforward to show this holds when $n \geq C s^2 \log m$ \cite{vandegeer:2009}, where $C$ is an absolute constant. By adapting Theorem 8 \cite{rudelson:13}\footnote{this result still needs to be confirmed!}, this can be reduced to: - -$$n \geq C s \log m \log^3 \left( \frac{s \log m}{C'} \right)$$ - -where $C, C'$ are constants not depending on $(s, m, n)$. - - -\subsection{The Irrepresentability Condition} - -\cite{Daneshmand:2014} rely on an `incoherence' condition on the hessian of the likelihood function. It is in fact easy to see that their condition is equivalent to the more commonly called {\it (S,s)-irrepresentability} condition: - -\begin{definition} -Following similar notation, let $Q^* \defeq \nabla^2 f(\theta^*)$. Let $S$ and $S^c$ be the set of indices of all the parents and non-parents respectively and $Q_{S,S}$, $Q_{S^c,S}$, $Q_{S, S^c}$, and $Q_{S^c, S^c}$ the induced sub-matrices. Consider the following constant: -\begin{equation} -\nu_{\text{irrepresentable}}(S) \defeq \max_{\tau \in \mathbb{R}^p \ :\ \| \tau \|_{\infty} \leq 1} \|Q_{S^c, S} Q_{S, S}^{-1} \tau\|_{\infty} -\end{equation} -The (S,s)-irrepresentability holds if $\nu_{\text{irrepresentable}}(S) < 1 - \epsilon$ for $\epsilon > 0$ -\end{definition} - -This condition has been shown to be essentially necessary for variable selection \cite{Zhao:2006}. Several recent papers \cite{vandegeer:2011}, \cite{Zou:2006}, argue this condition is unrealistic in situations where there is a correlation between variables. Consider the following simplified example from \cite{vandegeer:2011}: -\begin{equation} -\nonumber -\left( -\begin{array}{cccc} -I_s & \rho J \\ -\rho J & I_s \\ -\end{array} -\right) -\end{equation} -where $I_s$ is the $s \times s$ identity matrix, $J$ is the all-ones matrix and $\rho \in \mathbb{R}^+$. It is easy to see that $\nu_{\text{irrepresentable}}(S) = \rho s$ and $\lambda_{\min}(Q) \geq 1 - \rho$, such that for any $\rho > \frac{1}{s}$ and $\rho < 1$, the restricted eigenvalue holds trivially but the (S,s)-irrepresentability does not hold. - -It is intuitive that the irrepresentability condition is stronger than the {\bf(RE)} assumption. In fact, a slightly modified result from \cite{vandegeer:2009} shows that a `strong' irrepresentability condition directly {\it implies} the {\bf(RE)} condition for $\ell2$-recovery: - -\begin{proposition} -\label{prop:irrepresentability} -If the irrepresentability condition holds with $\epsilon > \frac{2}{3}$, then the restricted eigenvalue condition holds with constant $\gamma_n \geq \frac{ (1 - 3(1 -\epsilon))^2 \lambda_{\min}^2}{4s}n$, where $\lambda_{\min} > 0$ is the smallest eigenvalue of $Q^*_{S,S}$, on which the results of \cite{Daneshmand:2014} also depend. -\end{proposition} - - - |
