 paper/paper.tex            |  1 +
 paper/sections/results.tex | 26 ++++++++++++++++++++++----
 2 files changed, 23 insertions(+), 4 deletions(-)
diff --git a/paper/paper.tex b/paper/paper.tex
index 4a25484..1201b0f 100644
--- a/paper/paper.tex
+++ b/paper/paper.tex
@@ -59,6 +59,7 @@
\newcommand{\inprod}[2]{#1 \cdot #2}
\newcommand{\defeq}{\equiv}
\DeclareMathOperator*{\argmax}{argmax}
+\DeclareMathOperator*{\argmin}{argmin}
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}{Lemma}
diff --git a/paper/sections/results.tex b/paper/sections/results.tex
index cd0a5f8..9c356c2 100644
--- a/paper/sections/results.tex
+++ b/paper/sections/results.tex
@@ -198,12 +198,30 @@ problem in the following manner: pick $\epsilon = \frac{\eta}{2}$ and $\eta
$n=\Omega\left(\frac{s\log m}{\alpha^3\gamma^2}\right)$. That is, the support
of $\theta^*$ can be found by thresholding $\hat{\theta}$ to the level $\eta$.
-\subsection{Approximate sparsity}
+\subsection{Approximate Sparsity}
\label{sec:relaxing_sparsity}
-In practice, exact sparsity is rarely verified. For social networks in particular, it is more realistic to assume that each node has few strong `parents' and many `weak' parents. Rather than obtaining an impossibility result in this situation, we show that we pay a small price for relaxing the sparsity constraint. If we let $\theta^*_{\lfloor s \rfloor}$ be the best s-sparse approximation to $\theta^*$ defined as
-$$\theta^*_{\lfloor s \rfloor} \defeq \min_{\|\theta\|_0 \leq s} \|\theta - \theta^*\|_1$$
-then we pay a price proportional to $\|\theta^*_s\|_1$ for recovering the weights of non-exactly sparse vectors, i.e. the sum of the coefficients of $\theta$ save for the $s$ largest. In other words, the closer $\theta^*$ is to being sparse, the smaller the price. These results are formalized in the following theorem, which is also a consequence of Theorem 1 in \cite{Negahban:2009} and lemma~\ref{lem:icc_lambda_upper_bound}.
+In practice, exact sparsity rarely holds. For social networks in
+particular, it is more realistic to assume that each node has a few
+``strong'' parents and many ``weak'' parents. In other words, even if
+$\theta^*$ is not exactly $s$-sparse, it can be well approximated by
+$s$-sparse vectors.
+
+Rather than obtaining an impossibility result, we show that the bounds obtained
+in Section~\ref{sec:main_theorem} degrade gracefully in this setting. Formally,
+let
+\begin{displaymath}
+ \theta^*_{\lfloor s \rfloor}
+ \in \argmin_{\|\theta\|_0 \leq s} \|\theta - \theta^*\|_1
+\end{displaymath}
+be the best $s$-sparse approximation to $\theta^*$. Then we pay a cost
+proportional to $\|\theta^* - \theta^*_{\lfloor s\rfloor}\|_1$ for recovering
+the weights of non-exactly sparse vectors. This cost is simply the ``tail''
+of $\theta^*$: the sum of the absolute values of its $m-s$
+smallest-magnitude coordinates.
+
+In other words, the closer $\theta^*$ is to being sparse, the smaller the
+cost, and we recover the results of Section~\ref{sec:main_theorem} in the
+limit of exact sparsity. These results are formalized in the following
+theorem, which is also a consequence of Theorem 1 in \cite{Negahban:2009}.
\begin{theorem}
\label{thm:approx_sparse}
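
As a concrete illustration of the quantity defined in the hunk above: under
any entrywise norm, a best $s$-sparse approximation is obtained by keeping
the $s$ largest-magnitude coordinates of $\theta^*$ and zeroing the rest, so
the $\ell_1$ tail is the sum of the remaining magnitudes. A minimal NumPy
sketch of this, with a purely illustrative weight vector (not taken from the
paper):

import numpy as np

def best_s_sparse(theta, s):
    # Keep the s largest-magnitude entries and zero the rest; this
    # minimizes ||t - theta||_1 over all s-sparse vectors t.
    approx = np.zeros_like(theta)
    if s > 0:
        keep = np.argsort(np.abs(theta))[-s:]  # indices of the s largest |theta_i|
        approx[keep] = theta[keep]
    return approx

# Hypothetical "few strong, many weak parents" weight vector.
theta_star = np.array([0.9, -0.7, 0.05, 0.02, -0.01])
theta_floor_s = best_s_sparse(theta_star, s=2)   # [0.9, -0.7, 0, 0, 0]
tail = np.abs(theta_star - theta_floor_s).sum()  # l1 tail = 0.05 + 0.02 + 0.01 = 0.08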