-rw-r--r--  paper/paper.tex             2
-rw-r--r--  paper/sections/intro.tex   25
-rw-r--r--  paper/sections/results.tex  8
3 files changed, 14 insertions(+), 21 deletions(-)
diff --git a/paper/paper.tex b/paper/paper.tex
index 068e5d2..ca054dc 100644
--- a/paper/paper.tex
+++ b/paper/paper.tex
@@ -109,7 +109,7 @@ blabla
\section{Discussion}
-\bibliography{example_paper}
+\bibliography{sparse}
\bibliographystyle{icml2015}
\end{document}
diff --git a/paper/sections/intro.tex b/paper/sections/intro.tex
index d86c54e..5507513 100644
--- a/paper/sections/intro.tex
+++ b/paper/sections/intro.tex
@@ -1,21 +1,16 @@
-\begin{itemize}
-\item Small introduction about problem
-\item What is a cascade?
-\item What is our objective?
-\item Motivation for the problem.
-\item Summary of our approach
-\end{itemize}
+Many real-world phenomena can be modeled as complex diffusion processes on networks: the adoption of behaviors, the sharing of internet memes, the citation of articles, and the spread of infectious diseases. Oftentimes, the exact network is unknown to us: we observe only the behavior of the nodes through time, without knowledge of who `influenced' whom. The spread of a particular behavior through a network is known as an {\it Influence Cascade}. In this context, the {\it Graph Inference} problem is to recover the edges of the graph (and, as we will argue, their weights) from the observation of a few influence cascades.
-Parameters of the model:
-\begin{itemize}
-\item $p_{init}$ : multi-source: explain why more reasonable
-\item $p_{min}$
-\item $p_i <$ 1- epsilon?
-\end{itemize}
+An effective algorithm recovers most of the edges correctly from very few cascades. This problem is made all the more interesting by the fact that graph inference has been shown to be possible, under various assumptions, from $\Omega(\mathrm{poly}(\Delta) \log p)$ observed cascades, where $\Delta$ is the maximum degree of the graph and $p$ is the number of nodes. In other words, the dependence on $p$ is (almost miraculously) logarithmic.
+
+Since the first few papers on link prediction in networks, the research community has made good progress in defining the {\it Graph Inference} problem more precisely and in proposing effective algorithms. However, the known results can be improved to $\Omega(\Delta \log p)$ cascades, and it can be shown that this bound is tight to a certain extent. Furthermore, not only can the support of the graph be recovered effectively, but the edge weights themselves can be estimated under the same assumptions, and this guarantee is also tight to a certain extent.
+
+Throughout this paper, we focus on the two main diffusion processes outlined by Kempe et al. (...) in their seminal work [cite]: the independent cascade (IC) model and the linear threshold (LT) model. More generally, we aim to cast the graph inference problem as a {\it generalized linear model} regression problem.
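
As a rough illustration of this reduction (a sketch only, not the exact estimator analyzed in the paper), the following Python snippet simulates IC cascades on a synthetic graph and recovers one node's parents with an L1-regularized logistic regression, used as a GLM surrogate for the IC likelihood; the graph size, edge density, infection probabilities, and regularization strength are made-up example values.

    # Sketch: parent recovery from simulated IC cascades via an L1-regularized
    # logistic regression (a GLM surrogate for the IC likelihood).
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    p, p_edge, p_init, p_min = 50, 0.08, 0.05, 0.3   # example values only

    # Ground-truth weighted graph: theta[i, j] = probability that i infects j.
    A = (rng.random((p, p)) < p_edge) & ~np.eye(p, dtype=bool)
    theta = A * rng.uniform(p_min, 0.8, size=(p, p))

    def run_cascade(theta, p_init, rng):
        """One IC cascade; returns (newly_active, already_active, newly_infected) per step."""
        n_nodes = theta.shape[0]
        active = rng.random(n_nodes) < p_init
        new = active.copy()
        steps = []
        while new.any():
            # A susceptible node j is infected w.p. 1 - prod_{i newly active} (1 - theta[i, j]).
            prob = 1.0 - np.prod(1.0 - theta[new], axis=0)
            infected = (~active) & (rng.random(n_nodes) < prob)
            steps.append((new.astype(float), active.copy(), infected.astype(float)))
            active |= infected
            new = infected
        return steps

    # Pool measurements (one per susceptible node per step) from a few cascades.
    data = [step for _ in range(200) for step in run_cascade(theta, p_init, rng)]

    # For a target node j, regress its next-step state on the newly active nodes.
    j = 0
    X = np.array([new for new, active, _ in data if not active[j]])
    y = np.array([inf[j] for _, active, inf in data if not active[j]])

    if len(np.unique(y)) > 1:
        clf = LogisticRegression(penalty="l1", C=0.5, solver="liblinear").fit(X, y)
        print("estimated parents of node", j, ":", np.flatnonzero(clf.coef_[0] > 1e-3))
        print("true parents of node", j, ":", np.flatnonzero(A[:, j]))
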
+
+\paragraph{Cascades and parameters}
\subsection{Related Work}
-\subsection{Past work}
+\paragraph{Past work}
\begin{itemize}
\item Gomez
@@ -24,7 +19,7 @@ Parameters of the model:
\item Gomez: $d^3 \log n$ + assumptions on Hessian of diffusion process: upper and lower bound on eigenvalues + same proof concept as Netrapalli
\end{itemize}
-\subsection{Our contribution}
+\paragraph{Our contribution}
\begin{itemize}
\item Better assumptions: easy to understand, verify?, and much less restrictive
diff --git a/paper/sections/results.tex b/paper/sections/results.tex
index eadc0ba..ebe90bd 100644
--- a/paper/sections/results.tex
+++ b/paper/sections/results.tex
@@ -7,11 +7,11 @@ Our approach is different. Rather than trying to perform variable selection by f
\forall X \in {\cal C}, \| \Sigma X \|_2^2 \geq \gamma_n \|X\|_2^2 \qquad \ \quad \text{\bf (RE)}
\end{equation}
-We cite the following Theorem from \cite{Negahban:2009}:
+We cite the following Theorem from \cite{Negahban:2009}
\begin{theorem}
\label{thm:neghaban}
-Suppose the true vector $\theta^*$ has support S of size s and the {\bf(RE)} assumption holds for the Hessian $\nabla^2 f(\theta^*)$, then by solving Eq.~\ref{...} for $\lambda_n \geq 2 \|\nabla f(\theta^*)\|_{\infty}$ we have:
+Suppose the true vector $\theta^*$ has support $S$ of size $s$, and that the {\bf(RE)} assumption holds for the Hessian $\nabla^2 f(\theta^*)$. Then, by solving Eq.~\ref{eq:mle} with $\lambda_n \geq 2 \|\nabla f(\theta^*)\|_{\infty}$, we have:
\begin{equation}
\|\hat \theta - \theta^* \|_2 \leq \frac{\sqrt{s}\lambda_n}{\gamma_n}
\end{equation}
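
As a direct instantiation of the theorem (no additional assumptions), taking the smallest admissible regularization level $\lambda_n = 2\|\nabla f(\theta^*)\|_{\infty}$ gives
\begin{equation*}
\|\hat \theta - \theta^* \|_2 \leq \frac{2\sqrt{s}\,\|\nabla f(\theta^*)\|_{\infty}}{\gamma_n},
\end{equation*}
so the estimation error is controlled entirely by a concentration bound on the max-norm of the gradient at the true parameter.
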
@@ -45,11 +45,9 @@ n > \frac{4}{\gamma^2 p_{\min} \eta^2} s \log p
Then with probability $1-e^{-n^\delta \log p}$, $\hat {\cal S}_\eta = {\cal S}^*_{2\eta}$, where ${\cal S}^*_{2\eta} = \{ j \in [1..p] : \theta^*_j > 2 \eta \}$. In other words, we incur no false positives (false parents) and recover all `strong' parents such that $\theta^*_j > 2 \eta$.
\end{corollary}
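
One plausible way to read the corollary off the theorem is sketched below; the concentration rate $\|\nabla f(\theta^*)\|_{\infty} \lesssim \sqrt{\log p / (p_{\min} n)}$ is an assumption made here for illustration, not a statement taken from this excerpt. Under it, the theorem's bound becomes
\begin{equation*}
\|\hat \theta - \theta^* \|_2 \leq \frac{2}{\gamma}\sqrt{\frac{s \log p}{p_{\min} n}} \leq \eta
\qquad \text{whenever} \qquad
n > \frac{4}{\gamma^2 p_{\min} \eta^2}\, s \log p,
\end{equation*}
and since $\|\hat \theta - \theta^*\|_{\infty} \leq \|\hat \theta - \theta^*\|_2 \leq \eta$, thresholding $\hat \theta$ at $\eta$ produces no false parents ($\theta^*_j = 0 \Rightarrow \hat \theta_j \leq \eta$) and keeps every strong parent ($\theta^*_j > 2\eta \Rightarrow \hat \theta_j > \eta$).
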
-Note that $n$ is the number of measurements and not the number of cascades. This is an important improvement over prior work since we expect several measurements per cascade. We claim that graph recovery is better understood as a function of $n$, the cumulative number of steps in each cascade, rather than as a function $N$, the number of individual cascades.
+Note that $n$ is the number of measurements and not the number of cascades. This is an improvement over prior work, since we expect several measurements per cascade. We claim that graph recovery is better understood as a function of $n$, the cumulative number of steps across all cascades, rather than as a function of $N$, the number of individual cascades.
\subsection{Linear Threshold Model}
-
-