Diffstat (limited to 'finale/mid_report.tex')
 finale/mid_report.tex | 43 +++++++++++++++++++++++++++++++------------
 1 file changed, 31 insertions(+), 12 deletions(-)
diff --git a/finale/mid_report.tex b/finale/mid_report.tex
index b7d9608..67aedd0 100644
--- a/finale/mid_report.tex
+++ b/finale/mid_report.tex
@@ -238,7 +238,7 @@ adversarially.}
 
 \begin{example}[Lack of information]
 Let us consider the graph in \cref{fig:ident} and the source distribution
-$p_s$ defined by $p_s\big((0, 1, 0)\big) = 1$.  In other words, node $2$ is
+$p_s$ defined by $p_s\big((0, 1, 0)\big) = 1$. In other words, node $2$ is
 always the only infected node at $t=0$. Then, it is clear that all cascades
 die at $t=1$, at which point node 3 becomes infected with probability
 $f(\theta_{2,3})$. This probability does not depend on $\theta_{1,3}$.
@@ -254,7 +254,7 @@ Let us consider again the graph in \cref{fig:ident} and let us consider the
 case where the source distribution is such that $p_s\big((1,1,0)\big) = 1$,
 \emph{i.e.} nodes 1 and 2 are always jointly infected at time $t=0$. Then it
 is clear that cascades will all die at time $t=1$ where node 3 becomes infected
-with probability $f(\theta_{1,3} + \theta_{2,3})$.  This implies that all
+with probability $f(\theta_{1,3} + \theta_{2,3})$. This implies that all
 matrices $\Theta$ which are compatible with the graph in \cref{fig:ident}
 (defining the same edge set) and for which $\theta_{1,3} + \theta_{2,3} = c$
 for some $c\in \R_+$ define the same cascade probability distribution.
@@ -321,7 +321,7 @@ distributions which satisfy the assumptions of the proposition include
 a single source chosen uniformly at random among the vertices of the graph, or
 a source where each node in $G$ is included independently with some
 probability $p>0$.
-Finally, Assumption 1 can be relaxed to a weaker assumption. For this weaker
+Finally, Assumption 1 can be relaxed to a weaker assumption. Under this weaker
 assumption, Proposition 2 becomes an exact characterization of the
 identifiability of the model. Again, we defer the details to the final version
 of this report.
@@ -332,16 +332,35 @@ of this report.
 
 \subsection{Maximum-Likelihood Estimation}
 \label{MLEsec}
-Because transitions occur independently for each node $i$, the log-likelihood
-is a sum of $k$ terms, each term $i$ only depending on $\bt_i$. Hence, each
-$\bt_i$ can be learnt separately by solving a node-specific optimization
-problem:
+It follows from the form of \cref{eq:dist} that the log-likelihood of $\Theta$
+given cascade data can be written as the sum of $k$ terms, each term $i$ only
+depending on $\bt_i$. Hence, each $\bt_i$ can be learnt separately by solving
+a node-specific optimization problem. Specifically, for a given node $i$, if we
+concatenate all the time steps $t$ at which this node was susceptible and
+write $y^t = x^{t+1}_i$, \emph{i.e.} whether or not the node became infected at
+the next time step, the MLE for $\bt_i$ is obtained by solving the
+following optimization problem:
+\begin{equation}
+  \label{eq:mle}
+  \hat{\theta}\in \argmax_\theta \sum_{t} \Big[ y^t\log f(\theta\cdot x^t)
+  + (1-y^t) \log \big(1 - f(\theta\cdot x^t)\big) \Big]
+\end{equation}
+It is interesting to note that at the node level, MLE inference for the
+GLC model amounts exactly to fitting a Generalized Linear Model. When $f$ is
+log-concave, as is the case in most examples of GLC models, the above
+optimization problem is convex and can be solved exactly and efficiently. The
+code to perform MLE estimation can be found in the appendix, file
+\textsf{mle.py}.
+
+Furthermore, as is usual for maximum-likelihood estimators, we obtain that
+$\hat{\theta}$ is asymptotically consistent under mild assumptions.
 
-We assume $f$ log-concave to have a concave optimization problem. Depending on
-the model, $\theta_{i,j}$ can either be restricted to live in a compact subset
-of $\R_+$. Otherwise, we need to assume that $\lim_{z\to\infty} f(z) = 1$.
-Then, we can apply standard results on consistency and asymptotic normality of
-MLE estimator.
+\begin{proposition}
+  Under the assumptions of Proposition 2, and further assuming that
+  $\lim_{z\to\infty} f(z) = 1$, the optimization program \eqref{eq:mle} has
+  a unique solution $\hat{\theta}$ which is an asymptotically consistent
+  estimator of $\bt_i$.
+\end{proposition}
 
 \subsection{Bayesian Approach}
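The non-identifiability example above is easy to check numerically: any two parameter vectors with the same sum $\theta_{1,3} + \theta_{2,3}$ give node 3 the same infection probability at $t=1$. The snippet below is an illustrative sketch only, assuming the logistic link for $f$ (one common GLC choice); it is not taken from the repository.

    import numpy as np

    def f(z):
        # Logistic link: an assumption here; the report only requires f log-concave.
        return 1.0 / (1.0 + np.exp(-z))

    x0 = np.array([1.0, 1.0])       # nodes 1 and 2 jointly infected at t = 0
    theta_a = np.array([0.3, 0.7])  # (theta_13, theta_23), sum = 1.0
    theta_b = np.array([0.9, 0.1])  # different edge weights, same sum

    # Node 3 is infected at t = 1 with probability f(theta_13 + theta_23),
    # so both parameter choices induce the same cascade distribution.
    assert np.isclose(f(theta_a @ x0), f(theta_b @ x0))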
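The reduction to a Generalized Linear Model claimed in the new \cref{MLEsec} text can also be made concrete. The sketch below solves a small instance of \eqref{eq:mle} with the logistic link by minimizing the negative log-likelihood over the nonnegative orthant; the data is made up for illustration, and this is not the appendix file \textsf{mle.py}, whose contents do not appear in this diff.

    import numpy as np
    from scipy.optimize import minimize

    def f(z):
        return 1.0 / (1.0 + np.exp(-z))

    def neg_log_likelihood(theta, X, y, eps=1e-12):
        # Negative of the MLE objective above, summed over the time steps
        # at which node i was susceptible; eps guards the logarithms.
        p = np.clip(f(X @ theta), eps, 1.0 - eps)
        return -np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

    # Toy data: each row of X is an infection state x^t observed while
    # node i was susceptible; y[t] records whether i got infected at t + 1.
    X = np.array([[1, 0], [1, 1], [0, 1], [1, 1]], dtype=float)
    y = np.array([0.0, 1.0, 0.0, 1.0])

    # theta_{i,j} >= 0, so we constrain to the nonnegative orthant; with f
    # log-concave the problem is convex, so the optimum found is global.
    res = minimize(neg_log_likelihood, x0=np.zeros(X.shape[1]),
                   args=(X, y), bounds=[(0.0, None)] * X.shape[1])
    theta_hat = res.x  # node-level maximum-likelihood estimate of bt_i

With bounds supplied, SciPy defaults to the L-BFGS-B method, which handles the nonnegativity constraint directly; any convex solver would do.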
