commit 928dace484b0f4ffd70229c5e12c892c35f988b1 (HEAD, master)
tree   15901f02ba5997489431d3f6792ada44cc39d13a
parent 6f876aa0fa8596eec71eec640ca9556b48034b6f
author    Thibaut Horel <thibaut.horel@gmail.com>  2015-12-12 00:52:52 -0500
committer Thibaut Horel <thibaut.horel@gmail.com>  2015-12-12 00:52:52 -0500

    Final commit before submission

 finale/final_report.tex         |  5
 finale/sections/appendix.tex    |  3
 finale/sections/experiments.tex | 48
 3 files changed, 45 insertions(+), 11 deletions(-)
diff --git a/finale/final_report.tex b/finale/final_report.tex
index ec0fe9c..bac2d9a 100644
--- a/finale/final_report.tex
+++ b/finale/final_report.tex
@@ -98,8 +98,13 @@ Airoldi and Scott Linderman for fruitful discussions.
\bibliography{sparse}
\bibliographystyle{icml2015}
+\newpage
+
+\vspace*{10em}
\newpage
+
+
\section{Appendix}
\label{sec:appendix}
\input{sections/appendix.tex}
diff --git a/finale/sections/appendix.tex b/finale/sections/appendix.tex
index b53de70..d81f8d5 100644
--- a/finale/sections/appendix.tex
+++ b/finale/sections/appendix.tex
@@ -1,4 +1,5 @@
-\begin{figure*}
+See the figure on the next page.
+\begin{figure*}[h!]
\centering
\subfigure[][50 cascades]{
\includegraphics[scale=.4]{../simulation/plots/2015-11-05_22:52:30.pdf}}\hspace{1em}%
diff --git a/finale/sections/experiments.tex b/finale/sections/experiments.tex
index 534ea7b..00950aa 100644
--- a/finale/sections/experiments.tex
+++ b/finale/sections/experiments.tex
@@ -26,7 +26,10 @@ one for the infected nodes and one for the susceptible nodes, the variational
inference objective can be written as a simple computation graph on matrices
that Theano compiles to machine code optimized for GPU execution.
-\paragraph{Synthetic graph.} Since intuitively if nodes are exchangeable
+The code for our variational inference and MCMC implementations can be
+found at \url{http://thibaut.horel.org/cs281_project.zip}.
+
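To make the computation-graph pattern above concrete, here is a minimal Theano sketch. The mask matrices and the toy objective are placeholders of ours, not the objective from the linked code:

    import theano
    import theano.tensor as T

    infected = T.matrix('infected')        # hypothetical 0/1 mask of infected nodes
    susceptible = T.matrix('susceptible')  # hypothetical 0/1 mask of susceptible nodes
    theta = T.matrix('theta')              # variational means on the edges

    # Stand-in objective: a scalar built from ordinary matrix expressions,
    # which Theano differentiates symbolically and compiles to native
    # (optionally GPU) code.
    p = T.nnet.sigmoid(T.dot(infected, theta))
    objective = T.sum(susceptible * T.log(p)
                      + (1.0 - susceptible) * T.log(1.0 - p))

    value_and_grad = theano.function([infected, susceptible, theta],
                                     [objective, T.grad(objective, theta)])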
+\paragraph{Synthetic graph.} Since, intuitively, the active learning policy
will make little difference from the uniform-source policy when the nodes of
our graph are exchangeable (\emph{i.e.,} their topological properties, such as
degree, are identical), we decided to test our algorithms on an unbalanced graph
@@ -42,19 +45,44 @@ A = \left( \begin{array}{cccccc}
\right)
\end{equation*}
In other words, graph $\mathcal{G}_A$ is a wheel graph where every node, except
-for the center node, points to its (clock-wise) neighbor.
+for the center node, points to its (clockwise) neighbor. Even though the wheel
+graph might not seem realistic, social networks (as well as graphs arising in
+many other real-life applications) are known to exhibit power-law degree
+distributions, which are similarly unbalanced.
+
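For illustration, one plausible NumPy construction of such an adjacency matrix; the exact matrix $A$ is elided from this diff, so the indexing convention and edge weights below are assumptions:

    import numpy as np

    def wheel_adjacency(n):
        """Directed wheel: n rim nodes (0..n-1) plus a center at index n."""
        A = np.zeros((n + 1, n + 1))
        A[n, :n] = 1.0                 # the center points to every rim node
        for i in range(n):
            A[i, (i + 1) % n] = 1.0    # each rim node points clockwise
        return A

With this orientation the graph has $2n$ edges and the center is incident to exactly half of them, which matches the observation about the \emph{out-degree} baseline below.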
+\paragraph{Baselines.} Our experiments compare the active learning policy
+described in Section 4 to two baselines. For easier comparison, both baselines
+are also online learning methods: cascades are generated on the fly and only
+used once by the SGD algorithm to update the approximate posterior parameters.
+The two baselines, sketched in code after the list, are:
+\begin{itemize}
+    \item \emph{uniform source:} for each cascade the source node is chosen
+    uniformly at random among all vertices. Intuitively, this method should
+    behave similarly to offline learning.
+    \item \emph{out-degree:} the source node is selected with probability
+    proportional to the estimated out-degree (\emph{i.e.,} the sum of the
+    estimated means on the outgoing edges).
+\end{itemize}
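As a hedged sketch, both selection rules fit in a few lines of NumPy, assuming a hypothetical matrix theta_hat of estimated edge means:

    import numpy as np

    rng = np.random.default_rng(0)

    def uniform_source(theta_hat):
        # uniform source: every vertex is equally likely
        return rng.integers(theta_hat.shape[0])

    def out_degree_source(theta_hat):
        # out-degree: sample a vertex proportionally to the row sums of
        # theta_hat, i.e. the estimated means on its outgoing edges
        d = theta_hat.sum(axis=1)
        return rng.choice(len(d), p=d / d.sum())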
-\paragraph{Baseline.} In order for the baseline to be fair, we choose to create
-cascades starting from the source node on the fly both in the case of the
-uniform source and for the active learning policy. Each cascade is therefore
-`observed' only once. We plot the RMSE of the graph i.e. $RMSE^2
-= \frac{1}{n^2} \|\hat{\mathbf{\Theta}} - \mathbf{\Theta}\|^2_2$. The results
-are presented in
+Our active learning method, henceforth called \emph{out-variance}, is then
+compared to the two baselines by plotting the RMSE of the estimated graph,
+\emph{i.e.,} $\text{RMSE}^2 = \frac{1}{n^2} \|\hat{\mathbf{\Theta}}
+- \mathbf{\Theta}\|^2_2$, as a function of the number of observed cascades. The
+results are presented in \cref{fig:results}. As we can see, both
+\emph{out-degree} and \emph{out-variance}, which are active in nature, vastly
+outperform \emph{uniform source}, which is passive. Surprisingly,
+\emph{out-degree}, which is seemingly crude since it does not take the current
+uncertainty into account, performs comparably to \emph{out-variance}. We
+believe that this is due to the specific nature of the wheel graph: for this
+graph, \emph{out-degree} repeatedly selects the central node, which is adjacent
+to 50\% of the edges.
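In code, this metric is a one-liner (a sketch, with theta_hat and theta standing for the estimated and true $n \times n$ matrices):

    import numpy as np

    def rmse(theta_hat, theta):
        # RMSE^2 = (1/n^2) ||theta_hat - theta||^2, so the RMSE is the
        # Frobenius norm of the error divided by n
        n = theta.shape[0]
        return np.linalg.norm(theta_hat - theta) / n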
\begin{figure}
\centering
-\label{fig:graphical}
\includegraphics[scale=.4]{finale.png}
-\caption{}
+\caption{RMSE of the proposed active learning heuristic compared to the
+baselines. The $x$-axis shows the number of batches; each batch contains 100
+time steps of the cascade process.}
+\label{fig:results}
\end{figure}