commit 928dace484b0f4ffd70229c5e12c892c35f988b1 (HEAD, master)
tree   15901f02ba5997489431d3f6792ada44cc39d13a
parent 6f876aa0fa8596eec71eec640ca9556b48034b6f
author    Thibaut Horel <thibaut.horel@gmail.com>  2015-12-12 00:52:52 -0500
committer Thibaut Horel <thibaut.horel@gmail.com>  2015-12-12 00:52:52 -0500

    Final commit before submission

 finale/final_report.tex         |  5
 finale/sections/appendix.tex    |  3
 finale/sections/experiments.tex | 48
 3 files changed, 45 insertions(+), 11 deletions(-)
diff --git a/finale/final_report.tex b/finale/final_report.tex
index ec0fe9c..bac2d9a 100644
--- a/finale/final_report.tex
+++ b/finale/final_report.tex
@@ -98,8 +98,13 @@ Airoldi and Scott Linderman for fruitful discussions.
\bibliography{sparse}
\bibliographystyle{icml2015}
+\newpage
+
+\vspace*{10em}
\newpage
+
+
\section{Appendix}
\label{sec:appendix}
\input{sections/appendix.tex}
diff --git a/finale/sections/appendix.tex b/finale/sections/appendix.tex
index b53de70..d81f8d5 100644
--- a/finale/sections/appendix.tex
+++ b/finale/sections/appendix.tex
@@ -1,4 +1,5 @@
-\begin{figure*}
+See the figure on the next page.
+\begin{figure*}[h!]
\centering
\subfigure[][50 cascades]{
\includegraphics[scale=.4]{../simulation/plots/2015-11-05_22:52:30.pdf}}\hspace{1em}%
diff --git a/finale/sections/experiments.tex b/finale/sections/experiments.tex
index 534ea7b..00950aa 100644
--- a/finale/sections/experiments.tex
+++ b/finale/sections/experiments.tex
@@ -26,7 +26,10 @@ one for the infected nodes and one for the susceptible nodes, the variational
inference objective can be written as a simple computation graph on matrices
that Theano compiles to machine code optimized for GPU execution.
-\paragraph{Synthetic graph.} Since intuitively if nodes are exchangeable
+The code for our variational inference and MCMC implementations can be
+found at \url{http://thibaut.horel.org/cs281_project.zip}.
+
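To make the computation-graph pattern above concrete, here is a minimal Theano sketch. The mask matrices and the toy objective are placeholders of ours, not the objective from the linked code:

    import theano
    import theano.tensor as T

    infected = T.matrix('infected')        # hypothetical 0/1 mask of infected nodes
    susceptible = T.matrix('susceptible')  # hypothetical 0/1 mask of susceptible nodes
    theta = T.matrix('theta')              # variational means on the edges

    # Stand-in objective: a scalar built from ordinary matrix expressions,
    # which Theano differentiates symbolically and compiles to native
    # (optionally GPU) code.
    p = T.nnet.sigmoid(T.dot(infected, theta))
    objective = T.sum(susceptible * T.log(p)
                      + (1.0 - susceptible) * T.log(1.0 - p))

    value_and_grad = theano.function([infected, susceptible, theta],
                                     [objective, T.grad(objective, theta)])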
+\paragraph{Synthetic graph.} Since, intuitively, the active learning policy
will make little difference from the uniform-source policy when the nodes of
our graph are exchangeable (\emph{i.e.,} their topological properties, such as
degree, are identical), we decided to test our algorithms on an unbalanced graph
@@ -42,19 +45,44 @@ A = \left( \begin{array}{cccccc}
\right)
\end{equation*}
In other words, graph $\mathcal{G}_A$ is a wheel graph where every node, except
-for the center node, points to its (clock-wise) neighbor.
+for the center node, points to its (clockwise) neighbor. Even though the wheel
+graph might not seem realistic, social networks (as well as graphs arising in
+many other real-life applications) are known to exhibit power-law degree
+distributions, which are similarly unbalanced.
+
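For illustration, one plausible NumPy construction of such an adjacency matrix; the exact matrix $A$ is elided from this diff, so the indexing convention and edge weights below are assumptions:

    import numpy as np

    def wheel_adjacency(n):
        """Directed wheel: n rim nodes (0..n-1) plus a center at index n."""
        A = np.zeros((n + 1, n + 1))
        A[n, :n] = 1.0                 # the center points to every rim node
        for i in range(n):
            A[i, (i + 1) % n] = 1.0    # each rim node points clockwise
        return A

With this orientation the graph has $2n$ edges and the center is incident to exactly half of them, which matches the observation about the \emph{out-degree} baseline below.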
+\paragraph{Baselines.} Our experiments compare the active learning policy
+described in Section 4 to two baselines. For easier comparison, both baselines
+are also online learning methods: cascades are generated on the fly and only
+used once by the SGD algorithm to update the approximate posterior parameters.
+The two baselines, sketched in code after the list, are:
+\begin{itemize}
+    \item \emph{uniform source:} for each cascade the source node is chosen
+    uniformly at random among all vertices. Intuitively, this method should
+    behave similarly to offline learning.
+    \item \emph{out-degree:} the source node is selected with probability
+    proportional to the estimated out-degree (\emph{i.e.,} the sum of the
+    estimated means on the outgoing edges).
+\end{itemize}
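As a hedged sketch, both selection rules fit in a few lines of NumPy, assuming a hypothetical matrix theta_hat of estimated edge means:

    import numpy as np

    rng = np.random.default_rng(0)

    def uniform_source(theta_hat):
        # uniform source: every vertex is equally likely
        return rng.integers(theta_hat.shape[0])

    def out_degree_source(theta_hat):
        # out-degree: sample a vertex proportionally to the row sums of
        # theta_hat, i.e. the estimated means on its outgoing edges
        d = theta_hat.sum(axis=1)
        return rng.choice(len(d), p=d / d.sum())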
-\paragraph{Baseline.} In order for the baseline to be fair, we choose to create
-cascades starting from the source node on the fly both in the case of the
-uniform source and for the active learning policy. Each cascade is therefore
-`observed' only once. We plot the RMSE of the graph i.e. $RMSE^2
-= \frac{1}{n^2} \|\hat{\mathbf{\Theta}} - \mathbf{\Theta}\|^2_2$. The results
-are presented in
+Our active learning method, henceforth called \emph{out-variance}, is then
+compared to the two baselines by plotting the RMSE of the estimated graph,
+\emph{i.e.,} $\text{RMSE}^2 = \frac{1}{n^2} \|\hat{\mathbf{\Theta}}
+- \mathbf{\Theta}\|^2_2$, as a function of the number of observed cascades. The
+results are presented in \cref{fig:results}. As we can see, both
+\emph{out-degree} and \emph{out-variance}, which are active in nature, vastly
+outperform \emph{uniform source}, which is passive. Surprisingly,
+\emph{out-degree}, which is seemingly crude since it does not take the current
+uncertainty into account, performs comparably to \emph{out-variance}. We
+believe that this is due to the specific nature of the wheel graph: for this
+graph, \emph{out-degree} repeatedly selects the central node, which is adjacent
+to 50\% of the edges.
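In code, this metric is a one-liner (a sketch, with theta_hat and theta standing for the estimated and true $n \times n$ matrices):

    import numpy as np

    def rmse(theta_hat, theta):
        # RMSE^2 = (1/n^2) ||theta_hat - theta||^2, so the RMSE is the
        # Frobenius norm of the error divided by n
        n = theta.shape[0]
        return np.linalg.norm(theta_hat - theta) / n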
\begin{figure}
\centering
-\label{fig:graphical}
\includegraphics[scale=.4]{finale.png}
-\caption{}
+\caption{RMSE of the proposed active learning heuristic compared to the
+baselines. The $x$-axis shows the number of batches; each batch contains 100
+time steps of the cascade process.}
+\label{fig:results}
\end{figure}