author    jeanpouget-abadie <jean.pougetabadie@gmail.com>  2015-12-09 18:21:25 -0500
committer jeanpouget-abadie <jean.pougetabadie@gmail.com>  2015-12-09 18:21:25 -0500
commit    0df73a4639d5de09ace66b003f266ce1afa734b4 (patch)
tree      061505646c897212937da8f396c3833294c96d27
parent    0c7edbab0eddcc8b3d568f5e43ecc4b6a22251f0 (diff)
download  cascades-0df73a4639d5de09ace66b003f266ce1afa734b4.tar.gz
bayesian section first pass
-rw-r--r--  finale/final_report.tex        1
-rw-r--r--  finale/graphical.pdf           bin 0 -> 168270 bytes
-rw-r--r--  finale/sections/bayesian.tex   65
3 files changed, 61 insertions, 5 deletions
diff --git a/finale/final_report.tex b/finale/final_report.tex
index 9dc0b30..9e1e33c 100644
--- a/finale/final_report.tex
+++ b/finale/final_report.tex
@@ -47,6 +47,7 @@
\input{sections/intro.tex}
\section{Model}
+\label{sec:model}
\input{sections/model.tex}
\section{Bayesian Inference}
diff --git a/finale/graphical.pdf b/finale/graphical.pdf
new file mode 100644
index 0000000..fcdbf8b
--- /dev/null
+++ b/finale/graphical.pdf
Binary files differ
diff --git a/finale/sections/bayesian.tex b/finale/sections/bayesian.tex
index 04ba7b3..1e4caf7 100644
--- a/finale/sections/bayesian.tex
+++ b/finale/sections/bayesian.tex
@@ -1,9 +1,64 @@
-advantages, disadvantages
+\begin{figure}
+\centering
+\includegraphics[scale=.8]{graphical.pdf}
+\caption{Graphical model representation of the Network Inference Problem with
+  edge weights $\theta_{ij}$, cascade indicator vectors $X^c_t$, and edge prior
+parameters $\mu$ and $\sigma$. The source distribution, parameterized by $\phi$,
+is considered fixed here.}
+\label{fig:graphical}
+\end{figure}
-graphical model and description
+In this section, we develop a Bayesian approach to the Network Inference Problem
+by placing priors on the edge weights of the graph. The quantity of interest is
+the posterior distribution, given by Bayes' rule:
+\begin{equation}
+ \label{eq:bayesrule}
+  p(\Theta | \bx) \propto \text{prior}(\Theta) \times \mathcal{L}_\Theta(\bx)
+\end{equation}
+where $\mathcal{L}_\Theta(\bx)$ is the likelihood expressed in
+Eq.~\ref{eq:dist}.
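+In practice, the algorithms below work with the un-normalized log-posterior,
+$\log \text{prior}(\Theta) + \log \mathcal{L}_\Theta(\bx)$, which avoids the
+intractable normalizing constant of Eq.~\ref{eq:bayesrule}.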
-MCMC
+One advantage of the Bayesian approach is its ability to convey information
+about the uncertainty surrounding each edge parameter. In the next section, we
+will explore how to exploit this knowledge to improve the rate at which we
+decrease our uncertainty by focusing on the most relevant parts of the network.
-variational inference
+Another advantage of the Bayesian approach is the ability to encode
+domain knowledge through well-chosen prior distributions. For example, there is
+an extensive literature~\cite{} on parametric representations of social
+networks, which attempt to reproduce certain properties of such networks:
+density of triangles, diameter, degree distribution, clustering coefficient,
+etc. Accounting for known graph properties, such as reciprocal links or the
+high density of triangles, has the potential to greatly increase the
+information we leverage from each cascade. Of course, such priors no longer
+allow us to perform inference in parallel, a property that was leveraged in
+prior work.
-Bohning
+A systematic study of non-product priors is left for future work. We focus on
+product priors in the case of the IC model presented in Section~\ref{sec:model},
+for which no conjugate priors exist:
+\begin{equation}
+ \label{eq:gaussianprior}
+ \text{prior}(\Theta) = \prod_{ij} \mathcal{N}^+(\theta_{ij} | \mu_{ij},
+ \sigma_{ij})
+\end{equation}
+where $\mathcal{N}^+(\cdot)$ is a Gaussian truncated to lie on $\mathbb{R}^+$,
+since $\Theta$ is a transformed parameter $z \mapsto -\log(1 - z)$. This model
+is represented in the graphical model of Figure~\ref{fig:graphical}.
+
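+As a concrete illustration of Eq.~\ref{eq:gaussianprior}, a minimal sketch of
+the log-prior evaluation (assuming the edge parameters and hyper-parameters
+are stored in flat arrays; variable names are illustrative):
+\begin{verbatim}
+import numpy as np
+from scipy.stats import truncnorm
+
+def log_prior(theta, mu, sigma):
+    """Sum of truncated-Gaussian log-densities, one per edge (i, j).
+
+    theta, mu, sigma: flat arrays holding theta_ij and the prior
+    hyper-parameters mu_ij, sigma_ij for every edge.
+    """
+    # scipy expresses the truncation bounds in standardized units,
+    # so the lower bound 0 becomes a = (0 - mu) / sigma.
+    a = (0.0 - mu) / sigma
+    return np.sum(truncnorm.logpdf(theta, a, np.inf, loc=mu, scale=sigma))
+\end{verbatim}
+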
+Since the IC model likelihood has no conjugate family, the prior in
+Eq.~\ref{eq:gaussianprior} is also non-conjugate. We will resort to sampling
+algorithms (MCMC) and approximate Bayesian methods (variational inference),
+which we cover here.
+
+\paragraph{MCMC}
+The Metropolis-Hastings algorithm, an instance of Markov chain Monte Carlo
+(MCMC), allows us to draw samples from the posterior directly using only the
+un-normalized posterior distribution. The advantages of this method are the
+ability to sample from the exact posterior and the wide availability of
+software packages that work `out-of-the-box'. However, vanilla MCMC scales
+poorly and is unsuitable for Bayesian learning of large networks ($\geq 100$
+nodes). We therefore resort to fitting an approximate posterior distribution
+using a variational inference algorithm.
+
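+As an illustrative sketch (not a production implementation), a random-walk
+Metropolis sampler over the edge weights needs only the un-normalized
+log-posterior; \texttt{log\_post} below stands for the log-prior plus the
+cascade log-likelihood of Eq.~\ref{eq:dist}, treated as a black box:
+\begin{verbatim}
+import numpy as np
+
+def metropolis(log_post, theta0, n_steps, step=0.05, seed=0):
+    """Random-walk Metropolis over edge weights theta >= 0."""
+    rng = np.random.default_rng(seed)
+    theta, lp = theta0.copy(), log_post(theta0)
+    samples = []
+    for _ in range(n_steps):
+        prop = theta + step * rng.standard_normal(theta.shape)
+        # Proposals outside the support R^+ have zero posterior
+        # density and are always rejected.
+        if np.all(prop >= 0):
+            lp_prop = log_post(prop)
+            if np.log(rng.random()) < lp_prop - lp:
+                theta, lp = prop, lp_prop
+        samples.append(theta.copy())
+    return np.array(samples)
+\end{verbatim}
+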
+\paragraph{Variational Inference}
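+As a sketch of the standard formulation (the factorized family below is an
+illustrative assumption), variational inference posits an approximation
+$q(\Theta) = \prod_{ij} q_{ij}(\theta_{ij})$ and maximizes the evidence lower
+bound
+\begin{equation*}
+  \mathrm{ELBO}(q) = \mathbb{E}_q\!\left[\log \mathcal{L}_\Theta(\bx)
+  + \log \text{prior}(\Theta) - \log q(\Theta)\right],
+\end{equation*}
+which lower-bounds the log-evidence $\log p(\bx)$ and is maximized when $q$
+matches the posterior.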
+
+\paragraph{B\"ohning bounds}
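+For reference, we state only the generic scalar inequality (how it enters the
+cascade likelihood is deferred): B\"ohning's bound exploits the fact that
+$f(x) = \log(1 + e^x)$ has curvature $f''(x) = \sigma(x)(1 - \sigma(x)) \leq
+1/4$, so for any expansion point $\psi$,
+\begin{equation*}
+  \log(1 + e^x) \leq \log(1 + e^\psi) + \sigma(\psi)(x - \psi)
+  + \tfrac{1}{8}(x - \psi)^2,
+\end{equation*}
+a quadratic upper bound whose Gaussian expectations are available in closed
+form.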