author     Thibaut Horel <thibaut.horel@gmail.com>  2015-12-11 21:15:47 -0500
committer  Thibaut Horel <thibaut.horel@gmail.com>  2015-12-11 21:15:47 -0500
commit     6d9f6bad8fea9e062b2427f0950c662f3f83fd4a (patch)
tree       8d1516d4b96621c7eb8e949bc55c4f427b95932d /finale
parent     7ea0f81933617fa188e0eddf603bb69a05a66c53 (diff)
download   cascades-6d9f6bad8fea9e062b2427f0950c662f3f83fd4a.tar.gz
Finish polishing Bayesian section
Diffstat (limited to 'finale')
-rw-r--r--  finale/sections/bayesian.tex  |  18
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/finale/sections/bayesian.tex b/finale/sections/bayesian.tex
index 5d95ddb..5ec04fd 100644
--- a/finale/sections/bayesian.tex
+++ b/finale/sections/bayesian.tex
@@ -82,14 +82,16 @@ bound on the log marginal likelihood:
\{\mathbf{x}_c\}) \leq \log p_\Theta(\{ \mathbf{x}_c\})
\end{equation}
-Contrary to MCMC which outputs samples from the exact posterior given all
-observed data, the variational inference approach allows us to process data in
-batches to provide an analytical approximation to the posterior, thus improving
-scalability. In many cases, however, the expectation term cannot be found in
-closed-form, and approximation by sampling does not scale well with the number
-of parameters, but we can borrow ideas from Bohning~\cite{} to propose a
-linear/quadratic approximation to the log-likelihood for which the expectation
-term can be written analytically.
+\emph{Computational considerations.} Contrary to MCMC, which outputs samples
+from the exact posterior given all observed data, optimizing the variational
+inference objective is amenable to batch learning. For example, when using
+Stochastic Gradient Descent (SGD), the observations can be processed one at
+a time to incrementally approximate the posterior, thus improving scalability.
+In many cases, however, the expectation term cannot be found in closed form,
+and approximation by sampling does not scale well with the number of
+parameters. In those cases, we can borrow ideas from Bohning~\cite{} to propose
+a linear/quadratic approximation to the log-likelihood for which the
+expectation term can be written analytically and efficiently optimized.
\subsection{Example}
\label{sec:example}
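
Note on the batch-learning claim in the added paragraph: below is a minimal sketch of processing observations one at a time with SGD on a single-observation (unbiased) ELBO estimate. It uses a toy conjugate Gaussian model (x_i ~ N(theta, 1), prior theta ~ N(0, 1)), not the paper's cascade model, so the per-step gradients are available in closed form; all names (theta_true, m, rho, lr) are illustrative only.

import numpy as np

# Toy model (illustration only, not the paper's model):
#   x_i ~ N(theta, 1),  prior theta ~ N(0, 1),
#   Gaussian variational posterior q(theta) = N(m, exp(rho)^2).
rng = np.random.default_rng(0)
theta_true = 2.0
data = rng.normal(theta_true, 1.0, size=1000)   # observations, processed one at a time
N = len(data)

m, rho = 0.0, 0.0    # variational mean and log standard deviation
t = 0
for epoch in range(20):
    for x in rng.permutation(data):
        t += 1
        lr = 1.0 / (N + t)          # decaying (Robbins-Monro style) step size
        s2 = np.exp(2.0 * rho)
        # Gradient of the single-observation ELBO estimate:
        #   N * E_q[log p(x | theta)] - KL(q || prior)
        grad_m = N * (x - m) - m
        grad_rho = 1.0 - (N + 1) * s2
        m += lr * grad_m
        rho += lr * grad_rho

# Exact conjugate posterior for comparison: mean N*xbar/(N+1), variance 1/(N+1)
print(m, np.exp(2.0 * rho), N * data.mean() / (N + 1), 1.0 / (N + 1))

With the decaying step size the iterates approach the exact Gaussian posterior up to residual SGD noise, which is the scalability benefit the paragraph alludes to: each update touches a single observation.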
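The "linear/quadratic approximation to the log-likelihood" borrowed from Bohning presumably refers to the fixed-curvature quadratic bound for logistic-type likelihoods (the empty \cite{} is left as in the source). A sketch for the binary case, assuming the likelihood enters through a linear predictor \eta; the symbols \eta, \xi, y, \mu, s are introduced here for illustration and do not appear in the excerpt.

% Sketch (binary case): quadratic bound with fixed curvature.
% For f(\eta) = \log(1 + e^{\eta}) and any expansion point \xi,
% f''(\eta) = \sigma(\eta)(1 - \sigma(\eta)) \le 1/4, so Taylor's theorem
% with the curvature bound gives, for all \eta,
\begin{align}
  f(\eta) &\le f(\xi) + \sigma(\xi)(\eta - \xi) + \tfrac{1}{8}(\eta - \xi)^2,\\
  \log p(y \mid \eta) = y\eta - f(\eta)
    &\ge y\eta - f(\xi) - \sigma(\xi)(\eta - \xi) - \tfrac{1}{8}(\eta - \xi)^2.
\end{align}
% Under a Gaussian variational factor q(\eta) = \mathcal{N}(\mu, s^2),
% the expectation of this lower bound is analytic in (\mu, s^2):
\begin{equation}
  \mathbb{E}_q\bigl[\log p(y \mid \eta)\bigr]
  \ge y\mu - f(\xi) - \sigma(\xi)(\mu - \xi)
    - \tfrac{1}{8}\bigl((\mu - \xi)^2 + s^2\bigr).
\end{equation}

Because the bound is quadratic in \eta, its expectation under a Gaussian variational factor is available in closed form, which is what makes the expectation term in the objective analytically and efficiently optimizable, as the added paragraph states.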