Diffstat (limited to 'finale/sections/bayesian.tex')
-rw-r--r--  finale/sections/bayesian.tex  18
1 file changed, 10 insertions, 8 deletions
diff --git a/finale/sections/bayesian.tex b/finale/sections/bayesian.tex
index 5d95ddb..5ec04fd 100644
--- a/finale/sections/bayesian.tex
+++ b/finale/sections/bayesian.tex
@@ -82,14 +82,16 @@ bound on the log marginal likelihood:
\{\mathbf{x}_c\}) \leq \log p_\Theta(\{ \mathbf{x}_c\})
\end{equation}
-Contrary to MCMC which outputs samples from the exact posterior given all
-observed data, the variational inference approach allows us to process data in
-batches to provide an analytical approximation to the posterior, thus improving
-scalability. In many cases, however, the expectation term cannot be found in
-closed-form, and approximation by sampling does not scale well with the number
-of parameters, but we can borrow ideas from Bohning~\cite{} to propose a
-linear/quadratic approximation to the log-likelihood for which the expectation
-term can be written analytically.
+\emph{Computational considerations.} In contrast to MCMC, which outputs
+samples from the exact posterior given all observed data, optimizing the
+variational inference objective is amenable to batch learning. For example,
+with stochastic gradient descent (SGD), observations can be processed one at
+a time to incrementally refine the approximate posterior, improving scalability.
+In many cases, however, the expectation term cannot be found in closed form,
+and approximation by sampling does not scale well with the number of
+parameters. In those cases, we can borrow ideas from Bohning~\cite{} to
+propose a linear/quadratic approximation to the log-likelihood for which the
+expectation term can be written analytically and optimized efficiently.
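+% A minimal sketch of the two points above, in notation introduced here for
+% illustration only (it is not the section's notation). First, writing the
+% bound with a single global latent variable $\boldsymbol{\theta}$, prior
+% $p(\boldsymbol{\theta})$, and variational posterior $q(\boldsymbol{\theta})$,
+% a uniformly sampled mini-batch $\mathcal{B} \subseteq \{1, \dots, C\}$ of the
+% $C$ observations gives an unbiased estimate of the data term, so SGD can take
+% one step per batch:
+\begin{equation}
+  \frac{C}{|\mathcal{B}|} \sum_{c \in \mathcal{B}}
+  \mathbb{E}_{q}\!\left[ \log p_\Theta(\mathbf{x}_c \mid \boldsymbol{\theta}) \right]
+  - \mathrm{KL}\!\left( q(\boldsymbol{\theta}) \,\|\, p(\boldsymbol{\theta}) \right).
+\end{equation}
+% Second, a Bohning-style quadratic bound, assuming for concreteness a
+% categorical (softmax) likelihood with natural parameter $\boldsymbol{\eta}$
+% and log-normalizer $g(\boldsymbol{\eta}) = \log \sum_k \exp(\eta_k)$: since
+% the Hessian of $g$ is dominated by a fixed matrix $\mathbf{B}$ (for $K$
+% classes one may take
+% $\mathbf{B} = \tfrac{1}{2}[\mathbf{I} - \tfrac{1}{K}\mathbf{1}\mathbf{1}^{\top}]$),
+% expanding around any point $\boldsymbol{\psi}$ gives the global bound
+\begin{equation}
+  g(\boldsymbol{\eta}) \leq g(\boldsymbol{\psi})
+  + (\boldsymbol{\eta} - \boldsymbol{\psi})^{\top} \nabla g(\boldsymbol{\psi})
+  + \tfrac{1}{2} (\boldsymbol{\eta} - \boldsymbol{\psi})^{\top}
+  \mathbf{B} \, (\boldsymbol{\eta} - \boldsymbol{\psi}),
+\end{equation}
+% so the log-likelihood is lower-bounded by a quadratic in $\boldsymbol{\eta}$,
+% whose expectation under a Gaussian
+% $q(\boldsymbol{\eta}) = \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$
+% is available in closed form, e.g.
+% $\mathbb{E}_q[\boldsymbol{\eta}^{\top} \mathbf{B} \boldsymbol{\eta}]
+%  = \operatorname{tr}(\mathbf{B} \boldsymbol{\Sigma})
+%  + \boldsymbol{\mu}^{\top} \mathbf{B} \boldsymbol{\mu}$.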
\subsection{Example}
\label{sec:example}