Diffstat (limited to 'finale/sections')
| -rw-r--r-- | finale/sections/bayesian.tex | 18 |
1 file changed, 10 insertions, 8 deletions
diff --git a/finale/sections/bayesian.tex b/finale/sections/bayesian.tex
index 5d95ddb..5ec04fd 100644
--- a/finale/sections/bayesian.tex
+++ b/finale/sections/bayesian.tex
@@ -82,14 +82,16 @@ bound on the log marginal likelihood:
 \{\mathbf{x}_c\}) \leq \log p_\Theta(\{ \mathbf{x}_c\})
 \end{equation}
 
-Contrary to MCMC which outputs samples from the exact posterior given all
-observed data, the variational inference approach allows us to process data in
-batches to provide an analytical approximation to the posterior, thus improving
-scalability. In many cases, however, the expectation term cannot be found in
-closed-form, and approximation by sampling does not scale well with the number
-of parameters, but we can borrow ideas from Bohning~\cite{} to propose a
-linear/quadratic approximation to the log-likelihood for which the expectation
-term can be written analytically.
+\emph{Computational considerations.} In contrast to MCMC, which outputs samples
+from the exact posterior given all observed data, optimizing the variational
+inference objective is amenable to batch learning. For example, when using
+Stochastic Gradient Descent (SGD), the observations can be processed one at
+a time to incrementally approximate the posterior, thus improving scalability.
+In many cases, however, the expectation term cannot be found in closed form,
+and approximation by sampling does not scale well with the number of
+parameters. In those cases, we can borrow ideas from Bohning~\cite{} to propose
+a linear/quadratic approximation to the log-likelihood for which the
+expectation term can be written analytically and optimized efficiently.
 
 \subsection{Example}
 \label{sec:example}
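
For context, a minimal sketch of the kind of Bohning-style quadratic bound the added paragraph refers to, assuming a binary logistic log-likelihood log p(y | x) = y x - log(1 + e^x) and a Gaussian variational posterior; the actual likelihood and variational family used in bayesian.tex may differ.

% Sketch only (assumes a binary logistic likelihood, not necessarily the
% model in bayesian.tex). Since d^2/dx^2 log(1 + e^x) = sigma(x)(1 - sigma(x))
% is bounded by 1/4, expanding around any point psi gives the quadratic
% upper bound
\begin{equation}
  \log(1 + e^{x}) \;\le\; \log(1 + e^{\psi})
    + \sigma(\psi)\,(x - \psi) + \tfrac{1}{8}\,(x - \psi)^2 ,
\end{equation}
% i.e. a quadratic lower bound on the log-likelihood. Under a Gaussian
% variational posterior q(x) = N(x; m, s^2), the expectation of this bound
% is available in closed form:
\begin{equation}
  \mathbb{E}_{q}\!\left[\log p(y \mid x)\right]
  \;\ge\; y\,m - \log(1 + e^{\psi}) - \sigma(\psi)\,(m - \psi)
          - \tfrac{1}{8}\!\left((m - \psi)^2 + s^2\right).
\end{equation}

Because the bound is quadratic in x with fixed curvature, its expectation under the Gaussian q depends only on the variational mean m and variance s^2, so the objective can be optimized directly (e.g. by SGD over observations) without Monte Carlo sampling of the expectation term.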
