Diffstat (limited to 'general.tex')
| -rw-r--r-- | general.tex | 8 |
1 file changed, 5 insertions, 3 deletions
diff --git a/general.tex b/general.tex
index e7a955f..23145ad 100644
--- a/general.tex
+++ b/general.tex
@@ -1,13 +1,15 @@
-\subsection{Bayesian Experimental Design}
-In this section, we extend our results to Bayesian experimental design \cite{chaloner1995bayesian}. In particular, we show that our choice of objective function \eqref{...} has a natural interpration in this context, further motivating its selection, and Theorem~\ref{...} has a natural generalization to this context.
+\subsection{Bayesian Experimental Design}\label{sec:bed}
+In this section, we extend our results to Bayesian experimental design \cite{chaloner1995bayesian}. We show that the objective function \eqref{modified} has a natural interpretation in this context, further motivating its selection as our objective. Moreover, we extend Theorem~\ref{thm:main} to a more general Bayesian setting.
 
-In the Bayesian setting, it is assumed that the experimenter has a prior distribution on $\beta$: in particular, $\beta$ is assumed to be sampled from a multivariate normal distribution with zero mean and covariance $\sigma^2R\in \reals^{d^2}$ (where $\sigma^2$ is the noise variance).
+In the Bayesian setting, it is assumed that the experimenter has a prior distribution on $\beta$: in particular, $\beta$ has a multivariate normal prior with zero mean and covariance $\sigma^2R\in \reals^{d^2}$ (where $\sigma^2$ is the noise variance).
 The experimenter estimates $\beta$ through \emph{maximum a posteriori estimation}: \emph{i.e.}, by finding the parameter that maximizes the posterior distribution of $\beta$ given the observations $y_S$.
 Under the linearity assumption \eqref{model} and the Gaussian prior on $\beta$, maximum a posteriori estimation leads to the following minimization \cite{hastie}:
 \begin{displaymath}
 \hat{\beta} = \argmin_{\beta\in\reals^d} \sum_i (y_i - \T{\beta}x_i)^2 + \norm{R^{-1/2}\beta}_2^2
 \end{displaymath}
 This optimization, commonly known as \emph{ridge regression}, includes an additional penalty term compared to the least squares estimation \eqref{leastsquares}.
+
+Let $\entropy(\beta)$ be the entropy of $\beta$ under this prior, and $\entropy(\beta\mid y_S)$ the entropy of $\beta$ conditioned on the experiment outcomes $y_S$, for some $S\subseteq \mathcal{N}$.
 In this setting, a natural objective is to select a set of experiments $S$ that maximizes her \emph{information gain}:
 $$ I(\beta;y_S) = \entropy(\beta)-\entropy(\beta\mid y_S). $$
 
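The ridge-regression form of the MAP estimate and the information-gain objective in the new text both follow from short Gaussian computations; the LaTeX fragment below sketches them. It is a sketch only: the macros \T, \norm, \reals, \entropy, and \argmin are assumed to be the ones already defined in general.tex, and \eqref{model} refers to the linear observation model in that file.

% Sketch only. Assumes the linear model \eqref{model}, i.e.
% y_i = \T{\beta}x_i + \varepsilon_i with \varepsilon_i \sim \mathcal{N}(0,\sigma^2),
% together with the prior \beta \sim \mathcal{N}(0,\sigma^2 R) stated above.
% Up to an additive constant, the negative log-posterior is
%   \tfrac{1}{2\sigma^2}\sum_i (y_i-\T{\beta}x_i)^2 + \tfrac12\,\T{\beta}(\sigma^2 R)^{-1}\beta,
% so maximizing the posterior amounts to the minimization
\begin{align*}
\hat{\beta}
  &= \argmin_{\beta\in\reals^d}\;
     \frac{1}{2\sigma^2}\sum_i (y_i-\T{\beta}x_i)^2
     + \frac{1}{2\sigma^2}\,\T{\beta}R^{-1}\beta \\
  &= \argmin_{\beta\in\reals^d}\;
     \sum_i (y_i-\T{\beta}x_i)^2 + \norm{R^{-1/2}\beta}_2^2 .
\end{align*}
% The noise and the prior share the factor \sigma^2, which is why it cancels from
% the penalty term. Likewise, the entropies in the information gain are entropies of
% multivariate Gaussians, e.g. \entropy(\beta) = \tfrac12\log\bigl((2\pi e)^d\det(\sigma^2 R)\bigr),
% so I(\beta;y_S) reduces to a log-determinant ratio of prior and posterior covariances.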
