From 34e122cab94de7e727f5c9dd00d3c6f246cde30c Mon Sep 17 00:00:00 2001
From: Stratis Ioannidis
Date: Sat, 3 Nov 2012 17:04:43 -0700
Subject: prelims

---
 problem.tex | 126 ++++++++++++++++++++++++++++++------------------------------
 1 file changed, 63 insertions(+), 63 deletions(-)

diff --git a/problem.tex b/problem.tex
index a8b89f4..237894e 100644
--- a/problem.tex
+++ b/problem.tex
@@ -1,4 +1,4 @@
-\subsection{Optimal Experimental Design}
+\subsection{Experimental Design}
 The theory of experimental design \cite{pukelsheim2006optimal,atkinson2007optimum} studies how an experimenter should select the parameters of a set of experiments she is about to conduct. In general, the optimality of a particular design depends on the purpose of the experiment, \emph{i.e.}, the quantity the experimenter is trying to learn or the hypothesis she is trying to validate. Due to their ubiquity in statistical analysis, a large literature on the subject focuses on learning \emph{linear models}, whereby the experimenter wishes to fit a linear map to the data she has collected.
@@ -8,7 +8,7 @@ More precisely, putting cost considerations aside, suppose that an experimenter
 \end{align}
 where $\beta$ is a vector in $\reals^d$, commonly referred to as the \emph{model}, and $\varepsilon_i$ (the \emph{measurement noise}) are independent, normally distributed random variables with zero mean and variance $\sigma^2$.
-The purpose of these experiments is to allow the experimenter to estimate the model $\beta$. In particular, assuming Gaussian noise, the maximum likelihood estimator of $\beta$ is the \emph{least squares} estimator: for $X_S=[x_i]_{i\in S}\in \reals^{|S|\times d}$ the matrix of experiment features and
+The purpose of these experiments is to allow the experimenter to estimate the model $\beta$. In particular, under \eqref{model}, the maximum likelihood estimator of $\beta$ is the \emph{least squares} estimator: for $X_S=[x_i]_{i\in S}\in \reals^{|S|\times d}$ the matrix of experiment features and
 $y_S=[y_i]_{i\in S}\in\reals^{|S|}$ the observed measurements,
 \begin{align}
 \hat{\beta} &=\argmax_{\beta\in\reals^d}\prob(y_S;\beta) =\argmin_{\beta\in\reals^d} \sum_{i\in S}(\T{\beta}x_i-y_i)^2 \nonumber\\
 & = (\T{X_S}X_S)^{-1}\T{X_S}y_S\label{leastsquares}
 \end{align}
@@ -20,9 +20,9 @@ the noise terms $\varepsilon_i$). In particular, $\hat{\beta}$ has mean $\beta$
 (\emph{i.e.}, it is an \emph{unbiased estimator}) and covariance $(\T{X_S}X_S)^{-1}$.
-Let $V:2^\mathcal{N}\to\reals$ be a value function, quantifying how informative a set of experiments $S$ is in estimating $\beta$. The standard optimal experimental design problem amounts to finding a set $S$ that maximizes $V(S)$ subject to the constraint $|S|\leq k$.
+Let $V:2^\mathcal{N}\to\reals$ be a \emph{value function}, quantifying how informative a set of experiments $S$ is in estimating $\beta$. The standard optimal experimental design problem amounts to finding a set $S$ that maximizes $V(S)$ subject to the constraint $|S|\leq k$.
-A variety of different value functions are used in experimental design\cite{pukelsheim2006optimal}; almost all make use of the covariance $(\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. A value functioned preferred because of its relationship to entropy is the \emph{$D$-optimality criterion}: %which yields the following optimization problem
+A variety of different value functions are used in experimental design~\cite{pukelsheim2006optimal}; almost all make use of the covariance $(\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. A value function preferred because of its relationship to entropy is the \emph{$D$-optimality criterion}: %which yields the following optimization problem
 \begin{align}
 V(S) &= \frac{1}{2}\log\det \T{X_S}X_S \label{dcrit} %\\
 \end{align}
@@ -41,94 +41,94 @@ the uncertainty on $\beta$, as captured by the entropy of its estimator.
 Value function \eqref{dcrit} has several appealing properties. To begin with, it is a submodular set function (see Lemma~\ref{...} and Thm.~\ref{...}). In addition, the maximization of convex relaxations of this function is a well-studied problem \cite{boyd}. Note that \eqref{dcrit} is undefined when $\mathrm{rank}(\T{X_S}X_S)
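As a quick numerical illustration of the least-squares estimator \eqref{leastsquares} and its covariance, the following sketch (assuming numpy; the toy design, dimensions, and variable names are illustrative and not taken from problem.tex) fits the linear model of \eqref{model} on randomly generated experiments:

# Minimal sketch of the least-squares / maximum-likelihood estimator, assuming numpy.
# The design X_S, the model beta, and sigma = 1 are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 20                              # model dimension, number of experiments in S
beta = rng.normal(size=d)                 # "true" model
X_S = rng.normal(size=(n, d))             # experiment features x_i, stacked as rows
y_S = X_S @ beta + rng.normal(size=n)     # measurements y_i = beta^T x_i + eps_i, sigma = 1

# beta_hat = (X_S^T X_S)^{-1} X_S^T y_S, computed via a linear solve for stability
beta_hat = np.linalg.solve(X_S.T @ X_S, X_S.T @ y_S)

# With sigma = 1, the estimator is unbiased with covariance (X_S^T X_S)^{-1}
cov_beta_hat = np.linalg.inv(X_S.T @ X_S)

Averaging beta_hat over repeated draws of the noise recovers beta, matching the unbiasedness claim above.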
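Similarly, a small sketch of the $D$-optimality criterion \eqref{dcrit} (again assuming numpy; the candidate set, the seed, and the index choices are illustrative) also checks the diminishing-returns inequality behind the submodularity claim on one nested pair of sets:

# Sketch of the D-optimality value V(S) = 0.5 * log det(X_S^T X_S), assuming numpy.
# Returns -inf when X_S^T X_S is rank deficient, where the criterion is undefined.
import numpy as np

def V(X, S):
    X_S = X[sorted(S), :]
    sign, logdet = np.linalg.slogdet(X_S.T @ X_S)
    return 0.5 * logdet if sign > 0 else -np.inf

rng = np.random.default_rng(1)
X = rng.normal(size=(6, 2))        # six candidate experiments, d = 2

S, T, i = {0, 1}, {0, 1, 2}, 3     # S is a subset of T; i is a further experiment
# Diminishing returns: adding i helps the smaller set at least as much as the larger one
assert V(X, S | {i}) - V(X, S) >= V(X, T | {i}) - V(X, T) - 1e-9

The same inequality holds for every nested pair of sets with full-rank Gram matrices, which is the content of the submodularity claim.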