From b041a0d77807e18d02e266e1a0aca057db803973 Mon Sep 17 00:00:00 2001
From: Stratis Ioannidis
Date: Thu, 1 Nov 2012 22:36:52 -0700
Subject: reverting my lost changes

---
 problem.tex | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/problem.tex b/problem.tex
index 5723581..468ab01 100644
--- a/problem.tex
+++ b/problem.tex
@@ -10,19 +10,19 @@ where $\beta$ a vector in $\reals^d$, commonly referred to as the \emph{model},
 The purpose of these experiments is to allow the experimenter to estimate the model $\beta$. In particular, assuming Gaussian noise, the maximum likelihood estimator of $\beta$ is the \emph{least squares} estimator: for $X_S=[x_i]_{i\in S}\in \reals^{|S|\times d}$ the matrix of experiment features and $y_S=[y_i]_{i\in S}\in\reals^{|S|}$ the observed measurements,
-\begin{align*} \hat{\beta} &=\max_{\beta\in\reals^d}\prob(y_S;\beta) =\argmin_{\beta\in\reals^d } \sum_{i\in S}(\T{\beta}x_i-y_i)^2 \\
-& = (\T{X_S}X_S)^{-1}X_S^Ty_S\end{align*}
+\begin{align} \hat{\beta} &=\argmax_{\beta\in\reals^d}\prob(y_S;\beta) =\argmin_{\beta\in\reals^d} \sum_{i\in S}(\T{\beta}x_i-y_i)^2 \nonumber\\
+& = (\T{X_S}X_S)^{-1}\T{X_S}y_S\label{leastsquares}\end{align}
 %The estimator $\hat{\beta}$ is unbiased, \emph{i.e.}, $\expt{\hat{\beta}} = \beta$ (where the expectation is over the noise variables $\varepsilon_i$). Furthermore, $\hat{\beta}$ is a multidimensional normal random variable with mean $\beta$ and covariance matrix $(X_S\T{X_S})^{-1}$.
 Note that the estimator $\hat{\beta}$ is a linear map of $y_S$; as $y_S$ is a multidimensional normal r.v., so is $\hat{\beta}$ (the randomness coming from the noise terms $\varepsilon_i$). In particular, $\hat{\beta}$ has mean $\beta$ (\emph{i.e.}, it is an \emph{unbiased estimator}) and covariance $(\T{X_S}X_S)^{-1}$.
 Let $V:2^\mathcal{N}\to\reals$ be a value function, quantifying how informative a set of experiments $S$ is in estimating $\beta$. The standard optimal experimental design problem amounts to finding a set $S$ that maximizes $V(S)$ subject to the constraint $|S|\leq k$.
-There is a variety of different value functions used in experimental design\cite{pukelsheim2006optimal}. Almost all capture this through some property the covariance $(\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. Due to its relationship to entropy, a most commonly used is the \emph{$D$-optimality criterion}: %which yields the following optimization problem
+A variety of different value functions are used in experimental design~\cite{pukelsheim2006optimal}; almost all make use of the covariance $(\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. A value function preferred because of its relationship to entropy is the \emph{$D$-optimality criterion}: %which yields the following optimization problem
 \begin{align}
- V_D(S) &= \frac{1}{2}\log\det \T{X_S}X_S \label{dcrit} %\\
+ V(S) &= \frac{1}{2}\log\det \T{X_S}X_S \label{dcrit} %\\
 \end{align}
-As $\hat{\beta}$ is a multidimensional normal random variable, the $D$-optimality criterion is equal (up to a costant) to the negative of the entropy of $\hat{\beta}$. Hence, selecting a set of experiments $S$ that maximizes $V_D(S)$ is equivalent to finding the set of experiments that minimizes the uncertainty on $\beta$, as captured by the entropy of its estimator.
+As $\hat{\beta}$ is a multidimensional normal random variable, the $D$-optimality criterion is equal (up to a constant) to the negative of the entropy of $\hat{\beta}$. Hence, selecting a set of experiments $S$ that maximizes $V(S)$ is equivalent to finding the set of experiments that minimizes the uncertainty on $\beta$, as captured by the entropy of its estimator.
 %As discussed in the next section, in this paper, we work with a modified measure of information function, namely
 %\begin{align}
@@ -30,6 +30,8 @@ As $\hat{\beta}$ is a multidimensional normal random variable, the $D$-optimalit
 %\end{align}
 %There are several reasons
+The value function \eqref{dcrit} has several appealing properties. To begin with, it is a submodular set function (see Lemma~\ref{...} and Thm.~\ref{...}). In addition, the maximization of convex relaxations of this function is a well-studied problem \cite{boyd}. Note that \eqref{dcrit} is undefined when $\mathrm{rank}(\T{X_S}X_S)
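
For reference, the "up to a constant" relation between \eqref{dcrit} and the entropy of $\hat{\beta}$ follows from the standard differential entropy of a multivariate Gaussian, $h(\mathcal{N}(\mu,\Sigma)) = \tfrac{1}{2}\log\det(2\pi e\,\Sigma)$. A short derivation sketch in the paper's notation (illustrative, not part of the patch above):

% Sketch: entropy of the Gaussian estimator \hat{\beta}, whose covariance is (\T{X_S}X_S)^{-1}.
\begin{align*}
h(\hat{\beta}) &= \tfrac{1}{2}\log\det\!\big(2\pi e\,(\T{X_S}X_S)^{-1}\big) \\
               &= \tfrac{d}{2}\log(2\pi e) - \tfrac{1}{2}\log\det \T{X_S}X_S
                = \tfrac{d}{2}\log(2\pi e) - V(S).
\end{align*}

Hence $-h(\hat{\beta}) = V(S) - \tfrac{d}{2}\log(2\pi e)$, so maximizing $V(S)$ over $S$ is the same as minimizing the entropy (uncertainty) of $\hat{\beta}$.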
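
A minimal numerical sketch of \eqref{leastsquares} and \eqref{dcrit} in Python/NumPy may also help; the variable names and toy data below are illustrative assumptions, not taken from the paper or the patch:

# Minimal sketch (illustrative): least-squares estimate of eq. (leastsquares) and the
# D-optimality value V(S) = 0.5 * log det(X_S^T X_S) of eq. (dcrit) for one set S.
import numpy as np

rng = np.random.default_rng(0)
d, n = 3, 10                                  # model dimension, number of candidate experiments
X = rng.normal(size=(n, d))                   # rows x_i: experiment feature vectors
beta = rng.normal(size=d)                     # "true" model (unknown in practice)

S = [0, 2, 3, 5, 7]                           # a chosen experiment set with |S| >= d
X_S = X[S]
y_S = X_S @ beta + rng.normal(size=len(S))    # noisy measurements y_i = beta^T x_i + eps_i

# Least-squares / maximum-likelihood estimate: (X_S^T X_S)^{-1} X_S^T y_S
beta_hat = np.linalg.solve(X_S.T @ X_S, X_S.T @ y_S)

# D-optimality criterion; slogdet also detects rank deficiency (sign <= 0),
# in which case the criterion is undefined (log det of a singular matrix).
sign, logabsdet = np.linalg.slogdet(X_S.T @ X_S)
V_S = 0.5 * logabsdet if sign > 0 else float("-inf")

print("beta_hat =", beta_hat)
print("V(S)     =", V_S)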