author    Thibaut Horel <thibaut.horel@gmail.com>  2012-11-01 12:59:55 +0100
committer Thibaut Horel <thibaut.horel@gmail.com>  2012-11-01 12:59:55 +0100
commit    0023881612f2e100360f92aa3560bd443313c060 (patch)
tree      6251c2e4a40cee41a9fcbd9b71734cfeca268cfa
parent    de119c1bfc64a0c34fa4239b8c50a80c08244d94 (diff)
download  recommendation-0023881612f2e100360f92aa3560bd443313c060.tar.gz

Dropbox changes

-rw-r--r--  problem.tex  15
1 file changed, 7 insertions, 8 deletions
diff --git a/problem.tex b/problem.tex
index a26ae65..2c80d4e 100644
--- a/problem.tex
+++ b/problem.tex
@@ -10,21 +10,20 @@ where $\beta$ is a vector in $\reals^d$, commonly referred to as the \emph{model},
The purpose of these experiments is to allow the experimenter to estimate the model $\beta$. In particular, assuming Gaussian noise, the maximum likelihood estimator of $\beta$ is the \emph{least squares} estimator: for $X_S=[x_i]_{i\in S}\in \reals^{|S|\times d}$ the matrix of experiment features and
$y_S=[y_i]_{i\in S}\in\reals^{|S|}$ the observed measurements,
-\begin{align} \hat{\beta} &=\max_{\beta\in\reals^d}\prob(y_S;\beta) =\argmin_{\beta\in\reals^d } \sum_{i\in S}(\T{\beta}x_i-y_i)^2 \nonumber\\
-& = (\T{X_S}X_S)^{-1}X_S^Ty_S \label{leastsquares}\end{align}
+\begin{align*} \hat{\beta} &=\argmax_{\beta\in\reals^d}\prob(y_S;\beta) =\argmin_{\beta\in\reals^d} \sum_{i\in S}(\T{\beta}x_i-y_i)^2 \\
+& = (\T{X_S}X_S)^{-1}\T{X_S}y_S\end{align*}
%The estimator $\hat{\beta}$ is unbiased, \emph{i.e.}, $\expt{\hat{\beta}} = \beta$ (where the expectation is over the noise variables $\varepsilon_i$). Furthermore, $\hat{\beta}$ is a multidimensional normal random variable with mean $\beta$ and covariance matrix $(X_S\T{X_S})^{-1}$.
Note that the estimator $\hat{\beta}$ is a linear map of $y_S$; as $y_S$ is a multidimensional normal random variable, so is $\hat{\beta}$ (the randomness coming from the noise terms $\varepsilon_i$). In particular, $\hat{\beta}$ has mean $\beta$ (\emph{i.e.}, it is an \emph{unbiased estimator}) and covariance $(\T{X_S}X_S)^{-1}$.
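As a quick sketch of why the mean and covariance take this form (assuming, as the stated covariance implies, that the noise terms $\varepsilon_i$ are independent with unit variance), write $\varepsilon_S=[\varepsilon_i]_{i\in S}$ and substitute $y_S=X_S\beta+\varepsilon_S$ into the least squares formula:
\begin{align*}
\hat{\beta} &= (\T{X_S}X_S)^{-1}\T{X_S}(X_S\beta+\varepsilon_S) = \beta + (\T{X_S}X_S)^{-1}\T{X_S}\varepsilon_S,\\
\expt{\hat{\beta}} &= \beta, \qquad \mathrm{Cov}(\hat{\beta}) = (\T{X_S}X_S)^{-1}\T{X_S}\,\expt{\varepsilon_S\T{\varepsilon_S}}\,X_S(\T{X_S}X_S)^{-1} = (\T{X_S}X_S)^{-1}.
\end{align*}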
-
+
Let $V:2^\mathcal{N}\to\reals$ be a value function, quantifying how informative a set of experiments $S$ is in estimating $\beta$. The standard optimal experimental design problem amounts to finding a set $S$ that maximizes $V(S)$ subject to the constraint $|S|\leq k$.
-A variety of different value functions are used in experimental design~\cite{pukelsheim2006optimal}; almost all are related to the covariance $(\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. A commonly used choice is the so-called \emph{$D$-optimality criterion} \cite{pukelsheim2006optimal,chaloner1995bayesian}: %which yields the following optimization problem
+There is a variety of value functions used in experimental design~\cite{pukelsheim2006optimal}. Almost all capture informativeness through some property of the covariance $(\T{X_S}X_S)^{-1}$ of the estimator $\hat{\beta}$. Due to its relationship to entropy, one of the most commonly used is the \emph{$D$-optimality criterion}: %which yields the following optimization problem
\begin{align}
- V(S) &= \frac{1}{2}\log\det \T{X_S}X_S \label{dcrit} %\\
+ V_D(S) &= \frac{1}{2}\log\det \T{X_S}X_S \label{dcrit} %\\
\end{align}
-As $\hat{\beta}$ is a multidimensional normal random variable, the $D$-optimality criterion is equal (up to a costant) to the negative of the entropy of $\hat{\beta}$. Hence, maximizing \eqref{dcrit} amounts to finding the set of experiments that minimizes the uncertainty on $\beta$, as captured by the entropy of its estimator.
+As $\hat{\beta}$ is a multidimensional normal random variable, the $D$-optimality criterion is equal (up to a constant) to the negative of the entropy of $\hat{\beta}$. Hence, selecting a set of experiments $S$ that maximizes $V_D(S)$ is equivalent to finding the set of experiments that minimizes the uncertainty about $\beta$, as captured by the entropy of its estimator.
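To make the constant explicit, one can use the standard formula for the differential entropy of a $d$-dimensional normal random variable with covariance $\Sigma$, here with $\Sigma=(\T{X_S}X_S)^{-1}$:
\begin{align*}
H(\hat{\beta}) = \frac{1}{2}\log\det(2\pi e\,\Sigma) = \frac{d}{2}\log(2\pi e) - \frac{1}{2}\log\det \T{X_S}X_S = \frac{d}{2}\log(2\pi e) - V_D(S),
\end{align*}
so maximizing $V_D(S)$ is the same as minimizing $H(\hat{\beta})$.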
-Value function \eqref{dcrit} has several appealing properties. To begin with, it is a submodular set function (see Lemma~\ref{...} and Thm.~\ref{...}). In addition, the maximization of convex relaxations of this function is a well-studied problem \cite{boyd}. Note that \eqref{dcrit} is undefined when $\mathrm{rank}(\T{X_S}X_S)<d$; in this case, we take $V(S)=-\infty$ (so that $V$ takes values in the extended reals).
%As discussed in the next section, in this paper, we work with a modified measure of information function, namely
%\begin{align}
%V(S) & = \frac{1}{2} \log\det \left(I + \T{X_S}X_S\right)
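The extended-real convention above, $V(S)=-\infty$ when $\mathrm{rank}(\T{X_S}X_S)<d$, is the natural one, since a rank-deficient positive semidefinite matrix has zero determinant:
\begin{align*}
\mathrm{rank}(\T{X_S}X_S)<d \;\Longrightarrow\; \det(\T{X_S}X_S)=0 \;\Longrightarrow\; \tfrac{1}{2}\log\det \T{X_S}X_S=-\infty.
\end{align*}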
@@ -35,7 +34,7 @@ Value function \eqref{dcrit} has several appealing properties. To begin with, it
\subsection{Budget Feasible Mechanism Design}
In this paper, we approach the problem of optimal experimental design from the perspective of a \emph{budget feasible reverse auction}~\cite{singer-mechanisms}. In particular, we assume that each experiment $i\in \mathcal{N}$ is associated with a cost $c_i$ that the experimenter needs to pay in order to conduct the experiment. The experimenter has a budget $B\in\reals_+$. In the \emph{full information case}, the costs are common knowledge; optimal design in this context amounts to selecting a set $S$ maximizing the value $V(S)$ subject to the constraint $\sum_{i\in S} c_i\leq B$.
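Written out in the notation already introduced, the full information problem reads
\begin{align*}
\max_{S\subseteq\mathcal{N}}\; V(S) \quad \text{subject to} \quad \sum_{i\in S} c_i\leq B.
\end{align*}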
-As in \cite{singer-mechanisms,chen}, we focus on a \emph{strategic scenario}: experiment $i$ corresponds to a \emph{strategic agent}, whose cost $c_i$ is private. For example, each $i$ may correspond to a human participant; the feature vector $x_i$ may correspond to a normalized vector of her age, weight, gender, income, \emph{etc.}, and the measurement $y_i$ may capture some biometric information (\emph{e.g.}, her red cell blood count, a genetic marker, etc.). The cost $c_i$ is the amount the participant deems sufficient to incentivize her participation in the study.
+As in \cite{singer-mechanisms,chen}, we focus on a \emph{strategic scenario}: experiment $i$ corresponds to a \emph{strategic agent}, whose cost $c_i$ is private. For example, each $i$ may correspond to a human participant; the feature vector $x_i$ may correspond to a normalized vector of her age, weight, gender, income, \emph{etc.}, and the measurement $y_i$ may capture some biometric information (\emph{e.g.}, her red cell blood count, a genetic marker, etc.). The cost $c_i$ is the amount the participant deems sufficient to incentivize her participation in the study.
As in \cite{singer-mechanisms,chen}, we focus on a \emph{strategic scenario}:
experiment $i$ corresponds to a \emph{strategic agent}, whose cost $c_i$ is