Diffstat (limited to 'problem.tex')
-rw-r--r-- problem.tex 10
1 files changed, 5 insertions, 5 deletions
diff --git a/problem.tex b/problem.tex
index f313be0..e820f85 100644
--- a/problem.tex
+++ b/problem.tex
@@ -4,10 +4,10 @@
The theory of experimental design \cite{pukelsheim2006optimal,atkinson2007optimum} studies how an experimenter \E\ should select the parameters of a set of experiments she is about to conduct. In general, the optimality of a particular design depends on the purpose of the experiment, \emph{i.e.}, the quantity \E\ is trying to learn or the hypothesis she is trying to validate. Due to their ubiquity in statistical analysis, a large literature on the subject focuses on learning \emph{linear models}, where \E\ wishes to fit a linear map to the data she has collected.
More precisely, putting cost considerations aside, suppose that \E\ wishes to conduct $k$ among $n$ possible experiments. Each experiment $i\in\mathcal{N}\defeq \{1,\ldots,n\}$ is associated with a set of parameters (or features) $x_i\in \reals^d$, normalized so that $\|x_i\|_2\leq 1$. Denote by $S\subseteq \mathcal{N}$, where $|S|=k$, the set of experiments selected; upon its execution, experiment $i\in S$ reveals an output variable (the ``measurement'') $y_i$, related to the experiment features $x_i$ through a linear function, \emph{i.e.},
-\begin{align}
- y_i = \T{\beta} x_i + \varepsilon_i,\quad\forall i\in\mathcal{N},\label{model}
+\begin{align}\label{model}
+ \forall i\in\mathcal{N},\quad y_i = \T{\beta} x_i + \varepsilon_i
\end{align}
-where $\beta$ a vector in $\reals^d$, commonly referred to as the \emph{model}, and $\varepsilon_i$ (the \emph{measurement noise}) are independent, normally distributed random variables with mean 0 and variance $\sigma^2$.
+where $\beta$ is a vector in $\reals^d$, commonly referred to as the \emph{model}, and $\varepsilon_i$ (the \emph{measurement noise}) are independent, normally distributed random variables with mean 0 and variance $\sigma^2$.
The purpose of these experiments is to allow \E\ to estimate the model $\beta$. In particular, under \eqref{model}, the maximum likelihood estimator of $\beta$ is the \emph{least squares} estimator: for $X_S=[x_i]_{i\in S}\in \reals^{|S|\times d}$ the matrix of experiment features and
$y_S=[y_i]_{i\in S}\in\reals^{|S|}$ the observed measurements,
@@ -50,8 +50,8 @@ knowledge; the objective of the buyer in this context is to select a set $S$
maximizing the value $V(S)$ subject to the constraint $\sum_{i\in S} c_i\leq
B$. We write:
\begin{equation}\label{eq:non-strategic}
- OPT = \max_{S\subseteq\mathcal{N}} \left\{V(S) \mid
- \sum_{i\in S}c_i\leq B\right\}
+ OPT = \max_{S\subseteq\mathcal{N}} \Big\{V(S) \;\Big| \;
+ \sum_{i\in S}c_i\leq B\Big\}
\end{equation}
for the optimal value achievable in the full-information case. %\stratis{Should be $OPT(V,c,B)$\ldots better drop the arguments here and introduce them wherever necessary.}