From 6914bc40cecb0cf3bacbe8bf44f8fb1bfb5d690b Mon Sep 17 00:00:00 2001
From: Thibaut Horel
Date: Thu, 22 Nov 2012 17:14:24 +0100
Subject: Typos.

There was an error in the proof of lemma 4. Fixing this error changes our
approximation ratio to 12.98 instead of 19.68...
---
 problem.tex | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/problem.tex b/problem.tex
index f313be0..e820f85 100644
--- a/problem.tex
+++ b/problem.tex
@@ -4,10 +4,10 @@
 The theory of experimental design \cite{pukelsheim2006optimal,atkinson2007optimum} studies how an experimenter \E\ should select the parameters of a set of experiments she is about to conduct. In general, the optimality of a particular design depends on the purpose of the experiment, \emph{i.e.}, the quantity \E\ is trying to learn or the hypothesis she is trying to validate.
 Due to their ubiquity in statistical analysis, a large literature on the subject focuses on learning \emph{linear models}, where \E\ wishes to fit a linear map to the data she has collected. More precisely, putting cost considerations aside, suppose that \E\ wishes to conduct $k$ among $n$ possible experiments. Each experiment $i\in\mathcal{N}\defeq \{1,\ldots,n\}$ is associated with a set of parameters (or features) $x_i\in \reals^d$, normalized so that $\|x_i\|_2\leq 1$.
 Denote by $S\subseteq \mathcal{N}$, where $|S|=k$, the set of experiments selected; upon its execution, experiment $i\in S$ reveals an output variable (the ``measurement'') $y_i$, related to the experiment features $x_i$ through a linear function, \emph{i.e.},
-\begin{align}
-  y_i = \T{\beta} x_i + \varepsilon_i,\quad\forall i\in\mathcal{N},\label{model}
+\begin{align}\label{model}
+  \forall i\in\mathcal{N},\quad y_i = \T{\beta} x_i + \varepsilon_i
 \end{align}
-where $\beta$ a vector in $\reals^d$, commonly referred to as the \emph{model}, and $\varepsilon_i$ (the \emph{measurement noise}) are independent, normally distributed random variables with mean 0 and variance $\sigma^2$.
+where $\beta$ is a vector in $\reals^d$, commonly referred to as the \emph{model}, and $\varepsilon_i$ (the \emph{measurement noise}) are independent, normally distributed random variables with mean 0 and variance $\sigma^2$.
 The purpose of these experiments is to allow \E\ to estimate the model $\beta$. In particular, under \eqref{model}, the maximum likelihood estimator of $\beta$ is the \emph{least squares} estimator: for $X_S=[x_i]_{i\in S}\in \reals^{|S|\times d}$ the matrix of experiment features and $y_S=[y_i]_{i\in S}\in\reals^{|S|}$ the observed measurements,
@@ -50,8 +50,8 @@
 knowledge; the objective of the buyer in this context is to select a set $S$
 maximizing the value $V(S)$ subject to the constraint $\sum_{i\in S} c_i\leq B$. We write:
 \begin{equation}\label{eq:non-strategic}
-  OPT = \max_{S\subseteq\mathcal{N}} \left\{V(S) \mid
-  \sum_{i\in S}c_i\leq B\right\}
+  OPT = \max_{S\subseteq\mathcal{N}} \Big\{V(S) \;\Big| \;
+  \sum_{i\in S}c_i\leq B\Big\}
 \end{equation}
 for the optimal value achievable in the full-information case.
 %\stratis{Should be $OPT(V,c,B)$\ldots better drop the arguments here and introduce them wherever necessary.}
--
cgit v1.2.3-70-g09d2
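[Editor's note: the first hunk's trailing context introduces the least squares estimator but the hunk ends before the display itself. For reference, with $X_S$ and $y_S$ as defined there, the estimator takes the standard closed form (a sketch using the paper's $\T{\cdot}$ transpose macro; the exact display in problem.tex may differ, e.g. in labeling or pseudo-inverse handling when $\T{X_S} X_S$ is singular):]

```latex
\begin{align}
  \hat{\beta}
    = \argmin_{\beta\in\reals^d} \|y_S - X_S\beta\|_2^2
    = \left(\T{X_S} X_S\right)^{-1} \T{X_S} y_S
\end{align}
```

Under model \eqref{model} with Gaussian noise, this least squares solution coincides with the maximum likelihood estimator of $\beta$, which is why the surrounding text uses the two interchangeably.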