Diffstat (limited to 'problem.tex')
-rw-r--r-- problem.tex 9
1 files changed, 6 insertions, 3 deletions
diff --git a/problem.tex b/problem.tex
index c21135b..73647da 100644
--- a/problem.tex
+++ b/problem.tex
@@ -6,7 +6,7 @@ The theory of experimental design \cite{pukelsheim2006optimal,atkinson2007optimu
Suppose that an experimenter \E\ wishes to conduct $k$ among $n$ possible
experiments. Each experiment $i\in\mathcal{N}\defeq \{1,\ldots,n\}$ is
associated with a set of parameters (or features) $x_i\in \reals^d$, normalized
-so that $$b\leq \|x_i\|^2_2\leq 1,$$ for some $b>0$. Denote by $S\subseteq \mathcal{N}$, where $|S|=k$, the set of experiments selected; upon its execution, experiment $i\in S$ reveals an output variable (the ``measurement'') $y_i$, related to the experiment features $x_i$ through a linear function, \emph{i.e.},
+so that $b\leq \|x_i\|^2_2\leq 1,$ for some $b>0$. Denote by $S\subseteq \mathcal{N}$, where $|S|=k$, the set of experiments selected; upon its execution, experiment $i\in S$ reveals an output variable (the ``measurement'') $y_i$, related to the experiment features $x_i$ through a linear function, \emph{i.e.},
\begin{align}\label{model}
\forall i\in\mathcal{N},\quad y_i = \T{\beta} x_i + \varepsilon_i
\end{align}
@@ -25,8 +25,11 @@ distribution on $\beta$, \emph{i.e.}, $\beta$ has a multivariate normal prior
with zero mean and covariance $\sigma^2R^{-1}\in \reals^{d^2}$ (where $\sigma^2$ is the noise variance).
Then, \E\ estimates $\beta$ through \emph{maximum a posteriori estimation}: \emph{i.e.}, finding the parameter which maximizes the posterior distribution of $\beta$ given the observations $y_S$. Under the linearity assumption \eqref{model} and the Gaussian prior on $\beta$, maximum a posteriori estimation leads to the following maximization \cite{hastie}:
\begin{align}
- \hat{\beta} = \argmax_{\beta\in\reals^d} \prob(\beta\mid y_S) =\argmin_{\beta\in\reals^d} \big(\sum_{i\in S} (y_i - \T{\beta}x_i)^2
- + \T{\beta}R\beta\big) = (R+\T{X_S}X_S)^{-1}X_S^Ty_S \label{ridge}
+ \begin{split}
+ \hat{\beta} = \argmax_{\beta\in\reals^d} \prob(\beta\mid y_S) &=\argmin_{\beta\in\reals^d} \big(\sum_{i\in S} (y_i - \T{\beta}x_i)^2
+ + \T{\beta}R\beta\big)\\
+ & = (R+\T{X_S}X_S)^{-1}X_S^Ty_S \label{ridge}
+\end{split}
\end{align}
where the last equality is obtained by setting $\nabla_{\beta}\prob(\beta\mid y_S)$ to zero and solving the resulting linear system; in \eqref{ridge}, $X_S\defeq[x_i]_{i\in S}\in \reals^{|S|\times d}$ is the matrix of experiment features and
$y_S\defeq[y_i]_{i\in S}\in\reals^{|S|}$ are the observed measurements.
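The closed form in \eqref{ridge} can be checked with a short derivation. The following is only a sketch in the notation above: the name $F$ for the objective is introduced here for illustration, and $R$ is assumed symmetric positive definite (as the inverse of a covariance matrix, up to the factor $\sigma^2$). Writing the sum of squares in matrix form,
\begin{align*}
F(\beta) &\defeq \sum_{i\in S} (y_i - \T{\beta}x_i)^2 + \T{\beta}R\beta
 = \|y_S - X_S\beta\|_2^2 + \T{\beta}R\beta,\\
\nabla_{\beta}F(\beta) &= -2\T{X_S}(y_S - X_S\beta) + 2R\beta .
\end{align*}
Setting $\nabla_{\beta}F(\beta)=0$ gives the linear system $(R+\T{X_S}X_S)\beta = \T{X_S}y_S$. Under the assumption on $R$, the matrix $R+\T{X_S}X_S$ is positive definite, hence invertible, and $F$ is strictly convex, so the unique minimizer is $\hat{\beta} = (R+\T{X_S}X_S)^{-1}\T{X_S}y_S$.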