1 files changed, 6 insertions, 3 deletions
diff --git a/problem.tex b/problem.tex
index c21135b..73647da 100755
--- a/problem.tex
+++ b/problem.tex
@@ -6,7 +6,7 @@ The theory of experimental design \cite{pukelsheim2006optimal,atkinson2007optimu
 Suppose that an experimenter \E\ wishes to conduct $k$ among $n$ possible
 experiments. Each experiment $i\in\mathcal{N}\defeq \{1,\ldots,n\}$ is
 associated with a set of parameters (or features) $x_i\in \reals^d$, normalized
-so that $$b\leq \|x_i\|^2_2\leq 1,$$ for some $b>0$. Denote by $S\subseteq \mathcal{N}$, where $|S|=k$, the set of experiments selected; upon its execution, experiment $i\in S$ reveals an output variable (the ``measurement'') $y_i$,  related to the experiment features $x_i$ through a linear function, \emph{i.e.},
+so that $b\leq \|x_i\|^2_2\leq 1$, for some $b>0$. Denote by $S\subseteq \mathcal{N}$, where $|S|=k$, the set of experiments selected; upon its execution, experiment $i\in S$ reveals an output variable (the ``measurement'') $y_i$,  related to the experiment features $x_i$ through a linear function, \emph{i.e.},
 \begin{align}\label{model}
      \forall i\in\mathcal{N},\quad y_i = \T{\beta} x_i + \varepsilon_i
 \end{align}
@@ -25,8 +25,11 @@ distribution on $\beta$, \emph{i.e.},  $\beta$ has a multivariate normal prior
 with zero mean  and covariance $\sigma^2R^{-1}\in \reals^{d^2}$ (where $\sigma^2$ is the noise variance). 
 Then, \E\ estimates $\beta$ through \emph{maximum a posteriori estimation}: \emph{i.e.}, finding the parameter which maximizes the posterior distribution of $\beta$ given the observations $y_S$. Under the linearity assumption \eqref{model} and the Gaussian prior on $\beta$, maximum a posteriori estimation leads to the following maximization \cite{hastie}: 
 \begin{align}
-    \hat{\beta} = \argmax_{\beta\in\reals^d} \prob(\beta\mid y_S) =\argmin_{\beta\in\reals^d} \big(\sum_{i\in S} (y_i - \T{\beta}x_i)^2
-    + \T{\beta}R\beta\big) = (R+\T{X_S}X_S)^{-1}X_S^Ty_S \label{ridge}
+    \begin{split}
+        \hat{\beta} = \argmax_{\beta\in\reals^d} \prob(\beta\mid y_S) &=\argmin_{\beta\in\reals^d} \big(\sum_{i\in S} (y_i - \T{\beta}x_i)^2
+    + \T{\beta}R\beta\big)\\
+    & = (R+\T{X_S}X_S)^{-1}\T{X_S}y_S \label{ridge}
+\end{split}
 \end{align}
 where the last equality is obtained by setting  $\nabla_{\beta}\prob(\beta\mid y_S)$ to zero and solving the resulting linear system; in \eqref{ridge}, $X_S\defeq[x_i]_{i\in S}\in \reals^{|S|\times d}$ is the matrix of experiment features and
 $y_S\defeq[y_i]_{i\in S}\in\reals^{|S|}$ are the observed measurements.