Diffstat (limited to 'notes.tex')
-rw-r--r--  notes.tex  15
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/notes.tex b/notes.tex
index b4460f3..77ec645 100644
--- a/notes.tex
+++ b/notes.tex
@@ -29,7 +29,7 @@ The cost of the regression error will be measured by the MSE:
\end{displaymath}
The general goal is to understand how the size of the database impacts
-the MSE of the derived regression function.
+the MSE of the regression function.
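For readers of this excerpt: the MSE display itself sits just above this hunk and is not shown; the standard population form (written here with a placeholder $\widehat{f}$ for the derived regression function, a symbol not taken from the patch) would be
\begin{displaymath}
\mathrm{MSE}(\widehat{f}) = \expt{\big(Y - \widehat{f}(X)\big)^2}.
\end{displaymath}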
\subsection{From the bivariate normal case to linear regression}
If $(X,Y)$ is drawn from a bivariate normal distribution with mean
@@ -39,7 +39,16 @@ write:
Y = \condexp{Y}{X} + \big(Y-\condexp{Y}{X}\big)
\end{displaymath}
In this particular case, $\condexp{Y}{X}$ is a linear function of $X$:
-writing $\varepsilon = Y-\condexp{Y}{X}$, it is easy to see that $\expt{X\varepsilon}=0$.
+\begin{displaymath}
+\condexp{Y}{X} = \alpha X + \beta
+\end{displaymath}
+where $\alpha$ and $\beta$ can be expressed as a function of $\mu$ and
+$\Sigma$. Writing $\varepsilon = Y-\condexp{Y}{X}$, it is easy to see
+that $\expt{X\varepsilon}=0$. Furthermore, $\varepsilon$ is normally
+distributed. Under these assumptions, it can be proven that the least
+squares estimator for $(\alpha,\beta)$ is optimal (it reaches the
+Cramér-Rao bound).
+
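The patch leaves $\alpha$ and $\beta$ implicit; a minimal worked form, assuming the usual parametrisation $\mu=(\mu_X,\mu_Y)$ and $\Sigma$ with variances $\sigma_X^2$, $\sigma_Y^2$ and covariance $\sigma_{XY}$ (notation not fixed by the commit), would be
\begin{displaymath}
\alpha = \frac{\sigma_{XY}}{\sigma_X^2}, \qquad
\beta = \mu_Y - \alpha\,\mu_X, \qquad
\varepsilon \sim \mathcal{N}\Big(0,\; \sigma_Y^2 - \frac{\sigma_{XY}^2}{\sigma_X^2}\Big).
\end{displaymath}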
\subsection{Linear regression}
We assume a linear model:
@@ -158,7 +167,7 @@ y)^2\big)}
By the Cauchy-Schwarz inequality:
\begin{displaymath}
(1+\norm{y}^2)(1+\norm{x_0}^2)-(x_0\cdot
-y)^2 \geq 0
+y)^2 > 0
\end{displaymath}
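To see why the strict inequality introduced by this change holds (a short expansion not spelled out in the patch): Cauchy-Schwarz gives $(x_0\cdot y)^2 \leq \norm{x_0}^2\norm{y}^2$, and
\begin{displaymath}
(1+\norm{y}^2)(1+\norm{x_0}^2)
= 1 + \norm{x_0}^2 + \norm{y}^2 + \norm{x_0}^2\norm{y}^2
> \norm{x_0}^2\norm{y}^2 \geq (x_0\cdot y)^2 .
\end{displaymath}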
Thus the previous inequality is successively equivalent to: