Diffstat (limited to 'notes.tex')

 notes.tex | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)
@@ -29,7 +29,7 @@ The cost of the regression error will be measured by the MSE:
 \end{displaymath}
 
 The general goal is to understand how the size of the database impacts
-the MSE of the derived regression function.
+the MSE of the regression function.
 
 \subsection{From the bivariate normal case to linear regression}
 If $(X,Y)$ is drawn from a bivariate normal distribution with mean
@@ -39,7 +39,16 @@ write:
 Y = \condexp{Y}{X} + \big(Y-\condexp{Y}{X}\big)
 \end{displaymath}
 In this particular case, $\condexp{Y}{X}$ is a linear function of $X$:
-writing $\varepsilon = Y-\condexp{Y}{X}$, it is easy to see that $\expt{X\varepsilon}=0$.
+\begin{displaymath}
+\condexp{Y}{X} = \alpha X + \beta
+\end{displaymath}
+where $\alpha$ and $\beta$ can be expressed as a function of $\mu$ and
+$\Sigma$. Writing $\varepsilon = Y-\condexp{Y}{X}$, it is easy to see
+that $\expt{X\varepsilon}=0$. Furthermore $\varepsilon$ is also normally
+distributed. Under these assumptions, it can be proven that the least
+squares estimator for $(\alpha,\beta)$ is optimal (it reaches the
+Cramér-Rao bound).
+
 \subsection{Linear regression}
 We assume a linear model:
@@ -158,7 +167,7 @@ y)^2\big)}
 By the Cauchy-Schwarz inequality:
 \begin{displaymath}
 (1+\norm{y}^2)(1+\norm{x_0}^2)-(x_0\cdot
-y)^2 \geq 0
+y)^2 > 0
 \end{displaymath}
 Thus the previous inequality is consecutively equivalent to:
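The added hunk claims that for a bivariate normal $(X,Y)$ the conditional expectation is linear, with $\alpha$ and $\beta$ determined by $\mu$ and $\Sigma$, and that the residual satisfies $\expt{X\varepsilon}=0$. A minimal numerical sketch of those claims (the closed forms $\alpha = \Sigma_{XY}/\Sigma_{XX}$ and $\beta = \mu_Y - \alpha\mu_X$ are standard, but the specific $\mu$, $\Sigma$, and sample size below are illustrative assumptions, not taken from the notes):

```python
import numpy as np

# Illustrative parameters (not from the notes): mean mu and covariance Sigma
# of a bivariate normal (X, Y).
rng = np.random.default_rng(0)
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])

# Closed-form regression coefficients: E[Y|X] = alpha * X + beta.
alpha = Sigma[0, 1] / Sigma[0, 0]   # Cov(X, Y) / Var(X)
beta = mu[1] - alpha * mu[0]

# Empirical check that the residual eps = Y - E[Y|X] satisfies E[X * eps] = 0.
X, Y = rng.multivariate_normal(mu, Sigma, size=1_000_000).T
eps = Y - (alpha * X + beta)

print(np.mean(X * eps))   # close to 0 for a large sample
print(np.mean(eps))       # residual also has mean close to 0
```

The vanishing of $\expt{X\varepsilon}$ is exactly the orthogonality condition that makes $(\alpha,\beta)$ the least squares solution; the check above only confirms it empirically under the assumed parameters.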
