diff options
| -rw-r--r-- | notes.tex | 8 |
1 files changed, 5 insertions, 3 deletions
@@ -25,19 +25,21 @@ vector of explanatory variables $x$. The cost of the regression error will be measured by the MSE: \begin{displaymath} - \mathrm{MSE}(f_n) = \expt{\big(f_n(x)-y\big)^2} + \mse(f_n) = \expt{\big(f_n(x)-y\big)^2} \end{displaymath} The general goal is to understand how the size of the database impacts the MSE of the derived regression function. \subsection{From the bivariate normal case to linear regression} -If $(X,Y)$ is drawn from a bivariate normal distribution. Then, one can +If $(X,Y)$ is drawn from a bivariate normal distribution with mean +vector $\mu$ and covariance matrix $\Sigma$. Then, one can write: \begin{displaymath} Y = \condexp{Y}{X} + \big(Y-\condexp{Y}{X}\big) \end{displaymath} - +In this particular case, $\condexp{Y}{X}$ is a linear function of $X$: +writing $\varepsilon = Y-\condexp{Y}{X}$, it is easy to see that $\expt{X\varepsilon}=0$. \subsection{Linear regression} We assume a linear model: |
