| field | value | date |
|---|---|---|
| author | Thibaut Horel <thibaut.horel@gmail.com> | 2012-01-16 18:32:54 -0800 |
| committer | Thibaut Horel <thibaut.horel@gmail.com> | 2012-01-16 18:32:54 -0800 |
| commit | 424a6e62941f77c0633beb46c1314679de69f366 | |
| tree | 4187d6802aa517421275b3a9ee9cc5af143bb75f /notes.tex | |
| parent | 8d311a511a7c673698b08d48e80fa19dc8247a71 | |
More details added to the notes
Diffstat (limited to 'notes.tex')
| -rw-r--r-- | notes.tex | 8 |
1 files changed, 5 insertions, 3 deletions
@@ -25,19 +25,21 @@
 vector of explanatory variables $x$. The cost of the regression error will be
 measured by the MSE:
 \begin{displaymath}
-    \mathrm{MSE}(f_n) = \expt{\big(f_n(x)-y\big)^2}
+    \mse(f_n) = \expt{\big(f_n(x)-y\big)^2}
 \end{displaymath}
 The general goal is to understand how the size of the database impacts
 the MSE of the derived regression function.
 
 \subsection{From the bivariate normal case to linear regression}
 
-If $(X,Y)$ is drawn from a bivariate normal distribution. Then, one can
+If $(X,Y)$ is drawn from a bivariate normal distribution with mean
+vector $\mu$ and covariance matrix $\Sigma$. Then, one can
 write:
 \begin{displaymath}
     Y = \condexp{Y}{X} + \big(Y-\condexp{Y}{X}\big)
 \end{displaymath}
-
+In this particular case, $\condexp{Y}{X}$ is a linear function of $X$:
+writing $\varepsilon = Y-\condexp{Y}{X}$, it is easy to see that $\expt{X\varepsilon}=0$.
 
 \subsection{Linear regression}
 We assume a linear model:
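
Note on the added lines: the linearity of $\condexp{Y}{X}$ and the orthogonality $\expt{X\varepsilon}=0$ are stated without proof, so a short worked derivation may help. This is only a sketch and is not part of the commit. The component names $\mu_X$, $\mu_Y$, $\sigma_X^2$, $\sigma_{XY}$ are notation assumed here for the entries of $\mu$ and $\Sigma$, and `\expt` and `\condexp` are taken to be the expectation and conditional-expectation macros from the preamble of notes.tex, which this diff does not show.

```latex
% Assumed notation (not in the commit): \mu = (\mu_X, \mu_Y) and
% \Sigma has entries \sigma_X^2, \sigma_{XY}, \sigma_{XY}, \sigma_Y^2, with \sigma_X^2 > 0.
% Standard formula for the conditional mean of a bivariate normal:
\begin{displaymath}
    \condexp{Y}{X} = \mu_Y + \frac{\sigma_{XY}}{\sigma_X^2}\,(X-\mu_X)
\end{displaymath}
% Orthogonality of the residual \varepsilon = Y - \condexp{Y}{X},
% obtained by conditioning on X (tower property):
\begin{displaymath}
    \expt{X\varepsilon}
    = \expt{\condexp{X\varepsilon}{X}}
    = \expt{X\,\condexp{\varepsilon}{X}}
    = \expt{X\,\big(\condexp{Y}{X}-\condexp{Y}{X}\big)}
    = 0
\end{displaymath}
```

The second display relies only on the tower property, so it does not depend on normality; the Gaussian assumption is what makes the conditional mean in the first display linear in $X$.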
