| author | Thibaut Horel <thibaut.horel@gmail.com> | 2012-01-19 11:03:32 -0800 |
|---|---|---|
| committer | Thibaut Horel <thibaut.horel@gmail.com> | 2012-01-19 11:03:32 -0800 |
| commit | 72c43954f3f4495b0cb3204c378b08f5ce2bbab6 (patch) | |
| tree | 0c816518fa265a59bfa3b82c58467f99610b7d4e /notes.tex | |
| parent | a8b368584371d773deb0288c69606c9f14660342 (diff) | |
| download | recommendation-72c43954f3f4495b0cb3204c378b08f5ce2bbab6.tar.gz | |
Taking into account Stratis's suggestions.
Diffstat (limited to 'notes.tex')
| -rw-r--r-- | notes.tex | 55 |
1 files changed, 25 insertions, 30 deletions
```diff
@@ -7,6 +7,7 @@
 \newcommand{\expt}[1]{\mathop{\mathbb{E}}\left[#1\right]}
 \newcommand{\norm}[1]{\lVert#1\rVert}
 \newcommand{\tr}[1]{#1^*}
+\newcommand{\ip}[2]{\langle #1, #2 \rangle}
 \newcommand{\mse}{\mathop{\mathrm{MSE}}}
 
 \begin{document}
@@ -102,10 +103,10 @@ verification). For any invertible matrix $A$:
 \end{equation}
 $A^{-1}$ is the inverse of a positive semidefinite matrix. Hence it is
-also positive semidefinite. We will denote by $x\cdot y$ the scalar product defined by $A^{-1}$,
+also positive semidefinite. We will denote by $\ip{x}{y}$ the scalar product defined by $A^{-1}$,
 that is:
 \begin{displaymath}
-  x\cdot y = \tr x A^{-1} y = \tr y A^{-1} x
+  \ip{x}{y} = \tr x A^{-1} y = \tr y A^{-1} x
 \end{displaymath}
 Using \eqref{eq:inverse} we get:
@@ -113,13 +114,13 @@ Using \eqref{eq:inverse} we get:
 \begin{split}
 \tr x (A + x_0\tr x_0)^{-1} x & = \tr x A^{-1} x - \frac{\tr x A^{-1}x_0\tr x_0A^{-1} x}{1+\tr x_0 A^{-1}x_0 }\\
-& = \tr x A^{-1} x - \frac{(x\cdot x_0)^2}{1+\norm{x_0}^2}
+& = \tr x A^{-1} x - \frac{\ip{x}{x_0}^2}{1+\norm{x_0}^2}
 \end{split}
 \end{displaymath}
 Thus:
 \begin{equation}\label{eq:decrease}
-\mse(D_n\cup\{x_0\}) = \mse(D_n) - \frac{\sigma^2(x\cdot x_0)^2}{1+\norm{x_0}^2}
+\mse(D_n\cup\{x_0\}) = \mse(D_n) - \frac{\sigma^2\ip{x}{x_0}^2}{1+\norm{x_0}^2}
 \end{equation}
 \emph{Adding one observation to the database decreases the MSE.}
@@ -147,58 +148,52 @@ the database is:
   \frac{\sigma^2(\tr x A^{-1} x_0)^2}{1+\tr x_0 A^{-1} x_0}
 \end{displaymath}
-If we denote by $y$ the additional observation present in $D_m$ and
+If we denote by $z$ the additional observation present in $D_m$ and
 not in $D_n$, then we would like to prove that:
 \begin{displaymath}
   \frac{\sigma^2(\tr x A^{-1} x_0)^2}{1+\tr x_0 A^{-1} x_0}
-\geq \frac{\sigma^2\left(\tr x (A+y\tr y)^{-1} x_0\right)^2}{1+\tr x_0 (A+y\tr y)^{-1} x_0}
+\geq \frac{\sigma^2\left(\tr x (A+z\tr z)^{-1} x_0\right)^2}{1+\tr x_0 (A+z\tr z)^{-1} x_0}
 \end{displaymath}
 Using the same notations as before, this is equivalent to:
 \begin{displaymath}
-  \frac{(x\cdot x_0)^2}{1+\norm{x_0}^2}
-\geq \frac{\left(\left(1+\norm{y}^2\right)(x\cdot x_0)-(x\cdot
-  y)(y\cdot x_0)\right)^2}
-{\left(1+\norm{y}^2\right)\big((1+\norm{y}^2)(1+\norm{x_0}^2)-(x_0\cdot
-y)^2\big)}
+  \frac{\ip{x}{x_0}^2}{1+\norm{x_0}^2}
+\geq \frac{\left(\left(1+\norm{z}^2\right)\ip{x}{x_0}-\ip{x}{z}\ip{z}{x_0}\right)^2}
+{\left(1+\norm{z}^2\right)\big((1+\norm{z}^2)(1+\norm{x_0}^2)-\ip{x_0}{z}^2\big)}
 \end{displaymath}
 By the Cauchy-Schwarz inequality:
 \begin{displaymath}
-  (1+\norm{y}^2)(1+\norm{x_0}^2)-(x_0\cdot
-y)^2 > 0
+  (1+\norm{z}^2)(1+\norm{x_0}^2)-\ip{x_0}{z}^2 > 0
 \end{displaymath}
-Thus the previous inequality is consecutively equivalent to:
+Thus the previous inequality is consequently equivalent to:
 \begin{multline*}
-  \left(1+\norm{y}^2\right)^2\left(1+\norm{x_0}^2\right)(x\cdot x_0)^2
--\left(1+\norm{y}^2\right)(x_0\cdot y)^2(x\cdot x_0)^2\\
-\geq \left(1+\norm{x_0}^2\right)\left(1+\norm{y}^2\right)^2(x\cdot
-x_0)^2
-+ \left(1+\norm{x_0}^2\right)(x\cdot y)^2(y\cdot x_0)^2\\
--2\left(1+\norm{x_0}^2\right)\left(1+\norm{y}^2\right)(x\cdot
-x_0)(x\cdot y)(y\cdot x_0)
+  \left(1+\norm{z}^2\right)^2\left(1+\norm{x_0}^2\right)\ip{x}{x_0}^2
+-\left(1+\norm{z}^2\right)\ip{x_0}{z}^2\ip{x}{x_0}^2\\
+\geq \left(1+\norm{x_0}^2\right)\left(1+\norm{z}^2\right)^2\ip{x}{x_0}^2
++ \left(1+\norm{x_0}^2\right)\ip{x}{z}^2\ip{z}{x_0}^2\\
+-2\left(1+\norm{x_0}^2\right)\left(1+\norm{z}^2\right)\ip{x}{x_0}\ip{x}{z}\ip{z}{x_0}
 \end{multline*}
 \begin{multline*}
-2\left(1+\norm{x_0}^2\right)\left(1+\norm{y}^2\right)(x\cdot
-x_0)(x\cdot y)(y\cdot x_0)\\
-\geq \left(1+\norm{x_0}^2\right)(x\cdot y)^2(y\cdot x_0)^2
-+ \left(1+\norm{y}^2\right)(x_0\cdot y)^2(x\cdot x_0)^2
+2\left(1+\norm{x_0}^2\right)\left(1+\norm{z}^2\right)\ip{x}{x_0}\ip{x}{z}\ip{z}{x_0}\\
+\geq \left(1+\norm{x_0}^2\right)\ip{x}{z}^2\ip{z}{x_0}^2
++ \left(1+\norm{z}^2\right)\ip{x_0}{z}^2\ip{x}{x_0}^2
 \end{multline*}
-This last inequality is not true in general. As soon as $x$, $y$ and
+This last inequality is not true in general. As soon as $x$, $x_0$ and
 $z$ span a 2-dimensional space, it is possible that for example
-$(x\cdot x_0)$ and $(x\cdot y)$ are positive and $(y\cdot x_0)$
+$\ip{x}{x_0}$ and $\ip{x}{z}$ are positive and $\ip{z}{x_0}$
 negative. Then the left term of the inequality will be negative and
 cannot be greater than the right term which is always positive.
 
-In the one-dimensional case, the inner product $(x\cdot y)$ can be
-written as $\lambda xy$ for some positive $\lambda$. Then the last
+In the one-dimensional case, the inner product $\ip{x}{z}$ can be
+written as $\lambda xz$ for some positive $\lambda$. Then the last
 inequality becomes:
 \begin{displaymath}
-  2\geq \frac{\lambda y^2}{1+\lambda y^2}
+  2\geq \frac{\lambda z^2}{1+\lambda z^2}
   + \frac{\lambda x_0^2}{1+\lambda x_0^2}
 \end{displaymath}
 which is trivially true (a more direct proof for the one-dimensional
```
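The identity rewritten in the second and third hunks is the standard rank-one (Sherman-Morrison) update of A^{-1}; it is what makes the MSE decrease in the equation labelled `eq:decrease` non-negative. The NumPy sketch below is not part of notes.tex: it only spot-checks that identity numerically, with an arbitrary positive definite A, random x and x_0, and sigma^2 taken to be 1 for illustration.

```python
import numpy as np

# Check the rank-one update behind eq:decrease:
#   x^T (A + x0 x0^T)^{-1} x = x^T A^{-1} x - <x, x0>^2 / (1 + ||x0||^2),
# where <u, v> = u^T A^{-1} v and ||u||^2 = <u, u>.

rng = np.random.default_rng(0)
d = 4
M = rng.normal(size=(d, d))
A = np.eye(d) + M @ M.T                  # an arbitrary positive definite A
x, x0 = rng.normal(size=d), rng.normal(size=d)

Ainv = np.linalg.inv(A)
lhs = x @ np.linalg.inv(A + np.outer(x0, x0)) @ x
rhs = x @ Ainv @ x - (x @ Ainv @ x0) ** 2 / (1.0 + x0 @ Ainv @ x0)
assert np.isclose(lhs, rhs)

# Since the MSE is sigma^2 * x^T A^{-1} x for the matrix A associated with
# the database, the drop from adding x0 is sigma^2 * <x, x0>^2 / (1 + ||x0||^2),
# which is non-negative: adding an observation never increases the MSE.
```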
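The failure of monotonicity discussed in the last hunk can also be made concrete. The sketch below is again illustrative rather than taken from the repository: it assumes A = I (so the A^{-1} inner product is the plain dot product), picks x = (1, 0), x_0 = (1, 2), z = (1, -2) so that <x, x_0> and <x, z> are positive while <z, x_0> is negative, and observes that adding x_0 then decreases the MSE by more for the larger database. A randomized loop checks the one-dimensional claim. The helper name `mse_decrease` and sigma^2 = 1 are assumptions made for the example.

```python
import numpy as np

sigma2 = 1.0  # illustrative value of sigma^2


def mse_decrease(A, x, x0):
    """sigma^2 * <x, x0>^2 / (1 + ||x0||^2) with <u, v> = u^T A^{-1} v,
    i.e. the drop MSE(D) - MSE(D + {x0}) from eq:decrease."""
    Ainv = np.linalg.inv(A)
    return sigma2 * (x @ Ainv @ x0) ** 2 / (1.0 + x0 @ Ainv @ x0)


# 2-d counterexample: with A = I, <x, x0> > 0 and <x, z> > 0 but <z, x0> < 0.
# The decrease from adding x0 is *larger* once z is already in the database,
# so the claimed inequality fails.
x = np.array([1.0, 0.0])
x0 = np.array([1.0, 2.0])
z = np.array([1.0, -2.0])
A = np.eye(2)
without_z = mse_decrease(A, x, x0)                 # ~ 0.167
with_z = mse_decrease(A + np.outer(z, z), x, x0)   # ~ 0.5
assert without_z < with_z

# 1-d case: the decrease is monotone, as argued at the end of the diff.
rng = np.random.default_rng(1)
for _ in range(1000):
    a = rng.uniform(0.1, 5.0)            # A = [[a]], so lambda = 1/a > 0
    x1, x01, z1 = rng.normal(size=3)
    A1 = np.array([[a]])
    d_small = mse_decrease(A1, np.array([x1]), np.array([x01]))
    d_large = mse_decrease(A1 + z1 ** 2, np.array([x1]), np.array([x01]))
    assert d_small >= d_large - 1e-12
```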
