author     jeanpouget-abadie <jean.pougetabadie@gmail.com>  2015-03-19 13:17:41 -0400
committer  jeanpouget-abadie <jean.pougetabadie@gmail.com>  2015-03-19 13:17:41 -0400
commit     c6d6ed2cd4ca205d364b000251ad582715025957 (patch)
tree       0bf05731099b861a4abe1988a5f1337f503a7c5f
parent     657435c83e6a423ffb739966d24c04ac28ee0e5f (diff)
download   learn-optimize-c6d6ed2cd4ca205d364b000251ad582715025957.tar.gz
added few minor stuff to 4
-rw-r--r--  results.tex  47
1 file changed, 29 insertions(+), 18 deletions(-)
diff --git a/results.tex b/results.tex
index e293905..6a2e684 100644
--- a/results.tex
+++ b/results.tex
@@ -156,11 +156,6 @@ and optimize $f$ for any cardinality constraint.
optimize it under any cardinality constraint, with probability $1-e^{-cn}$.
\end{proposition}
-TODO:A much easier result to show is that an $nxn$ matrix is singular with
-probability $1/\sqrt{2}^n$, look at the determinant of random matrix theory pdf
--> and see that by taking the determinant mod 2 -> we can bound the probability
-that the determinant is even?
-
\subsection{Average weight method}
Another method is to compute for every node $i$ the \emph{average weight of
@@ -228,17 +223,15 @@ fraction of the non-zero sets is a big problem
\subsection{Average weight method}
-\subsection{Passive Optimization can be easier than Learning}
-
-Consider the matroid construction from~\cite{balcan2011learning}, which we
-recall here:
+\section{Passive Optimization can be easier than Passive Learning}
+Recall the matroid construction from~\cite{balcan2011learning}:
\begin{theorem}
For any $k$ with $k = 2^{o(n^{1/3})}$, there exists a family of sets ${\cal A}
\subseteq2^{[n]}$ and a family of matroids $\cal{M} = \{M_{\cal{B}} :
\cal{B} \subseteq\cal{A} \}$ such that:
\begin{itemize}
- \item $|\cal{A}| = k$ and $|A| = n^{1/3}$ for every $A \in \cal{A}$
+ \item $|{\cal A}| = k$ and $|A| = n^{1/3}$ for every $A \in \cal{A}$
\item For every $\cal{B} \subseteq\cal{A}$ and every $A \in \cal{A}$, we
have:
\begin{align*}
@@ -249,15 +242,33 @@ recall here:
\end{theorem}
Consider the following subset of the above family of matroids: ${\cal M}^1 =
-\{M_{\cal B} : {\cal B} \subseteq{\cal A} \wedge |{\cal A}\backslash{\cal B}| \geq
-\frac{1}{n}{\cal A}\}$. Suppose we observe sets from a uniform distribution
-over all sets of ${\cal A}$, then after polynomially observations, we will
+\{M_{\cal B} : {\cal B} \subseteq {\cal A} \wedge |{\cal A}\backslash{\cal B}|
+\geq \frac{1}{n}|{\cal A}|\}$ for $k = 2^{n^{1/6}}$. Let $D_u$ be the uniform
+distribution over the sets of ${\cal A}$. Consider an \emph{unknown} function
+$f$, the rank function of one of the matroids in ${\cal M}^1$.
+After $n$ observations $O \equiv (S_i, f(S_i))_i$ with $S_i \sim D_u$, we will
observe at least one element from $\cal{A}\backslash\cal{B}$ with constant
-probability. Note that as long as we observe \emph{one} set from ${\cal A}
-\backslash {\cal B}$, we can optimize the unknown function directly. However,
-it follows from the analysis of~\cite{balcan2011learning} that we cannot learn
-this family of matroid rank functions under any distribution, even with active
-value queries!
+probability:
+
+\begin{equation}
+  \nonumber \mathbb{P}\bigl(\exists\, i : S_i \in {\cal A}\backslash{\cal B}\bigr)
+  \geq 1 - \left(1 - \frac{1}{n}\right)^{n} \geq 1 - \frac{1}{e}
+\end{equation}
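+
+To make the bound explicit (a short derivation, using only the independence of
+the draws $S_i \sim D_u$ and the defining property
+$|{\cal A}\backslash{\cal B}| \geq \frac{1}{n}|{\cal A}|$ of ${\cal M}^1$):
+
+\begin{equation}
+  \nonumber \mathbb{P}\bigl(\forall\, i : S_i \notin {\cal A}\backslash{\cal B}\bigr)
+  = \left(1 - \frac{|{\cal A}\backslash{\cal B}|}{|{\cal A}|}\right)^{n}
+  \leq \left(1 - \frac{1}{n}\right)^{n} \leq \frac{1}{e}.
+\end{equation}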
+
+Note that as long as we observe \emph{one} set from ${\cal A} \backslash {\cal
+B}$, we can optimize $f$ exactly! However, it follows from the analysis
+of~\cite{balcan2011learning} that we cannot learn $f$ under any distribution,
+even with active value queries. Indeed, for any polynomially sized set of
+observations, there is a super-polynomial number of functions in ${\cal M}^1$
+which coincide on these observations but take very different values outside of
+them: $8n^{1/6}$ for $A \in {\cal B}$ and $n^{1/3}$ for $A \in {\cal
+A}\backslash {\cal B}$.
+
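+For a sense of scale (simple arithmetic on the two values above), whenever two
+such indistinguishable functions disagree on a set $A$, their values differ by
+a multiplicative factor of
+\begin{equation}
+  \nonumber \frac{n^{1/3}}{8n^{1/6}} = \frac{n^{1/6}}{8},
+\end{equation}
+which grows polynomially in $n$.
+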
+{\bf TODO:} Notice that the distribution we observe in order to optimize does
+not cover every possible set of size $k$. This is \emph{okay} because we are
+evaluating the learning procedure on this same distribution. But a cleaner,
+simpler example would be nice.
+
\bibliography{optimize} \bibliographystyle{apalike}