summaryrefslogtreecommitdiffstats
path: root/ps1/main.tex
diff options
context:
space:
mode:
authorThibaut Horel <thibaut.horel@gmail.com>2014-09-09 12:16:35 -0400
committerThibaut Horel <thibaut.horel@gmail.com>2014-09-09 12:16:35 -0400
commit887940665581a2e5c0a864c91c887006a7ac1b2d (patch)
treeddd29c87674ba6c0ee06dd98d1ef1f846377f6b0 /ps1/main.tex
parentcdb9599635c62f95cdf88aa95dac650a436a1f1a (diff)
downloadcs224-887940665581a2e5c0a864c91c887006a7ac1b2d.tar.gz
Adding the final details. Last commit for PS1.
Diffstat (limited to 'ps1/main.tex')
-rw-r--r--ps1/main.tex85
1 files changed, 63 insertions, 22 deletions
diff --git a/ps1/main.tex b/ps1/main.tex
index 27e31e6..a2646e3 100644
--- a/ps1/main.tex
+++ b/ps1/main.tex
@@ -39,27 +39,46 @@
\maketitle
\section*{Problem 1}
-\paragraph{(a)} Statement (6) is wrong. The number of min pointers is
-not necessarily equal to the number of elements. Indeed, when you insert an
-element, you have to insert it in both its own cluster and (possibly) in the
-summary, leading to (possibly) more than one min pointer per element.
+\paragraph{(a)} Statement (6) is wrong. The number of min pointers is not
+necessarily equal to the number of elements. Indeed, inserting an element can
+possibly require creating a min pointer in its cluster and recursively
+inserting its cluster id into the summary.
-\paragraph{(c)} Let us denote by $N(u)$ the number of min pointers that you
-have to create when inserting an element in a $u$-vEB tree. We have two cases:
+\paragraph{(c)} Let us denote by $N(u)$ the number of min pointers created when
+inserting an element in a $u$-vEB tree. We have two cases:
\begin{itemize*}
\item either a cluster already exists for that element, and then you just
- have to insert it in the cluster. In this case: $N(u)
+ have to recusrively insert it in the cluster. In this case: $N(u)
= N(\sqrt{u})$
\item or you need to create a new cluster for this element (which creates
- one min pointer) and insert the cluster id in the summary. In this
- case: $N(u) = 1 + N(\sqrt{u})$.
+ one min pointer) and recursively insert the cluster id in the summary.
+ In this case: $N(u) = 1 + N(\sqrt{u})$.
\end{itemize*}
In both cases, we have $N(u) \leq 1 + N(\sqrt{u})$. Solving this recursion
gives $N(u) = O(\log\log u)$ number of min-pointers per element, which
in turns gives a space complexity of $O(n\log\log u)$ for a $u$-vEB tree.
-\paragraph{(d)}
+\paragraph{(d)} We give a family of examples for $n=\log u$, namely $b(u)$, the
+first $\log(u)$ powers of two ($b(u)\eqdef (2^{\log{u}-1},\ldots, 4, 2,1)$. Let
+us denote by $S(u)$ the number of min pointers created when inserting $b(u)$ in
+a $u$-vEB tree.
+
+We note that when inserting the first $\frac{\log u}{2}$ elements of $b(u)$, we
+will have to create one new cluster for each element and insert the cluster ids
+in the summary. But these cluster ids are exactly $b(\sqrt{u})$. The number of
+min pointers created during this phase is $S(\sqrt{u}) + \frac{\log u}{2}$.
+Then, inserting the last $\frac{\log u}{2}$ elements of $b(u)$ will lead to
+inserting $b(\sqrt{u})$ in a new cluster, whose id is 0, the number of min
+pointers created during this phase is $1 + S(\sqrt{u})$. Hence, the recursion
+takes the following form:
+\begin{displaymath}
+ S(u) = 2S(\sqrt{u}) + 1 + \frac{\log u}{2}
+\end{displaymath}
+Solving it leads to $S(u) = \Omega(\log u\log\log u) = \Omega(n\log\log u)$.
+Since the space complexity of a $u$-vEB tree is equal the number of min
+pointers it contains, we have that the space complexity of the $u$-vEB tree
+containing $b(u)$ is $\Omega(n\log\log u)$.
\paragraph{(e)} We can use indirection to achieve linear space complexity as
follows:
@@ -170,20 +189,42 @@ following procedures:
\end{itemize*}
Finally, we will need one last constant, $c$ which is the concatenation of
-$\sqrt{w}$ blocks of length $\sqrt{w}$, the $k$-th right most block being equal
-to $2^k-1$, $c\eqdef \sum_{k=1}^{\sqrt{w}} (2^k-1)2^{(k-1)\sqrt{w}}$. We first
+$\sqrt{w}$ blocks of length $\sqrt{w}$, the $k$-th rightmost block being equal
+to $2^k-1$:
+\begin{displaymath}
+c\eqdef \sum_{k=1}^{\sqrt{w}} (2^k-1})2^{(k-1)\sqrt{w}}.
+\end{displaymath}
+
+We note that if $x$ is a word of length $\sqrt{w}$, \texttt{repeat(x) \& c}
+will consist of $\sqrt{w}$ clusters of length $\sqrt{w}$; the $k$-th rightmost
+cluster is empty if the rightmost $k$ bits of $x$ are all zero, that is, if the
+least significant bit of $x$ set to one is greater than or equal to $k$. Hence,
+the least significant bit of $x$ set to one is equal to the number of empty
+clusters in \texttt{repeat(x) \& c} which we can easily compute using
+a negation and the \texttt{non-empty-clusters} procedure. This allows us to
define a procedure \texttt{lsb-short} which takes a word $x$ of length
$\sqrt{w}$ and computes its least significant bit set to one:
\begin{center}
- \texttt{lsb-short(x) := count-ones(non-empty-clusters(repeat(x) \& c))}
+ \texttt{lsb-short(x) := count-ones($\sim$ non-empty-clusters(repeat(x) \&
+c)) - 1}
\end{center}
-Then, the least significant bit set to one can be computed by:
-\begin{verbatim}
-def lsb(x):
- z := lsb-short(perfect-sketch(non-empty-clusters(x)))
- return 1 << z + lsb-short((x >> z * \sqrt{w}) & (1 << \sqrt{w} - 1))
-\end{verbatim}
-In the above code, $z$ computes the position of the rightmost non empty
-$\sqrt{w}$-block in $x$. The least significant bit set to one can then be
-obtained by applying \texttt{lsb-short} to this block.
+
+Then, the least significant bit set to one of a word $x$ of length $w$ can be
+computed by first finding the rightmost non-empty cluster of length $w$ in $x$,
+then extract this cluster by shifting and masking, and then finding the least
+significant bit set to one in this cluster:
+
+\vspace{1em}
+
+\noindent\texttt{def lsb(x):}\newline
+\hspace*{1cm}\texttt{z := lsb-short(perfect-sketch(non-empty-clusters(x)))}\newline
+\hspace*{1cm}\texttt{if z == -1:}\newline
+\hspace*{2cm}\texttt{return -1}\newline
+\hspace*{1cm}\texttt{return z * $\sqrt{\texttt{w}}$
+ + lsb-short((x >> z * $\sqrt{w}$) \& (1 << $\sqrt{w}$ - 1))}
+
+Note that the above code only requires a constant number of multiplications and
+bitwise operations once the constants $m$, $a$, $b$ and $c$ have been
+pre-computed.
+
\end{document}