\documentclass[10pt]{article}
\usepackage{amsmath, amsthm, amssymb}
\usepackage[utf8]{inputenc}
\usepackage[pagebackref=false,breaklinks=true,colorlinks=true]{hyperref}
\usepackage{microtype}
\usepackage[capitalize, noabbrev]{cleveref}
\usepackage{algorithm}
\usepackage{algpseudocode}
\usepackage{graphicx}
%\usepackage{geometry}
\newcommand{\N}{\mathcal{N}}
\newcommand{\I}{\mathcal{I}}
\newcommand{\R}{\mathbf{R}}
\newcommand{\eps}{\varepsilon}
\newcommand{\defeq}{:=}
\renewcommand{\O}{\mathrm{OPT}}
\newcommand{\E}{\mathbb{E}}
\renewcommand{\P}{\mathbb{P}}
\renewcommand{\epsilon}{\varepsilon}
\newcommand{\inprod}[1]{\left\langle #1 \right\rangle}
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{definition}[theorem]{Definition}
\theoremstyle{remark}
\newtheorem*{remark}{Remark}
\newcommand{\INPUT}{\item[{\bf Input:}]}
\newcommand{\OUTPUT}{\item[{\bf Output:}]}
\newcommand{\RR}{{\mathbb R}}
\newcommand{\myvec}[1]{\mathbf{#1}}
\DeclareMathOperator*{\argmax}{arg\,max}
\newcommand{\ignore}[1]{}%
\title{Proof of Space Systems}
\author{Thibaut Horel}
\begin{document}
\maketitle
\section{Introduction}
\paragraph{Proofs of Work.} Proofs of Work (PoW) were introduced in the early 90s \cite{pow} as a way to restrict access to a shared resource by forcing a service requester to dedicate a significant amount of computational work to every request. The original application was fighting email spam: email servers would require an email to be accompanied by a proof of work (computed by the sending email server) which would be easy to verify but would require non-trivial computation to construct. Nowadays, the most famous application of PoW is the Bitcoin digital currency, where miners compete with each other to provide the best proof of work and add blocks of transactions to the blockchain.
The simplest PoW---and approximately the one used in Bitcoin---is, given a hash function $\mathcal{H}$, finding a string $s$ such that $\mathcal{H}(s)$ starts with $t$ zeros (where $t$ is a pre-specified parameter).
\paragraph{Proofs of Space.} Proofs of Space (PoS) have recently been introduced in \cite{pos} as an alternative to Proofs of Work. In Proofs of Space, the service requires the service requester to dedicate disk space instead of computational work to every request. Spacemint \cite{spacemint} is a recent cryptocurrency proposal where PoW (as in Bitcoin) is replaced by PoS. A toy example of PoS is the following: the prover (the service requester) stores the inverse value table of a hash function:
\begin{center}
\begin{tabular}{|c|c|c|c|} \hline $f(x_1)$ & $f(x_2)$ & \ldots & $f(x_n)$\\ \hline $x_1$ & $x_2$ & \ldots & $x_n$\\ \hline \end{tabular}
\end{center}
The prover can then prove that he allocated the space by answering challenges: a challenge is a random value in the output space of the hash function $f$ (some $f(x)$ for a random $x$). The answer to the challenge consists of a preimage of $f(x)$, \emph{i.e.}, a $y$ such that $f(y) = f(x)$. Unless the prover stores the entire value table, answering the challenge would amount to breaking the second-preimage resistance of the hash function. On the other hand, if the prover has honestly allocated the space, answering the challenge can be done with negligible computation (for example in $O(1)$ time assuming implicit indexing). Unfortunately, the above approach of storing the value table of a hash function does not have desirable cryptographic guarantees, as a dishonest prover could cheat by trading off disk space for computation time in a non-trivial way (see \cite{pos} for details). In \cite{pos}, the PoS instead relies on the notion of \emph{hard-to-pebble} graphs. In this approach, the prover generates a graph from a known family of directed acyclic graphs.
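Before turning to the graph construction, the toy lookup-table scheme above can be made concrete with a minimal sketch in Go (our implementation language). The function $f$, its domain size, and all names here are illustrative stand-ins, not part of the actual system: $f$ is taken to be the first 8 bytes of SHA-256 restricted to a small input domain.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// f is a toy stand-in for the hash function: the first 8 bytes of
// SHA-256, applied to inputs 0..n-1 (illustrative only).
func f(x uint64) uint64 {
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], x)
	h := sha256.Sum256(buf[:])
	return binary.BigEndian.Uint64(h[:8])
}

// initialize builds the inverse value table f(x) -> x that the honest
// prover stores; this is the allocated space.
func initialize(n uint64) map[uint64]uint64 {
	table := make(map[uint64]uint64, n)
	for x := uint64(0); x < n; x++ {
		table[f(x)] = x
	}
	return table
}

// answer responds to a challenge f(x) with a preimage in O(1) by a
// table lookup; a prover without the table would have to search the
// whole domain.
func answer(table map[uint64]uint64, challenge uint64) (uint64, bool) {
	y, ok := table[challenge]
	return y, ok
}

func main() {
	table := initialize(1 << 16)
	challenge := f(12345) // the verifier picks a random x and sends f(x)
	y, ok := answer(table, challenge)
	fmt.Println(ok, f(y) == challenge) // an honest prover always succeeds
}
```

Note that the table maps hash outputs to inputs, so any stored preimage (not necessarily the verifier's $x$) is a valid answer.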
Then the graph is labeled in the following way: the label $\ell_v$ of vertex $v$ is defined by
\begin{equation} \label{eq:foo} \ell_v = \textsf{hash}(v, \ell_{p_1},\dots,\ell_{p_t}) \end{equation}
where $p_1,\dots,p_t$ denote the parents of vertex $v$. The prover can then prove that he allocated the space by being challenged about a random vertex $v$. The answer to the challenge is then the label of $v$ as well as the labels of its parents. The validity of the proof can then be verified by checking that Equation~\eqref{eq:foo} is satisfied by the answered labels. In this project, we investigate PoS from a systems perspective: a PoS can indeed be seen as a data system where the data is owned by the prover, who then answers queries. As in any data system, one can ask the following question of PoS systems:
\begin{center} \emph{How to design both the data layout and data access methods of a Proof of Space system to be able to answer queries (challenges) efficiently and in a reliable way?} \end{center}
The organisation of this report is as follows: after giving the necessary background on PoS in Section~\ref{sec:background}, we discuss design decisions and trade-offs for the data layout and data access methods of a PoS system in Section~\ref{sec:design}. In Section~\ref{sec:evaluation}, we present the experimental evaluation we conducted. Finally, we conclude with future research directions.
\section{Background} \label{sec:background}
In this section we give a high-level overview of PoS. Only the aspects which are relevant to understanding PoS as a data system are presented; the details of the construction (and the theoretical guarantees) can be found in \cite{pos}.
\paragraph{Proofs of Space.} A PoS system is an interactive protocol between two parties, the prover and the verifier. This protocol works in two distinct phases: \begin{itemize} \item \emph{initialization phase:} in this phase the prover allocates some disk space and writes some data $S$ to it.
He then sends to the verifier a commitment $\phi(S)$ to this data. \item \emph{proof phase:} at any later time, the verifier can initiate a proof phase by sending a challenge $c$ to the prover. The prover computes an answer $a(c, S)$ to the challenge $c$ using the data $S$ in his allocated space. The verifier then verifies that the answer is a valid answer to challenge $c$ and is consistent with the original commitment $\phi(S)$. \end{itemize}
The initialization phase (and more precisely, the commitment) prevents the prover from cheating: once the space has been allocated and data written to it, the same data has to be used to answer challenges; the space cannot be re-used for other purposes.
\paragraph{Hard-to-pebble graphs.} As mentioned in the introduction, the data that the prover writes to disk when allocating disk space is a directed acyclic graph from a known family of so-called \emph{hard-to-pebble} graphs. The contents of the vertices of the graph are labels (outputs of a hash function) computed as in Equation~\eqref{eq:foo}.
\paragraph{Merkle tree commitment.} The commitment $\phi(S)$ is the root of a Merkle tree \cite{merkle} computed on top of the vertices of the graph. For simplicity, we assume that the graph has $n = 2^k$ nodes. The Merkle tree is then a complete binary tree computed on top of these $2^k$ nodes: the bottom layer of the tree (the leaves) consists of the vertices of the hard-to-pebble graph. Each of the $2^k-1$ internal nodes of the Merkle tree contains the hash of the concatenation of its two children. The prover computes and stores (it will be needed during the proof phase) the entire Merkle tree and sends the root as the commitment $\phi(S)$ to the verifier.
\paragraph{Merkle tree proof.} During the proof phase, the verifier sends a challenge $c$ to the prover. The challenge is the index of a randomly chosen node of the hard-to-pebble graph, \emph{i.e.} a random leaf in the Merkle tree.
The answer to this challenge is a \emph{proof of inclusion} of the leaf in the tree: the prover sends the content of the leaf as well as the content of the sibling of each node on the path from the leaf to the root. Upon receiving this proof, the verifier can compute the content of all the nodes on the path from the leaf to the root and verify that the computed root coincides with the commitment $\phi(S)$ that the prover sent during the initialization phase.
\paragraph{} An illustration of a Merkle tree commitment and a proof of inclusion is given in Figure~\ref{fig:proof}. The contents of the nodes of the tree are outputs of a hash function. Our implementation uses the SHA-256 hash function, where each hash is 256 bits (32 bytes) long. For a graph with $n=2^k$ nodes, the height of the tree is $k$ and the tree contains $2^{k+1}-1$ nodes. Table~\ref{table} shows the amount of allocated space for different tree heights: the height of the tree is a parameter that the prover chooses depending on the amount of space he wishes to allocate.
\begin{table} \begin{center} \begin{tabular}{rl} Tree height & Data Size\\ \hline 25 & 1\,GB\\ 30 & 32\,GB\\ 32 & 128\,GB\\ 35 & 1\,TB \end{tabular} \end{center} \caption{Size of the data written to the allocated disk space for different tree heights.} \label{table} \end{table}
\begin{figure} \begin{center} \includegraphics[width=0.7\textwidth]{graph7.pdf} \end{center} \caption{A Merkle tree commitment and proof of inclusion. The bottom layer of the tree (the leaves) consists of the vertices of the hard-to-pebble graph of the PoS system. Each internal node in the tree contains the hash of the concatenation of its two children. The commitment $\phi(S)$ is the root of the tree. A challenge is a random leaf in the tree (here in red). The answer to the challenge is the leaf itself as well as the green nodes: the siblings of the nodes on the path from the root to the leaf.
With this information, the verifier can compute the blue nodes from the bottom up (by computing the hash of the concatenation of the two children) and eventually compute the content of the root.} \label{fig:proof} \end{figure}
\section{Design} \label{sec:design}
The following are desirable properties for a PoS system: \begin{itemize} \item efficient disk space allocation (tree construction): in a typical application of PoS, the tree construction happens once and the tree is then used repeatedly to answer challenges during multiple proof phases. Hence, the tree construction can be seen as a one-time cost and might not seem particularly crucial to optimize. However, in a cryptocurrency application such as Spacemint \cite{spacemint}, this one-time cost is a barrier to entry for newcomers and should ideally be as low as possible. Part of the appeal of a PoS-based cryptocurrency is that anyone can start mining using an unused hard drive, as opposed to acquiring dedicated hardware as is now required to mine bitcoins. \item efficient proof extraction: when queried (challenged) for a random leaf, the prover should be able to efficiently extract the proof from the Merkle tree. If this requires non-trivial computation, it defeats the purpose of using a PoS instead of a PoW. \item efficient verification: it should be easy to verify proofs so that catching dishonest provers can be achieved with sufficient probability of success. \end{itemize}
In the following subsections, we describe and discuss several methods for tree construction and proof extraction. The experimental evaluation is conducted in Section~\ref{sec:evaluation}.
\subsection{Tree construction}
The hard-to-pebble graph is only used to compute the leaves of the Merkle tree, and its structure (the edges) can be considered fixed.
In what follows, we will thus abstract it away as an oracle that we can query during the tree construction to get the content of any leaf, and we will focus on the actual tree construction. It follows from the definition of a Merkle tree (a node contains the hash of its children) that constructing it requires traversing its nodes in an order where the parent of a node is visited after the node itself. Two orders naturally come to mind: breadth-first order and depth-first post-order. Figure~\ref{fig:orders} illustrates these two orders for a tree of height 3.
\begin{figure} \begin{center} \includegraphics[width=0.49\textwidth]{bfs.pdf} \includegraphics[width=0.49\textwidth]{dfs.pdf} \end{center} \caption{Two traversal orders for the Merkle tree construction: breadth-first on the left, and depth-first post-order on the right. In both cases, the nodes are traversed according to their label and written to disk in that order.} \label{fig:orders} \end{figure}
\paragraph{Breadth-first order.} In this order, the tree is written to disk layer by layer, starting from the leaves. Since an entire layer does not fit in memory, the algorithm has to go back and read what has already been written to compute the hashes of the internal nodes: for example, in Figure~\ref{fig:orders}, when reaching node 10, the algorithm has to read from disk the content of nodes 3 and 4 to compute the hash of their concatenation.
\paragraph{Depth-first order.} A naive implementation of this order would also require going back and reading from disk as internal nodes are constructed. However, we note that by keeping track of the right information in memory, the tree can be constructed in one pass without ever reading back: the algorithm only performs writes. Specifically, we maintain a stack of hashes: whenever a subtree is completed, the hash of its root is pushed onto the stack.
By doing so, whenever we reach an internal node, the top two elements on the stack are the hashes of its children: we pop both of them and push the hash of their concatenation. Let us illustrate this with the right tree in Figure~\ref{fig:orders}: when reaching node 13, the subtrees rooted at 7, 10, 11 and 12 have already been completed, so the stack contains (top on the right): $[7, 10, 11, 12]$. The algorithm then pops $11$ and $12$ and replaces them with the hash of their concatenation; the result is also written to disk at offset $13$. The stack now contains $[7, 10, 13]$. We can now compute and write to disk node $14$ by hashing the top two elements of the stack. The stack now contains $[7, 14]$ and we can finally compute the root $15$. Pseudo-code for this construction can be found in Algorithm~\ref{alg:df}. We present it as a recursive function which computes the hash of the root of any subtree. The full Merkle tree can be obtained by calling this function on the root of the tree. Note that in this recursive presentation, the hash stack is implicit; it is automatically maintained by the recursive call stack. For performance reasons, as we will discuss below, the actual implementation is non-recursive and uses an explicit stack.
\begin{algorithm} \caption{Depth-first construction of a subtree of the Merkle tree} \label{alg:df} \begin{algorithmic}[1] \Require $i$ index of the root of the subtree \Ensure the return value $h$ is the hash of the root of the subtree \If{$i$ is a leaf} \State $h\gets$ result of the leaf oracle for $i$ \Else \State $i_1,i_2\gets$ labels of the left and right children of $i$ \State $h_{1}\gets$ recursively call the algorithm for $i_1$ \State $h_{2}\gets$ recursively call the algorithm for $i_2$ \State $h\gets\textsf{hash}(h_{1}, h_2)$ \EndIf \State write $h$ to disk at location $i$ \State\textbf{return} $h$ \end{algorithmic} \end{algorithm}
It is important to note that the size of the stack is proportional to the height of the Merkle tree, \emph{i.e.} it is logarithmic in the data size. For a tree of height 35 (that is, a data size of 1\,TB) the stack size is 1.1\,KB and easily fits in memory (and even in the L1 cache).
\paragraph{Parallelization.} Both constructions are easily parallelizable. For the breadth-first order, the parallelization occurs level by level: each level is split into contiguous chunks of nodes, and each chunk is sent to a different processing element. For the depth-first order, the parallelization occurs subtree by subtree: all subtrees of a given height can be computed in parallel (each processing element being in charge of computing its own subtree).
\subsection{Proof extraction}
Regardless of the order chosen for the layout of the tree on disk, extracting the proof for a given leaf can be done in a similar manner: we traverse the path from the leaf to the root and, for each node along the path, read the content of its sibling and append it to the proof. We note that from this perspective, the depth-first layout appears to have better data locality: right children are co-located with their parents; this never happens in breadth-first order.
\paragraph{Batch extraction.} An important aspect of proof of space systems is that during the proof phase, the verifier sends multiple challenges to the prover. The number of challenges depends on the security level (probability of failure) required by the application, but as discussed in \cite{spacemint}, for the Spacemint application a typical setting would require answering about 2000 challenges for a data size of 1\,TB. It is then natural to ask whether processing challenges in batches is better from an IO perspective. More precisely, the proofs for two or more leaves might have nodes in common; these nodes only need to be read from disk once instead of once per leaf. This is similar to a shared table scan serving multiple select queries in traditional database systems. In the evaluation section, we will compare three strategies for multiple proof extraction: \begin{itemize} \item \emph{naive:} the proofs are extracted one by one in a sequential manner, following the order in which the challenges are received. \item \emph{sorted:} the challenges are first sorted in left-to-right order at the leaf level. The reason for doing so is that the closer two leaves are, the more nodes their proofs share. By processing the challenges in left-to-right order, we hope to obtain better cache behavior: nodes that are shared are likely to still be in the cache when accessed for the second time. \item \emph{explicit sharing:} here we share reads from disk in an explicit manner. For this, the breadth-first order is easier: we read the tree level by level from the bottom to the top, keeping track of which nodes to read at each level; if two nodes from the previous level require reading the same node at the current level, we only read it once from disk and append it to both proofs.
\end{itemize}
\section{Evaluation} \label{sec:evaluation}
It follows from the discussion in Section~\ref{sec:design} that the depth-first post-order (DF) layout of the Merkle tree has multiple advantages over the breadth-first (BF) order: \begin{itemize} \item for tree construction in BF order, the whole data needs to be brought back from disk to memory during the construction. In contrast, the DF order leads to a write-only tree construction. \item for proof extraction, the DF order has better data locality: some nodes are co-located with their parents. \end{itemize} The only reason why the BF order might be preferable is if explicit sharing for batch extraction results in a significant performance gain (number of challenges answered per second). The goal of this section is to confirm and quantify these effects.
\paragraph{Implementation.} My code can be obtained by cloning the git repository at \url{http://projects.horel.org/thibaut/pos}. The implementation is in Go so that it can later be integrated with an existing prototype implementation of the Spacemint cryptocurrency available at \url{https://github.com/kwonalbert/spacemint/}. The two implementations currently share close to zero code; I re-implemented some parts of the original code as needed to be able to more easily isolate and benchmark specific effects. The only part which was re-used without any change is the construction of the hard-to-pebble graph, which falls outside the scope of this report.
\paragraph{Experimental setup.} All evaluations were run on my personal laptop with an Intel Core i7-6500U CPU @ 2.5\,GHz and 12\,GB of RAM. The hard drive was a 512\,GB SSD.
\subsection{Tree construction}
In our first experiment, we compare the DF and BF tree constructions for different tree heights. The results can be found in Figure~\ref{fig:const}. As can be seen, the DF order significantly outperforms the BF order, with close to a 25\% running time improvement for the largest tested tree height.
This improvement seems to be mostly explained by cache misses, which confirms our hypothesis that DF is better because it performs fewer reads (since the data is read from disk, a read directly translates into a cache miss). When running this experiment with a single-threaded program, it can be observed that the disk read/write rate is far from maximum capacity while the CPU utilization reaches 100\%. This indicates that parallelization would also result in performance gains. A parallel implementation can be found in the code repository. Due to time constraints, detailed results are not reported here, but preliminary results indicate a significant speedup (linear in the degree of parallelization up to 8 threads). We leave further investigation of the parallelization behavior for future work.
\begin{figure} \begin{center} \includegraphics[width=0.49\textwidth]{time.pdf} \includegraphics[width=0.49\textwidth]{misses.pdf} \vspace{-2em} \end{center} \caption{Running time and cache misses for the DF and BF tree constructions for varying tree heights.} \label{fig:const} \end{figure}
\subsection{Proof extraction}
We first compare the DF and BF orders for the purpose of proof extraction. For this, we generate varying numbers of random challenges (leaf indices) and process these challenges sequentially for both the BF and DF layouts. The results can be found in Figure~\ref{fig:proofs}. We observe two trends: \begin{itemize} \item the per-proof running time decreases as the number of proofs extracted increases. We suspect that this is because as the number of proofs increases, the number of nodes which are shared by proofs also increases. \item proof extraction is faster for the DF order. We suspect that this is because of the better data locality explained in Section~\ref{sec:design}. Further investigation of page misses is deferred to future work.
\end{itemize}
\begin{figure} \begin{center} \includegraphics[width=0.7\textwidth]{proofs} \vspace{-2em} \end{center} \caption{Per-proof extraction time for the DF and BF orders for a tree of height 31.} \label{fig:proofs} \end{figure}
For our final experiment, we compare the different batch proof extraction strategies discussed in Section~\ref{sec:design}. The results can be found in Figure~\ref{fig:proofs2}. Contrary to earlier experiments (which were misleading due to a hot page cache), the best strategy seems to be the explicit sharing of node retrievals, except for very large numbers of proofs.
\begin{figure} \begin{center} \includegraphics[width=0.7\textwidth]{proofs2.pdf} \vspace{-2em} \end{center} \caption{Per-proof extraction time for the different batch extraction strategies for a tree of height 31.} \label{fig:proofs2} \end{figure}
\section*{Conclusion}
We studied Proofs of Space from a data systems perspective: proofs of space are based on an interactive protocol of challenges and answers which is similar to the query/answer interaction of traditional data systems. Hence, the same questions naturally arise: how to design the data layout and access methods to make the system efficient. We discussed, compared and experimentally evaluated different strategies, both for the proof of space construction and for proof retrieval. As a preliminary conclusion, we find that significant performance improvements can be obtained by: (1) using depth-first order instead of breadth-first order, and (2) retrieving proofs in batches and exploiting shared nodes in the proofs to minimize disk IO. However, we are far from having a complete understanding of the different trade-offs: \begin{itemize} \item a more careful investigation of our current experiments needs to be conducted. Some effects have not been completely explained yet, and monitoring specific metrics (page misses, branch misses, etc.) would provide a finer-grained explanation.
Some experiments should be re-run with other parameter settings (different tree heights, a hard disk drive as opposed to an SSD, etc.). Finally, the virtualisation of disk accesses happening at the OS level makes some effects harder to measure; performing pure in-memory experiments would also help in refining our conclusions. \item more complex data layouts could be explored. A natural one which comes to mind is the one used in ``Fractal tree indexes'' \cite{fractal}, which forces the co-location of nodes with their parents. \item we exploited shared nodes between proofs to improve the performance of proof retrieval, but they could also be exploited to reduce the space taken by proofs (in Spacemint these proofs are broadcast to the network, so compressing them would reduce the network load) and to make proof verification more efficient. \end{itemize}
\paragraph{Acknowledgments.} I would like to thank Sunoo Park for introducing me to the notion of Proofs of Space and the Spacemint application, Albert Kwon for a helpful discussion on the prototype implementation of Spacemint, Professor Stratos Idreos for his guidance in turning my vague ideas into a project for his CS265 class at Harvard, and CS265 TA Manos Athanassoulis for his help with the experimental evaluation.
\bibliography{main}
\bibliographystyle{abbrv}
\end{document}