\documentclass[10pt]{article}
\usepackage{amsmath, amsthm, amssymb}
\usepackage[utf8]{inputenc}
\usepackage[pagebackref=false,breaklinks=true,colorlinks=true]{hyperref}
\usepackage{microtype}
\usepackage[capitalize, noabbrev]{cleveref}
\usepackage{algorithm}
\usepackage{algpseudocode}
\usepackage{graphicx}
%\usepackage{geometry}
\newcommand{\N}{\mathcal{N}}
\newcommand{\I}{\mathcal{I}}
\newcommand{\R}{\mathbf{R}}
\newcommand{\eps}{\varepsilon}
\newcommand{\defeq}{:=}
\renewcommand{\O}{\mathrm{OPT}}
\newcommand{\E}{\mathbb{E}}
\renewcommand{\P}{\mathbb{P}}
\renewcommand{\epsilon}{\varepsilon}
\newcommand{\inprod}[1]{\left\langle #1 \right\rangle}
\newtheorem{theorem}{Theorem}
\newtheorem{lemma}[theorem]{Lemma}
\newtheorem{definition}[theorem]{Definition}
\theoremstyle{remark}
\newtheorem*{remark}{Remark}
\newcommand{\INPUT}{\item[{\bf Input:}]}
\newcommand{\OUTPUT}{\item[{\bf Output:}]}
\newcommand{\RR}{{\mathbb R}}
\newcommand{\myvec}[1]{\mathbf{#1}}
\DeclareMathOperator*{\argmax}{arg\,max}
\newcommand{\ignore}[1]{}%
\title{Proof of Space Systems}
\author{Thibaut Horel}
\begin{document}
\maketitle
\section{Introduction}
\paragraph{Proofs of Work.} Proofs of Work (PoW) were introduced in the early 90s \cite{pow} as a way to restrict access to a shared resource by forcing a service requester to dedicate a significant amount of computational work to every request. The original application was fighting email spam: email servers would require an email to be accompanied by a proof of work (computed by the sending email server) which would be easy to verify but would require non-trivial computation to construct. Nowadays, the most famous application of PoW is the Bitcoin digital currency, where miners compete with each other to provide the best proof of work and add blocks of transactions to the blockchain.
The simplest PoW---and approximately the one used in Bitcoin---is, given a hash function $\mathcal{H}$, finding a string $s$ such that $\mathcal{H}(s)$ starts with $t$ zeros (where $t$ is a pre-specified parameter).
\paragraph{Proofs of Space.} Proofs of Space (PoS) have recently been introduced in \cite{pos} as an alternative to Proofs of Work. In Proofs of Space, the service requires the service requester to dedicate disk space instead of computational work to every request. Spacemint \cite{spacemint} is a recent cryptocurrency proposal where PoW (as in Bitcoin) is replaced by PoS. A toy example of PoS is the following: the prover (the service requester) stores the inverse value table of a hash function:
\begin{center}
\begin{tabular}{|c|c|c|c|} \hline $f(x_1)$ & $f(x_2)$ & \ldots & $f(x_n)$\\ \hline $x_1$ & $x_2$ & \ldots & $x_n$\\ \hline \end{tabular}
\end{center}
The prover can then prove that he allocated the space by answering challenges: a challenge is a random value in the output space of the hash function $f$ (some $f(x)$ for a random $x$). The answer to the challenge consists of a preimage of $f(x)$, \emph{i.e.}, a $y$ such that $f(y) = f(x)$. Unless the prover stores the entire value table, answering the challenge would amount to breaking the second-preimage resistance of the hash function. On the other hand, if the prover has honestly allocated the space, answering the challenge can be done with negligible computation (for example in $O(1)$ time assuming implicit indexing). Unfortunately, the above approach of storing the value table of a hash function does not have desirable cryptographic guarantees, as a dishonest prover could cheat by trading off disk space for computation time in a non-trivial way (see \cite{pos} for details). In \cite{pos}, the PoS instead relies on the notion of \emph{hard-to-pebble} graphs. In this approach, the prover generates a graph from a known family of directed acyclic graphs.
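Before turning to the graph construction, the toy lookup-table scheme above can be made concrete with a minimal sketch in Go (our implementation language). The function $f$, its domain size, and all names here are illustrative stand-ins, not part of the actual system: $f$ is taken to be the first 8 bytes of SHA-256 restricted to a small input domain.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// f is a toy stand-in for the hash function: the first 8 bytes of
// SHA-256, applied to inputs 0..n-1 (illustrative only).
func f(x uint64) uint64 {
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], x)
	h := sha256.Sum256(buf[:])
	return binary.BigEndian.Uint64(h[:8])
}

// initialize builds the inverse value table f(x) -> x that the honest
// prover stores; this is the allocated space.
func initialize(n uint64) map[uint64]uint64 {
	table := make(map[uint64]uint64, n)
	for x := uint64(0); x < n; x++ {
		table[f(x)] = x
	}
	return table
}

// answer responds to a challenge f(x) with a preimage in O(1) by a
// table lookup; a prover without the table would have to search the
// whole domain.
func answer(table map[uint64]uint64, challenge uint64) (uint64, bool) {
	y, ok := table[challenge]
	return y, ok
}

func main() {
	table := initialize(1 << 16)
	challenge := f(12345) // the verifier picks a random x and sends f(x)
	y, ok := answer(table, challenge)
	fmt.Println(ok, f(y) == challenge) // an honest prover always succeeds
}
```

Note that the table maps hash outputs to inputs, so any stored preimage (not necessarily the verifier's $x$) is a valid answer.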
Then the graph is labeled in the following way: the label $\ell_v$ of vertex $v$ is defined by
\begin{equation} \label{eq:foo} \ell_v = \textsf{hash}(v, \ell_{p_1},\dots,\ell_{p_t}) \end{equation}
where $p_1,\dots,p_t$ denote the parents of vertex $v$. The prover can then prove that he allocated the space by being challenged about a random vertex $v$. The answer to the challenge is then the label of $v$ as well as the labels of its parents. The validity of the proof can then be verified by checking that Equation~\eqref{eq:foo} is satisfied by the answered labels. In this project, we investigate PoS from a systems perspective: a PoS can indeed be seen as a data system where the data is owned by the prover, who then answers queries. As in any data system, one can ask the following question of PoS systems:
\begin{center} \emph{How to design both the data layout and data access methods of a Proof of Space system to be able to answer queries (challenges) efficiently and in a reliable way?} \end{center}
The organisation of this report is as follows: after giving the necessary background on PoS in Section~\ref{sec:background}, we discuss design decisions and trade-offs for the data layout and data access methods of a PoS system in Section~\ref{sec:design}. In Section~\ref{sec:evaluation}, we present the experimental evaluation we conducted. Finally, we conclude with future research directions.
\section{Background} \label{sec:background}
In this section we give a high-level overview of PoS. Only the aspects which are relevant to understanding PoS as a data system are presented; the details of the construction (and the theoretical guarantees) can be found in \cite{pos}.
\paragraph{Proofs of Space.} A PoS system is an interactive protocol between two parties, the prover and the verifier. This protocol works in two distinct phases: \begin{itemize} \item \emph{initialization phase:} in this phase the prover allocates some disk space and writes some data $S$ to it.
He then sends to the verifier a commitment $\phi(S)$ to this data. \item \emph{proof phase:} at any later time, the verifier can initiate a proof phase by sending a challenge $c$ to the prover. The prover computes an answer $a(c, S)$ to the challenge $c$ using the data $S$ in his allocated space. The verifier then verifies that the answer is a valid answer to challenge $c$ and is consistent with the original commitment $\phi(S)$. \end{itemize}
The initialization phase (and more precisely, the commitment) prevents the prover from cheating: once the space has been allocated and data written to it, the same data has to be used to answer challenges; the space cannot be re-used for other purposes.
\paragraph{Hard-to-pebble graphs.} As mentioned in the introduction, the data that the prover writes to disk when allocating disk space is a directed acyclic graph from a known family of so-called \emph{hard-to-pebble} graphs. The contents of the vertices of the graph are labels (outputs of a hash function) computed as in Equation~\eqref{eq:foo}.
\paragraph{Merkle tree commitment.} The commitment $\phi(S)$ is the root of a Merkle tree \cite{merkle} computed on top of the vertices of the graph. For simplicity, we assume that the graph has $n = 2^k$ nodes. The Merkle tree is then a complete binary tree computed on top of these $2^k$ nodes: the bottom layer of the tree (the leaves) consists of the vertices of the hard-to-pebble graph. Each of the $2^k-1$ internal nodes of the Merkle tree contains the hash of the concatenation of its two children. The prover computes and stores (it will be needed during the proof phase) the entire Merkle tree and sends the root as the commitment $\phi(S)$ to the verifier.
\paragraph{Merkle tree proof.} During the proof phase, the verifier sends a challenge $c$ to the prover. The challenge is the index of a randomly chosen node of the hard-to-pebble graph, \emph{i.e.} a random leaf in the Merkle tree.
The answer to this challenge is a \emph{proof of inclusion} of the leaf in the tree: the prover sends the content of the leaf as well as the content of the sibling of each node on the path from the leaf to the root. Upon receiving this proof, the verifier can compute the content of all the nodes on the path from the leaf to the root and verify that the computed root coincides with the commitment $\phi(S)$ that the prover sent during the initialization phase.
\paragraph{} An illustration of a Merkle tree commitment and a proof of inclusion is given in Figure~\ref{fig:proof}. The contents of the nodes of the tree are outputs of a hash function. Our implementation uses the SHA-256 hash function, where each hash is 256 bits (32 bytes) long. For a graph with $n=2^k$ nodes, the height of the tree is $k$ and the tree contains $2^{k+1}-1$ nodes. Table~\ref{table} shows the amount of allocated space for different tree heights: the height of the tree is a parameter that the prover chooses depending on the amount of space he wishes to allocate.
\begin{table} \begin{center} \begin{tabular}{rl} Tree height & Data Size\\ \hline 25 & 1\,GB\\ 30 & 32\,GB\\ 32 & 128\,GB\\ 35 & 1\,TB \end{tabular} \end{center} \caption{Size of the data written to the allocated disk space for different tree heights.} \label{table} \end{table}
\begin{figure} \begin{center} \includegraphics[width=0.7\textwidth]{graph7.pdf} \end{center} \caption{A Merkle tree commitment and proof of inclusion. The bottom layer of the tree (the leaves) consists of the vertices of the hard-to-pebble graph of the PoS system. Each internal node in the tree contains the hash of the concatenation of its two children. The commitment $\phi(S)$ is the root of the tree. A challenge is a random leaf in the tree (here in red). The answer to the challenge is the leaf itself as well as the green nodes: the siblings of the nodes on the path from the root to the leaf.
With this information, the verifier can compute the blue nodes from the bottom up (by computing the hash of the concatenation of the two children) and eventually compute the content of the root.} \label{fig:proof} \end{figure}
\section{Design} \label{sec:design}
The following are desirable properties for a PoS system: \begin{itemize} \item efficient disk space allocation (tree construction): in a typical application of PoS, the tree construction happens once and the tree is then used repeatedly to answer challenges during multiple proof phases. Hence, the tree construction can be seen as a one-time cost and might not seem particularly crucial to optimize. However, in a cryptocurrency application such as Spacemint \cite{spacemint}, this one-time cost is a barrier to entry for newcomers and should ideally be as low as possible. Part of the appeal of a PoS-based cryptocurrency is that anyone can start mining using an unused hard drive, as opposed to acquiring dedicated hardware as is now required to mine bitcoins. \item efficient proof extraction: when queried (challenged) for a random leaf, the prover should be able to efficiently extract the proof from the Merkle tree. If this requires non-trivial computation, it defeats the purpose of using a PoS instead of a PoW. \item efficient verification: it should be easy to verify proofs so that catching dishonest provers can be achieved with sufficient probability of success. \end{itemize}
In the following subsections, we describe and discuss several methods for tree construction and proof extraction. The experimental evaluation is conducted in Section~\ref{sec:evaluation}.
\subsection{Tree construction}
The hard-to-pebble graph is only used to compute the leaves of the Merkle tree, and its structure (the edges) can be considered fixed.
In what follows, we will thus abstract it away as an oracle that we can query during the tree construction to get the content of any leaf, and we will focus on the actual tree construction. It follows from the definition of a Merkle tree (a node contains the hash of its children) that constructing it requires traversing its nodes in an order where the parent of a node is visited after the node itself. Two orders naturally come to mind: breadth-first order and depth-first post-order. Figure~\ref{fig:orders} illustrates these two orders for a tree of height 3.
\begin{figure} \begin{center} \includegraphics[width=0.49\textwidth]{bfs.pdf} \includegraphics[width=0.49\textwidth]{dfs.pdf} \end{center} \caption{Two traversal orders for the Merkle tree construction: breadth-first on the left, and depth-first post-order on the right. In both cases, the nodes are traversed according to their label and written to disk in that order.} \label{fig:orders} \end{figure}
\paragraph{Breadth-first order.} In this order, the tree is written to disk layer by layer, starting from the leaves. Since an entire layer does not fit in memory, the algorithm has to go back and read what has already been written to compute the hashes of the internal nodes: for example, in Figure~\ref{fig:orders}, when reaching node 10, the algorithm has to read from disk the content of nodes 3 and 4 to compute the hash of their concatenation.
\paragraph{Depth-first order.} A naive implementation of this order would also require going back and reading from disk as internal nodes are constructed. However, we note that by keeping track of the right information in memory, the tree can be constructed in one pass without ever reading back: the algorithm only performs writes. Specifically, we maintain a stack of hashes: whenever a subtree is completed, the hash of its root is pushed onto the stack.
By doing so, whenever we reach an internal node, the top two elements on the stack are the hashes of its children: we pop both of them and push the hash of their concatenation. Let us illustrate this with the right tree in Figure~\ref{fig:orders}: when reaching node 13, the subtrees rooted at 7, 10, 11 and 12 have already been completed, so the stack contains (top on the right): $[7, 10, 11, 12]$. The algorithm then pops $11$ and $12$ and replaces them with the hash of their concatenation; the result is also written to disk at offset $13$. The stack now contains $[7, 10, 13]$. We can now compute and write to disk node $14$ by hashing the top two elements of the stack. The stack now contains $[7, 14]$ and we can finally compute the root $15$. Pseudo-code for this construction can be found in Algorithm~\ref{alg:df}. We present it as a recursive function which computes the hash of the root of any subtree. The full Merkle tree can be obtained by calling this function on the root of the tree. Note that in this recursive presentation, the hash stack is implicit; it is automatically maintained by the recursive call stack. For performance reasons, as we will discuss below, the actual implementation is non-recursive and uses an explicit stack.
\begin{algorithm} \caption{Depth-first construction of a subtree of the Merkle tree} \label{alg:df} \begin{algorithmic}[1] \Require $i$ index of the root of the subtree \Ensure the return value $h$ is the hash of the root of the subtree \If{$i$ is a leaf} \State $h\gets$ result of the leaf oracle for $i$ \Else \State $i_1,i_2\gets$ labels of the left and right children of $i$ \State $h_{1}\gets$ recursively call the algorithm for $i_1$ \State $h_{2}\gets$ recursively call the algorithm for $i_2$ \State $h\gets\textsf{hash}(h_{1}, h_2)$ \EndIf \State write $h$ to disk at location $i$ \State\textbf{return} $h$ \end{algorithmic} \end{algorithm}
It is important to note that the size of the stack is proportional to the height of the Merkle tree, \emph{i.e.} it is logarithmic in the data size. For a tree of height 35 (that is, a data size of 1\,TB) the stack size is 1.1\,KB and easily fits in memory (and even in the L1 cache).
\paragraph{Parallelization.} Both constructions are easily parallelizable. For the breadth-first order, the parallelization occurs level by level: each level is split into contiguous chunks of nodes, and each chunk is sent to a different processing element. For the depth-first order, the parallelization occurs subtree by subtree: all subtrees of a given height can be computed in parallel (each processing element being in charge of computing its own subtree).
\subsection{Proof extraction}
Regardless of the order chosen for the layout of the tree on disk, extracting the proof for a given leaf can be done in a similar manner: we traverse the path from the leaf to the root and, for each node along the path, read the content of its sibling and append it to the proof. We note that from this perspective, the depth-first layout appears to have better data locality: right children are co-located with their parents; this never happens in breadth-first order.
\paragraph{Batch extraction.} An important aspect of proof of space systems is that during the proof phase, the verifier sends multiple challenges to the prover. The number of challenges depends on the security level (probability of failure) required by the application, but as discussed in \cite{spacemint}, for the Spacemint application a typical setting would require answering about 2000 challenges for a data size of 1\,TB. It is then natural to ask whether processing challenges in batches is better from an IO perspective. More precisely, the proofs for two or more leaves might have nodes in common; these nodes only need to be read from disk once instead of once per leaf. This is similar to a shared table scan serving multiple select queries in traditional database systems. In the evaluation section, we will compare three strategies for multiple proof extraction: \begin{itemize} \item \emph{naive:} the proofs are extracted one by one in a sequential manner, following the order in which the challenges are received. \item \emph{sorted:} the challenges are first sorted in left-to-right order at the leaf level. The reason for doing so is that the closer two leaves are, the more nodes their proofs share. By processing the challenges in left-to-right order, we hope to obtain better cache behavior: nodes that are shared are likely to still be in the cache when accessed for the second time. \item \emph{explicit sharing:} here we share reads from disk in an explicit manner. For this, the breadth-first order is easier: we read the tree level by level from the bottom to the top, keeping track of which nodes to read at each level; if two nodes from the previous level require reading the same node at the current level, we only read it once from disk and append it to both proofs.
\end{itemize}
\section{Evaluation} \label{sec:evaluation}
It follows from the discussion in Section~\ref{sec:design} that the depth-first post-order (DF) layout of the Merkle tree has multiple advantages over the breadth-first (BF) order: \begin{itemize} \item for tree construction in BF order, the whole data needs to be brought back from disk to memory during the construction. In contrast, the DF order leads to a write-only tree construction. \item for proof extraction, the DF order has better data locality: some nodes are co-located with their parents. \end{itemize} The only reason why the BF order might be preferable is if explicit sharing for batch extraction results in a significant performance gain (number of challenges answered per second). The goal of this section is to confirm and quantify these effects.
\paragraph{Implementation.} My code can be obtained by cloning the git repository at \url{http://projects.horel.org/thibaut/pos}. The implementation is in Go so that it can later be integrated with an existing prototype implementation of the Spacemint cryptocurrency available at \url{https://github.com/kwonalbert/spacemint/}. The two implementations currently share close to zero code; I re-implemented some parts of the original code as needed to be able to more easily isolate and benchmark specific effects. The only part which was re-used without any change is the construction of the hard-to-pebble graph, which falls outside the scope of this report.
\paragraph{Experimental setup.} All evaluations were run on my personal laptop with an Intel Core i7-6500U CPU @ 2.5\,GHz and 12\,GB of RAM. The hard drive was a 512\,GB SSD.
\subsection{Tree construction}
In our first experiment, we compare the DF and BF tree constructions for different tree heights. The results can be found in Figure~\ref{fig:const}. As can be seen, the DF order significantly outperforms the BF order, with close to a 25\% running time improvement for the largest tested tree height.
This improvement seems to be mostly explained by cache misses, which confirms our hypothesis that DF is better because it performs fewer reads (since the data is read from disk, a read directly translates into a cache miss). When running this experiment with a single-threaded program, it can be observed that the disk read/write rate is far from maximum capacity while the CPU utilization reaches 100\%. This indicates that parallelization would also result in performance gains. A parallel implementation can be found in the code repository. Due to time constraints, detailed results are not reported here, but preliminary results indicate a significant speedup (linear in the degree of parallelization up to 8 threads). We leave further investigation of the parallelization behavior for future work.
\begin{figure} \begin{center} \includegraphics[width=0.49\textwidth]{time.pdf} \includegraphics[width=0.49\textwidth]{misses.pdf} \vspace{-2em} \end{center} \caption{Running time and cache misses for the DF and BF tree constructions for varying tree heights.} \label{fig:const} \end{figure}
\subsection{Proof extraction}
We first compare the DF and BF orders for the purpose of proof extraction. For this, we generate varying numbers of random challenges (leaf indices) and process these challenges sequentially for both the BF and DF layouts. The results can be found in Figure~\ref{fig:proofs}. We observe two trends: \begin{itemize} \item the per-proof running time decreases as the number of proofs extracted increases. We suspect that this is because as the number of proofs increases, the number of nodes which are shared by proofs also increases. \item proof extraction is faster for the DF order. We suspect that this is because of the better data locality explained in Section~\ref{sec:design}. Further investigation of page misses is deferred to future work.
\end{itemize}
\begin{figure} \begin{center} \includegraphics[width=0.7\textwidth]{proofs} \vspace{-2em} \end{center} \caption{Per-proof extraction time for the DF and BF orders for a tree of height 31.} \label{fig:proofs} \end{figure}
For our final experiment, we compare the different batch proof extraction strategies discussed in Section~\ref{sec:design}. The results can be found in Figure~\ref{fig:proofs2}. Contrary to earlier experiments (which were misleading due to a hot page cache), the best strategy seems to be the explicit sharing of node retrievals, except for very large numbers of proofs.
\begin{figure} \begin{center} \includegraphics[width=0.7\textwidth]{proofs2.pdf} \vspace{-2em} \end{center} \caption{Per-proof extraction time for the different batch extraction strategies for a tree of height 31.} \label{fig:proofs2} \end{figure}
\section*{Conclusion}
We studied Proofs of Space from a data systems perspective: proofs of space are based on an interactive protocol of challenges and answers which is similar to the query/answer interaction of traditional data systems. Hence, the same questions naturally arise: how to design the data layout and access methods to make the system efficient. We discussed, compared and experimentally evaluated different strategies, both for the proof of space construction and for proof retrieval. As a preliminary conclusion, we find that significant performance improvements can be obtained by: (1) using depth-first order instead of breadth-first order, and (2) retrieving proofs in batches and exploiting shared nodes in the proofs to minimize disk IO. However, we are far from having a complete understanding of the different trade-offs: \begin{itemize} \item a more careful investigation of our current experiments needs to be conducted. Some effects have not been completely explained yet, and monitoring specific metrics (page misses, branch misses, etc.) would provide a finer-grained explanation.
Some experiments should be re-run with other parameter settings (different tree heights, a hard disk drive as opposed to an SSD, etc.). Finally, the virtualisation of disk accesses happening at the OS level makes some effects harder to measure; performing pure in-memory experiments would also help in refining our conclusions. \item more complex data layouts could be explored. A natural one which comes to mind is the one used in ``Fractal tree indexes'' \cite{fractal}, which forces the co-location of nodes with their parents. \item we exploited shared nodes between proofs to improve the performance of proof retrieval, but they could also be exploited to reduce the space taken by proofs (in Spacemint these proofs are broadcast to the network, so compressing them would reduce the network load) and to make proof verification more efficient. \end{itemize}
\paragraph{Acknowledgments.} I would like to thank Sunoo Park for introducing me to the notion of Proofs of Space and the Spacemint application, Albert Kwon for a helpful discussion on the prototype implementation of Spacemint, Professor Stratos Idreos for his guidance in turning my vague ideas into a project for his CS265 class at Harvard, and CS265 TA Manos Athanassoulis for his help with the experimental evaluation.
\bibliography{main}
\bibliographystyle{abbrv}
\end{document}