Indexed Merkle Tree (Nullifier Tree)
Overview
This article will introduce the concept of an indexed merkle tree, and how it can be used to improve the performance of nullifier trees in circuits.
This page will answer:
- Why we need nullifier trees at all
- How indexed merkle trees work
- How they can be used for membership exclusion proofs
- How they can leverage batch insertions
- Tradeoffs of using indexed merkle trees
The content was also covered in a presentation for the Privacy + Scaling Explorations team at the Ethereum Foundation.
Primer on Nullifier Trees
Currently the only feasible way to get privacy in public blockchains is via a UTXO model. In this model, state is stored in encrypted UTXO's in merkle trees. However, to maintain privacy, state cannot be updated. The very act of performing an update leaks information. In order to simulate "updating" the state, we "destroy" old UTXO's and create new ones for each state update. Resulting in a merkle tree that is append-only.
A classic merkle tree:
To destroy state, the concept of a "nullifier" tree is introduced. Typically represented as a sparse Merkle Tree, the structure is utilized to store notes deterministically linked to values in the append-only tree of encrypted notes.
A sparse merkle tree (not every leaf stores a value):
In order to spend / modify a note in the private state tree, one must create a nullifier for it, and prove that the nullifier does not already exist in the nullifier tree. As nullifier trees are modeled as sparse merkle trees, non membership checks are (conceptually) trivial.
Data is stored at the leaf index corresponding to its value. E.g. if I have a sparse tree that can contain values and want to prove non membership of the value . I can prove via a merkle membership proof that , conversely if I can prove that I can prove that the item exists.
Problems introduced by using Sparse Merkle Trees for Nullifier Trees
While sparse Merkle Trees offer a simple and elegant solution, their implementation comes with some drawbacks. A sparse nullifier tree must have an index for , which for the bn254 curve means that the sparse tree will need to have a depth of 254. If you're familiar with hashing in circuits the alarm bells have probably started ringing. A tree of depth 254 means 254 hashes per membership proof. In routine nullifier insertions, a non membership check for a value is performed, then an insertion of said value. This amounts to two trips from leaf to root, hashing all the way up. This means that there are hashes per tree insertion. As the tree is sparse, insertions are random and must be performed in sequence. This means the number of hashes performed in the circuit scales linearly with the number of nullifiers being inserted. As a consequence number of constraints in a rollup circuit (where these checks are performed) will sky rocket, leading to long rollup proving times.
Indexed Merkle Tree Constructions
As it turns out, we can do better. This paper (page 6) introduces the idea of an indexed merkle tree. A Merkle Tree that allows us to perform efficient non-membership proofs. It achieves this by extending each node to include a specialized data structure. Each node not only stores some value but also some pointers to the leaf with the next higher value. The data structure is as follows:
Based on the tree's insertion rules, we can assume that there are no leaves in the tree that exist between the range .
More simply put, the merkle tree pretty much becomes a linked list of increasing size, where once inserted the pointers at a leaf can change, but the nullifier value cannot.
Despite being a minor modification, the performance implications are massive. We no longer position leaves in place therefore we no longer need a deep tree, rather we can use an arbitrarily small tree (32 levels should suffice). Some quick back of the napkin maths can show that insertions can be improved by a factor of 8 .
For the remainder of this article I will refer to the node that provides the non membership check as a "low nullifier".
The insertion protocol is described below:
-
Look for a nullifier's corresponding low_nullifier where:
if is the largest use the leaf: