Caching proofs

To save bandwidth and processing power during IBD, we can cache UTXOs created by recent blocks. Since bitcoin UTXOs are more likely to be spent soon after they are created, caching them for a short period of time is a simple optimization that can save significant time and bandwidth during IBD.

One problem with caching is how bridge nodes can provide only the positions required to prove a few UTXOs. If we allow each client to pick how much it wants to cache, the bridge node would need to filter every block proof down to only the UTXOs that the client doesn’t have in its cache. This is a problem because the bridge node would have to process the block proof once per client, which would be very resource intensive.

The algorithm proposed here allows clients to request only the positions they want, without imposing a heavy load on the bridge node. This is possible because the client only requests a byte-slice of the block proof, and the bridge node can just read that range from disk and write it to the client, without needing to process the whole block proof.

Algorithm

To make this work, we need to sort the proof and positions in such a way that if we byte-slice the proof, we still get a valid proof for the positions requested. We also need a collection of u64’s called proof_positions. First, sort the targets from oldest to newest, based on how many blocks have passed since each was created; if two targets were created in the same block, order them by their current leaf position, smallest first (see the sketch below).
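In code, the ordering might look like this (the Target struct and its field names are hypothetical, only to make the sort rule concrete):

```rust
// Hypothetical target type, just to make the ordering concrete; a real
// implementation would carry this information differently.
struct Target {
    creation_height: u32, // height of the block that created this UTXO
    position: u64,        // current leaf position in the forest
}

// Oldest first (lowest creation height); ties broken by smallest position.
fn sort_targets(targets: &mut [Target]) {
    targets.sort_by_key(|t| (t.creation_height, t.position));
}
```

With the targets sorted, the position computation proceeds as follows.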

  • for target in proof_targets:
    1. current_node := target
    2. push current_node’s sibling to proof_positions
    3. current_node := parent(current_node)
    4. if current_node is a root
      • break
    5. if current_node OR its sibling is already in proof_positions
      • break
    6. goto 2

At the end, take the hash of every node in proof_positions and store the hashes in that exact order.
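Here is a minimal sketch of this position computation in Rust, assuming a single perfect tree with rows rows and the usual utreexo position arithmetic, where sibling(p) = p ^ 1 and parent(p) = (p >> 1) | (1 << rows); a real forest has multiple roots, so this is illustrative only:

```rust
/// Compute the proof positions for `targets` (pre-sorted oldest-first) in a
/// single perfect tree with `rows` rows.
fn proof_positions(targets: &[u64], rows: u32) -> Vec<u64> {
    let root = (1u64 << (rows + 1)) - 2; // e.g. 14 when rows = 3
    let mut proof_positions: Vec<u64> = Vec::new();

    for &target in targets {
        let mut current_node = target;
        loop {
            // push the current node's sibling
            proof_positions.push(current_node ^ 1);
            // move up to the parent
            current_node = (current_node >> 1) | (1 << rows);
            // stop at a root
            if current_node == root {
                break;
            }
            // stop if this subtree is already covered by an earlier target
            if proof_positions.contains(&current_node)
                || proof_positions.contains(&(current_node ^ 1))
            {
                break;
            }
        }
    }
    proof_positions
}

fn main() {
    // The example tree below: 8 leaves, rows = 3, targets 01 and 10.
    assert_eq!(proof_positions(&[1, 10], 3), vec![0, 9, 13, 11]);
}
```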

To verify a proof, do the following:

  • for target in proof_targets:
    1. current_node := target; record its hash
    2. pop the first proof element and record its hash at current_node’s sibling’s position
    3. use current_node’s hash (the target’s hash on the first iteration) and the element from (2) to compute the parent hash for those two nodes, left child first
    4. compute the parent position for current_node
    5. if the position from (4) is a root
      • save the hash from (3) in a list of computed roots and break
    6. if the position from (4) was already visited
      • check that the hash from (3) matches the hash recorded there
        • fail if not, otherwise break
    7. record the hash from (3) at the position from (4), make that position current_node, and goto 2

Finally, check that all the roots we computed are in the accumulator. Fail if not.
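A matching verification sketch, under the same single-tree assumption as above. The parent_hash below is a toy, non-cryptographic stand-in so the example runs without dependencies; the real accumulator uses a proper cryptographic parent hash:

```rust
use std::collections::HashMap;

type Hash = [u8; 32];

// Toy, NON-cryptographic combiner so this sketch runs stand-alone.
fn parent_hash(left: &Hash, right: &Hash) -> Hash {
    let mut out = [0u8; 32];
    for i in 0..32 {
        out[i] = left[i].rotate_left(3) ^ right[i].wrapping_mul(0x9d);
    }
    out
}

/// Verify `targets` as (position, hash) pairs, pre-sorted oldest-first,
/// against `proof` (hashes in proof_positions order), in a single perfect
/// tree with `rows` rows. Returns the computed roots on success.
fn verify(targets: &[(u64, Hash)], proof: &[Hash], rows: u32) -> Result<Vec<Hash>, String> {
    let root = (1u64 << (rows + 1)) - 2;
    let mut proof = proof.iter();
    let mut nodes: HashMap<u64, Hash> = HashMap::new(); // hashes seen so far
    let mut roots = Vec::new();

    for &(pos, hash) in targets {
        let mut current_node = pos;
        let mut current_hash = hash;
        nodes.insert(current_node, current_hash);
        loop {
            // pop the next proof element: the current node's sibling
            let sibling = *proof.next().ok_or("proof too short")?;
            nodes.insert(current_node ^ 1, sibling);
            // hash the pair together, left child first
            let parent = if current_node & 1 == 0 {
                parent_hash(&current_hash, &sibling)
            } else {
                parent_hash(&sibling, &current_hash)
            };
            // compute the parent position
            let parent_pos = (current_node >> 1) | (1 << rows);
            // reached a root: remember it and move to the next target
            if parent_pos == root {
                roots.push(parent);
                break;
            }
            // already visited: the recorded hash must match
            if let Some(prev) = nodes.get(&parent_pos) {
                if *prev != parent {
                    return Err("computed hash mismatch".to_string());
                }
                break;
            }
            // record the parent and keep climbing
            nodes.insert(parent_pos, parent);
            current_node = parent_pos;
            current_hash = parent;
        }
    }
    Ok(roots)
}
```

The caller then performs the final step itself: every root returned by verify must appear in its accumulator.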

Since the first algorithm only computes positions, a CSN can run it locally and see exactly which positions it needs.

Example

Take the following tree as an example:

14
|---------------\
12              13
|-------\       |-------\
08      09      10      11
|---\   |---\   |---\   |---\
00  01  02  03  04  05  06  07

Say we need to prove 01 and 10. In this case it’s evident that 01 is older than 10, but in a real case the first step is to sort the targets as described above. The nodes visited while proving those leaves (including computed positions) are

01 = [00, 08, 09, 12, 13, 14]
10 = [11, 12, 13, 14]

Let’s run the algorithm for 01:

Current node=01, proof=[]       # push 01's sibling, make current_node=parent(01)
Current node=08, proof=[00]     # push 08's sibling, make current_node=parent(08)
Current node=12, proof=[00, 09] # push 12's sibling, make current_node=parent(12)
Current node=14, proof=[00, 09, 13] # 14 is a root, break

We then reach a root and break. The proof so far is [00, 09, 13]. Now for 10:

current node=10, proof=[00, 09, 13] # push 10's sibling, make current_node=parent(10)
current node=13, proof=[00, 09, 13, 11]    # 13 is already in the proof, break

So the final proof is [00, 09, 13, 11].

If a node wishes to verify this proof, it works upward, from the oldest target to the newest, always popping the next element in the proof. We need to keep the computed positions and hashes around, as we may need them where proof elements were omitted.

The first target is, again, 01. So we take the hashes in the exact same order as the slice above, and do:

proof=[00, 09, 13, 11], current_proof_element=00, current_node=01, hash=parent_hash(00, 01), next_node=parent(01), nodes=[00, 01, 08]
proof=[09, 13, 11], current_proof_element=09, current_node=08, hash=parent_hash(08, 09), next_node=parent(08), nodes=[00, 01, 08, 09, 12]
proof=[13, 11], current_proof_element=13, current_node=12, hash=parent_hash(12, 13), next_node=parent(12), nodes=[00, 01, 08, 09, 12, 13]
    parent(12) is a root, so we store the computed root value and skip to the next target
proof=[11], current_proof_element=11, current_node=10, hash=parent_hash(10, 11), next_node=parent(10), nodes=[00, 01, 08, 09, 12, 13, 10, 11]
    parent(10), position 13, was already visited. So we check that hash == nodes[13] and break

Now check that every root we reached (14 in this case) is equal to one of the roots in our accumulator.

Byte-slicing

The advantage here is that if we only need a proof for 01 (maybe we have 10 cached), we can ask for the byte range [0, 32 * 3), that is, the bytes from the start up to the end of 13's hash, assuming 32-byte hashes. This requires no further processing by the bridge node, as it can simply copy those bytes from disk and send them to the client. And since this whole scheme only uses positions, CSNs can pre-calculate the required positions given their cache and request the corresponding byte range accordingly.
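A client-side sketch of that calculation, assuming 32-byte hashes and that the elements a client still needs form a prefix of proof_positions (which is what the old-to-new ordering aims for); request_range and needed are illustrative names:

```rust
/// Byte range of the block proof to request, given the full proof_positions
/// (computable locally from the target positions alone) and the subset this
/// client still needs.
fn request_range(proof_positions: &[u64], needed: &[u64]) -> (u64, u64) {
    let count = proof_positions
        .iter()
        .rposition(|p| needed.contains(p)) // index of the last needed element
        .map_or(0, |i| i as u64 + 1);
    (0, count * 32)
}

fn main() {
    // From the example: the full proof is [00, 09, 13, 11]; to prove only 01
    // we need [00, 09, 13], i.e. the byte range [0, 96) = [0, 32 * 3).
    assert_eq!(request_range(&[0, 9, 13, 11], &[0, 9, 13]), (0, 96));
}
```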