PoW Fraud Proofs and improving Light Clients’ Security
Table of Contents
This post is roughly based on a talk by Ruben Somsen at the BTC Azores meetup. The talk was about ways to increase the security of light clients by using techniques like PoW fraud proofs. I found this subject very interesting and decided to write about it. I think it’ll be a nice fit in Floresta, since it can provide better security than SPV clients, while still being very lightweight.
The ideas are not new, some examples are this mailing list post by Ruben and his softchains proposal, but I don’t think there’s a real working implementation of it. I’ll try to explain the ideas in a simple way, and provide some examples of how it could be implemented in Floresta.
Bitcoin offers multiple levels of security. Full node security is the ideal scenario, but it’s not always possible. Full nodes are expensive to run and require a lot of storage. SPV clients, on the other hand, are the most common alternative, as you need to download just a small amount of data and perform negligible amount of computation. However, SPV clients’s security relies on two premises:
- You are not being targeted by a Sybil attack - The longest chain is the valid one (i.e the majority of the hash power is honest)
The first one isn’t exclusive to SPV clients, it affects full nodes as well; the second one is the main problem. If the majority of the hash power is controlled by an attacker, he can rewrite the history of the blockchain or even trick SPV clients into accepting invalid transactions, which won’t be accepted by full nodes. In theory, Sybil Attack a SPV client should be easier, and likely more profitable, since you can violate the rules without the SPV node knowing anything about it. 51% attacking a full node only allows you to “cancel” transactions by creating a double-spending paying to yourself, and reorging the chain to remove the original transaction. This is a very expensive attack, and it’s not clear if it’s profitable. On the other hand, 51% attacking a SPV client allows you to claim outputs that doesn’t belong to you (or even create new coins out of thin air), which is a much more profitable attack.
Relying on a simple majority is dangerous, specially since you’re blind to what’s happening onchain. If the majority wants to trick SPVs into accepting, e.g inflation, there’s no way SPVs can detect it. Since most wallets would accept transactions as final after 6 confirmations, holding the majority of the hash power for 6 blocks would be enough to trick SPVs into accepting invalid transactions.
But what if we could increase the threshold? What if we could make it so that the simple majority of the hash power is not enough to trick SPVs? This is where PoW fraud proofs come in.
The idea is simple: assuming that at least
n > 0 miners are honest, if we observe the last 100 blocks, if one of them is invalid, there should be a fork with at least ~
n blocks. If we find such block, we download both blocks and find which one is valid (if any). With this model, we increase the security of light clients to
n% of the hash power, where
n may be smaller than 50%. Assuming something like 100% of the hash power is colluding would be an existential threat to Bitcoin, no blocks would be produced and the network would be unusable. So,
n should be a reasonable number, like 10% or 20%. 90% of the hash power is a much more difficult attack to pull off than 51%. Moreover, the attack would need to be unrolled in a short period of time, otherwise the attacker would need to keep the fork alive for a long time.
Here’s a small example of how it would work:
Let’s first assume the following scenario:
- The attacker controls 90% of the hash power - The honest miners control 10% of the hash power - The attacker wants to trick SPV clients into accepting an invalid transaction - The light client sets its security threshold to 10%
A light client using this idea would observe the latest 10 blocks, and see if there’s at least a 1-block fork. Now let’s say that block 3 has one such fork.
The light client would download both blocks 3a and 3b, and check which one is valid. In this case, block 3b is valid, so the light client would reject block 3a and continue to download the chain from block 3b. Even though 3a has more PoW. This is frontally different from SPV clients, which would accept 3a, since it has more PoW, regardless of its validity.
Initial Block Download
Initial block download is the process of downloading and validating the current state of the network for a first time (or after some time offline). It’s an inherently slow process, as you need to download and validate every single block. It’s also a very important process, as it’s the only way to be sure that you have the correct state of the network. While we can optimize it in some ways, it would always be intractable for low power devices.
But what if we could skip it? What if we could download only the latest block and be sure that it’s valid? One way would be committing to the UTXO set in the block header (hard-fork) or in the coinbase transaction (soft-fork). A light client, willing to IBD would download the chain of headers, and look for interesting forks. If it finds one, it would validate the fork and find which chain is valid - if both are valid, just pick the one with more PoW. If the fork is invalid, it would rewind the UTXO set to the fork point and continue from there. This would be a very fast process, as you would only need to download the headers, and a few blocks to validate forks. This would make IBD much more tractable for low power devices.
Of course you have the problem of committing to the UTXO set, which is not a very intrusive change, but won’t work for blocks in the past. If we can build on the assumption of “you’re not Sybil”, we can then download the Utreexo accumulator from a full node, and look for divergences. If they all agree (again, you’re not Sybil), this must be the state for that block.
One might wonder, how would this be achieved? Well, there’s few tricks here. We need a way to rewind the UTXO set a given height, and we need to somehow find information about all forks, even invalid ones. The last one may seem easy, but if I’m not mistaken, Bitcoin Core won’t relay invalid headers, so we would need an overlay network to do that.
Rewinding the UTXO set might be tricky, since (at the time of writing) it is a database of over 7GB. Luckily, there’s an easier way to do that with Utreexo. With utreexo, the UTXO set can be compressed down to a few kilobytes, and it can be easily rewinded.
A light client may not have the Utreexo accumulator for the fork point, in this case you need to ask a full node for it. But how do you know if the full node is honest? Well, you can ask multiple full nodes, and if they all agree, you can be pretty sure about this state. If they don’t agree, you ask for an older block, until you find one that they agree on, then you can rewind the UTXO set and check from there. You’re guaranteed to have at least one honest peer (remember, you’re not Sybil). An open problem here is how to prevent an attacker from actively lying, forcing light nodes to go further and further back into the chain, just to DoS them.
If the light client is fully synced, it may be feasible to update the accumulator as blocks come, but not validate anything. Then, if a fork is detected, it can validate the fork and update the accumulator accordingly. This would be a nice tradeoff between security and performance. If a user is running such a client in their phone, this could save a lot of battery and bandwidth. If we detect a transaction that we’re interested in, we can go back to the full validation mode and validate the transaction and the block that contains it (as well as the previous blocks, if we don’t have them). Since end users usually don’t get many transactions, we’ld be in PoW fraud proof mode most of the time. A really paranoic implementation could postpone validation to when the user plugs the phone to charge, or when it’s connected to WiFi.
If we build a compact utreexo proof, composed of one Bach proof and the leaf hashes for inputs and outputs, the proof size for a block with
n inputs and
m outputs is
32*(n + m) + len(proof). We can reduce the size of the proof by constructing a proof over multiple blocks, which increases the overlap between proofs, reducing overall size. For a block with 2000 inputs and 2000 outputs we wold have 128000 + len(proof) (TODO calculate proof size), you’ll download this in addition to a BIP158 block filter and the block header. If the filter matches your wallet, you’ll download the block, look for your transaction. If you find a transaction involving your wallet, you validate this block and any ancestor, if any.
Study case: Floresta
I’ve been working on a project called Floresta, which is a Bitcoin full node written in Rust. It’s still in early stages, but it could be a good candidate for implementing PoW fraud proofs, since it already uses Utreexo, and is meant to be a resource efficient full node. The node is modular enough that it could be easily extended to support PoW fraud proofs, just requiring a few tweaks on the
chainstate structure. An ideal API would allow users to easily switch between full validation and PoW fraud proof mode, and also allow them to set the security threshold (which is basically the number of blocks to wait for a fork).
floresta, all validation and state of our current chain happens inside
chainstate. It gets headers through
accept_header and blocks through
accept_header deals with forks, verifying work, checking if it extends the best chain and so on.
connect_block, on the other hand, performs all block-specific validation, like checking the Merkle Root, validating transactions, updating the UTXO set and so on. To actually pull off a PoW fraud proof, we need to change
accept_header to accept forks, and postpone calls to
chainstate keeps track of blocks in one of five states:
InvalidFork. I think the names are self explanatory.
FullyValid means we validated the full block and everything went well. A
HeadersOnly block means we called
accept_header, and it succeeded, but we didn’t call
Orphan is a block which we don’t have an ancestor for. If a block is not in our main chain, it may be
InFork (if the branch is valid, but has less PoW) or
InvalidFork (if the fork has at least one invalid block). We only consider safe,
FullyValid blocks, and we try to download
HeadersOnly blocks AFAIK, so we can validate them and add them to our chain. In a PoW fraud proof scenario, we need another state,
PoWFraudProof, which means we didn’t validate the block, but we don’t know any fork that might indicate that this blocks is invalid.
If we get a competing block for a
PoWFraudProof block, we then ask for both blocks, and validate each. If both are valid, just take the one with more Pow. During this process, we might need to download an updated accumulator from the network, if our own acc is old.
I haven’t implemented all of this yet, but don’t seems a big diff from what we already have. Most of the work will be contained inside just one module (inside one or two methods). I’ll probably do it in the next few weeks.
As with nodes using BIP157/BIP158 client size filtering, like neutrino, this approach is very privacy friendly. Full nodes cannot tell which transactions you’re interested in, you only ask for usual block filters, and if you find a transaction that matches your wallet, you download the block and validate it. It would be nicer if you download the block from an ephemeral peer, but that’s not a big deal.
PoW fraud proofs are a very interesting idea, and I think they could be a great addition to Bitcoin. They would increase the security of light clients, making it harder for miners to trick them into accepting invalid transactions. More development is needed to make this a reality, but I think it’s worth it.