zksnarks and blockchain scalability

Four fun approaches to blockchain scalability

Disclaimer: I am not a zksnark expert. What follows may be rubbish and any errors are mine. Many thanks to Chris Reitwießner for reviewing this post. I recommend his introductory article on zksnarks. Also thanks to Ariel Gabizon from Zcash for pointing several subtleties I overlooked.

Definition

Let P be a program which takes two inputs. The first input i is meant to be public. The second input w is a witness which can be kept private. We use o for an output of P.

Given a program P, an input i and an output o, if a prover knows a witness w such that o = P(i, w) then the prover can produce a proof where:

(space density) The proof is very short (just a few hundred bytes even for a very large program; e.g. Zcash achieves 288 bytes) (time density) The proof verifies in time linear to the lengths of i and o (zero knowledge) The proof leaks nothing about the execution of P or the contents of w other than the existence of w

Such proofs are called zksnarks.

(As a technical aside, zksnarks are only known for verifier programs P which run in polynomial time to the lengths of i and w. We are also assuming that the security parameter “lambda” is fixed.)

Intuition

I like to think of zksnarks as simultaneously the equivalent of cryptographic hashes and signatures for computation (extending the two cryptographic primitives beyond data). Like hashes, zksnarks “compress” computation into a short and quick to verify fingerprint. Like signatures, zksnarks “attest” for the validity of computation.

Below we show approaches using zksnarks to address blockchain scalability. These are mostly theoretical given today’s state-of-the-art. Verification of zksnarks is cheap but generation of the proofs is the real bottleneck. Research in the cryptography surrounding zksnarks has advanced at a fast pace in recent years. Orders-of-magnitude improvements have made Zcash a reality. It is possible that the approaches outlined below may become practical in a few years.

1) Fast sync

Traditionally synching a blockchain requires bandwidth, storage and computation. For example in Bitcoin most of the computation is in checking signatures. With zksnarks the burden of explicitly verifying transaction signatures one-by-one can be removed. A single zksnark can attest for all the signatures of a block. This means the computation requirements for synching the Bitcoin can be significantly reduced.

To see this let P be the program where:

i is a block hash

is a block hash w is a block

is a block P returns True if w has hash i and all transactions in w have a valid signature, otherwise P returns False

Assuming blocks are bounded in size (e.g. 1 MB) then zksnarks can be generated for P whenever o is True. Such zksnarks cryptographically prove that the signatures are valid, without having to verify (or even download!) the signatures. This is effectively SegWit on steroids. Notice that i and o are constant-sized (256 bits and 1 bit respectively) so proof verification takes constant time.

2) Ultra-light clients

Traditionally a light client that wants to answer some question about a blockchain has to do two things. First it has to download the relevant blockchain data and second do the relevant computations on that data. For example a light client that wants to answer the question “Has this specific crowdfund reached its goal yet?” would download the crowdfund transactions and add up the amounts. With zksnarks a client can be cryptographically shown such facts without downloading data or doing computations on that data. This opens the door for “ultra-light” clients which have close-to-zero bandwidth, storage and computation resources, but nonetheless maintain cryptographic sovereignty.

To see this let P be a light-client query virtual machine where:

i contains both a block hash specifying a blockchain to be queried, and a script (e.g. some web3 JavaScript) specifying a query for that blockchain

contains both a block hash specifying a blockchain to be queried, and a script (e.g. some web3 JavaScript) specifying a query for that blockchain w is the blockchain data required for the query, along with the required parts of the Patricia-Merkle tree for light-client verification

is the blockchain data required for the query, along with the required parts of the Patricia-Merkle tree for light-client verification o is the output of the script

Given a query i and an output o, one can produce a corresponding zksnark attesting for the validity of the query result.

You can imagine a world where dedicated provers provide zksnarks-as-a-service for ultra-light clients. Because generating zksnarks can be expensive, specialised ASICs would speed up the process. Provers would become the new miners where the utility of specialised hardware shifts from providing security to providing scalability, and mining fees shift to “proving fees”.

3) Abstracting away code and storage

Currently on Ethereum the code and storage of smart contracts (which together describe the state) are stored in full on the blockchain. This takes up resources and can be bad for privacy. It turns out zksnarks can condense the state of a smart contract into a single hash, obfuscating both the underlying code and storage.

To see this, let P be an Ethereum light-client verifier where:

i is a triple of hashes. The first hash is a block hash specifying the blockchain in its current state. The second hash h_now fingerprints the state of a smart contract as currently represented in the blockchain. Finally, h_future is the hash of the state of the smart contract at some further time in the future.

is a triple of hashes. The first hash is a block hash specifying the blockchain in its current state. The second hash fingerprints the state of a smart contract as currently represented in the blockchain. Finally, is the hash of the state of the smart contract at some further time in the future. w has the state of the smart contract at time h_now , plus the intermediary transactions that advance state from h_now to h_future , plus the parts of the Patricia-Merkle tree required to do light-client verification of the state

has the state of the smart contract at time , plus the intermediary transactions that advance state from to , plus the parts of the Patricia-Merkle tree required to do light-client verification of the state P returns True if the state corresponding to h_future is valid, otherwise it returns False

zksnarks can be produced for P which prove the validity of state changes of a smart contract without revealing either the code of the smart contract or its storage.

4) Universal state channels

The technique above used to abstract away code and storage from the blockchain (pushing the state offchain, e.g. on IPFS) also allows for batching the state changes that happen between h_now and h_future. This construction is effectively a universal state channel. With the exception of the setup and settlement transactions, all activity happens offchain.

The construction allows for state channels that only store a single hash on the blockchain, and where state changes are enforced on the blockchain with a single zksnark verification.

You can imagine a blockchain (zEthereum?) which only stores hashes of smart contracts, and where the only opcode is for zksnark verification.