vbuterin: I already expect Merkle receipts to be the main contributor to data bloat

The good news is that multihashing is incremental rather than all-or-nothing. It can be enabled for some subset of consensus objects. It can also be “partially” enabled, in the sense that the extra hashes are exposed as data but not used to verify proofs (e.g. Merkle proofs) at the consensus layer.

To illustrate the low-hanging fruit of multihashing, let’s say that the only change we make to the beacon chain is to make the BeaconBlock and BeaconState roots (Keccak256, MiMC)-multihashes, where Keccak256 is the “dominant” hash used for cryptographic witnessing. That is, MiMC is exposed in a few places but never actually used by consensus.
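To make the “dominant” versus “exposed” distinction concrete, here is a minimal sketch. All names in it (`MultihashRoot`, `multihash_root`, `verify_root`) are hypothetical, and the two hash functions are stand-ins: Python’s standard library has neither Keccak256 (SHA3-256 uses different padding) nor MiMC, so SHA3-256 and BLAKE2b are used purely as placeholders.

```python
import hashlib
from dataclasses import dataclass


def dominant_hash(data: bytes) -> bytes:
    # Stand-in for Keccak256 (assumption: SHA3-256 is NOT byte-compatible with it).
    return hashlib.sha3_256(data).digest()


def secondary_hash(data: bytes) -> bytes:
    # Stand-in for MiMC (assumption: a real SNARK-friendly hash would go here).
    return hashlib.blake2b(data, digest_size=32).digest()


@dataclass(frozen=True)
class MultihashRoot:
    dominant: bytes   # used by consensus for verification
    secondary: bytes  # exposed as data only


def multihash_root(serialized: bytes) -> MultihashRoot:
    # Compute both 32-byte roots over the same serialized object.
    return MultihashRoot(dominant_hash(serialized), secondary_hash(serialized))


def verify_root(serialized: bytes, root: MultihashRoot) -> bool:
    # Consensus-layer check: only the dominant hash is consulted.
    return dominant_hash(serialized) == root.dominant
```

In this sketch `verify_root` never reads `root.secondary`, which is the “partially enabled” mode: the second hash travels with the object, but only applications (e.g. SNARK circuits) would rely on it.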

Notice that in the above construction there’s essentially zero data bloat. Specifically, BeaconBlocks are 64 bytes larger (32 bytes extra for parent_root, 32 bytes extra for state_root). The BeaconState is 32 bytes * LATEST_BLOCK_ROOTS_LENGTH = 262 kB larger for latest_block_roots, and likewise batched_block_roots is doubled in size.
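The arithmetic can be checked in a few lines. The value 8192 for LATEST_BLOCK_ROOTS_LENGTH is an assumption here: it is the value implied by the 262 kB figure (262144 / 32), not one stated in this post.

```python
HASH_SIZE = 32  # bytes per extra MiMC root

# Assumed value, inferred from the 262 kB figure above.
LATEST_BLOCK_ROOTS_LENGTH = 8192

# Each BeaconBlock gains one extra root for parent_root and one for state_root.
block_overhead = 2 * HASH_SIZE

# latest_block_roots stores one extra root per slot it tracks.
state_overhead = HASH_SIZE * LATEST_BLOCK_ROOTS_LENGTH

print(block_overhead)   # 64
print(state_overhead)   # 262144
```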

Because MiMC is SNARK-friendly, we allow dApps to efficiently access state with SNARKs. For example, large cross-shard transaction witnesses can be compressed into a SNARK. And being able to prove claims about the BeaconState in zero knowledge may be useful for privacy.