Yesterday’s record-breaking 128MB Bitcoin block mined on mainnet was no doubt a big event, but it did create some controversy. Some people think that storing data on the blockchain isn’t a good idea and argue that there are much better ways to store data. They think that storing just a hash of the data on the blockchain is a better idea, but isn’t that redundant? Let’s explore this a bit further.

So what is the alternative? Storing data locally, only to be accessed by that device? Storing data with a cloud storage provider, gaining portability but trusting them for security? Storing your data with a cloud storage provider and then storing a hash of it on the blockchain to ensure integrity? Why not just store the whole thing directly on the blockchain?

“Why should the miners store your data forever after only one initial payment?”

I agree, why should they? We are still in the early stages of Bitcoin, where the system is not yet fully mature, and for now miners do store the data. Technically, however, they don’t need to. After your data has been mined into the blockchain, it most likely sits in an OP_RETURN output, which can never be spent. In other words, your data is pretty much just sitting there and is not needed to validate later transactions or to create another block, so it can technically be deleted without affecting the mining business.
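To make the OP_RETURN point concrete, here is a minimal sketch in Python of how such a data-carrying script could be built and recognised. The opcode values are the standard Bitcoin script ones; the function names (`push_data`, `data_carrier_script`, `is_prunable`) are made up for illustration, and real node software uses its own pruning rules.

```python
# Standard Bitcoin script opcode values.
OP_RETURN = 0x6A
OP_PUSHDATA1 = 0x4C
OP_PUSHDATA2 = 0x4D
OP_PUSHDATA4 = 0x4E


def push_data(data: bytes) -> bytes:
    """Encode a script push of `data`, using the smallest push opcode that fits."""
    n = len(data)
    if n < OP_PUSHDATA1:
        # Lengths 0..75 are encoded directly as a single length byte.
        return bytes([n]) + data
    if n <= 0xFF:
        return bytes([OP_PUSHDATA1, n]) + data
    if n <= 0xFFFF:
        return bytes([OP_PUSHDATA2]) + n.to_bytes(2, "little") + data
    return bytes([OP_PUSHDATA4]) + n.to_bytes(4, "little") + data


def data_carrier_script(data: bytes) -> bytes:
    """Build an output script of the form: OP_RETURN <data>."""
    return bytes([OP_RETURN]) + push_data(data)


def is_prunable(script: bytes) -> bool:
    """Hypothetical check: a script starting with OP_RETURN can never be spent,
    so a miner does not need to keep it to validate later transactions."""
    return script[:1] == bytes([OP_RETURN])


script = data_carrier_script(b"hello blockchain")
print(script.hex(), is_prunable(script))
```

Because the script fails as soon as OP_RETURN executes, no later transaction can ever spend this output, which is why dropping it does not affect validation.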

Here is where archive nodes, or cloud nodes as I like to call them, come in. Cloud nodes will store a full copy of the blockchain (or as much of it as they are paid to store) and get paid to do that. Even better, they will index and organise the data to make retrieval much faster and more efficient. Basically, the same thing cloud storage providers do today. The only difference is that security is much simpler for cloud nodes than for cloud storage providers (which have been hacked multiple times in the past). All they really need is a backup of the data so it does not get lost. The rest of the security is outsourced to the Bitcoin miners and the security of Proof of Work. Even the timing of the data creation is preserved by the PoW security. The data stored on the cloud node cannot be altered without detection, because as soon as anything is changed, the corresponding Bitcoin Merkle proofs will no longer be valid (in other words, the altered data no longer matches what is on the blockchain).
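The Merkle proof argument can be sketched in a few lines of Python. This is a simplified illustration, not a full SPV client: it uses double SHA-256 and duplicates the last hash on odd-sized levels, as Bitcoin does, but it treats each transaction as an opaque byte string and ignores Bitcoin's byte-order conventions for displaying hashes. The function names are made up for this example.

```python
import hashlib


def sha256d(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin uses for transaction IDs and Merkle nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root_and_proof(txs, index):
    """Return the Merkle root of `txs` and the proof for the transaction at `index`.

    The proof is a list of (sibling_hash, sibling_is_left) pairs, bottom to top.
    """
    level = [sha256d(tx) for tx in txs]
    proof = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd-sized levels
        sibling = index ^ 1  # the neighbour within the same pair
        proof.append((level[sibling], sibling < index))
        level = [sha256d(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(tx, proof, root):
    """Recompute the path from `tx` up to the root and compare."""
    h = sha256d(tx)
    for sibling, sibling_is_left in proof:
        h = sha256d(sibling + h) if sibling_is_left else sha256d(h + sibling)
    return h == root


txs = [b"tx with my data", b"another tx", b"a third tx"]
root, proof = merkle_root_and_proof(txs, 0)

print(verify(b"tx with my data", proof, root))  # original data checks out
print(verify(b"tx with MY data", proof, root))  # altered data fails the proof
```

The root is what ends up in the block header secured by Proof of Work, so a cloud node only needs to hand over the data plus the proof; anyone holding the block headers can check it independently.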

With this in mind, the payment made to the miners is a payment to secure the data, not to store it. Then, a (periodic) payment is required for a cloud node to store this data. This does not prevent miners from still storing the data and providing an extra service in a different market. Where are the cloud storage providers to take advantage of this business opportunity? Google, Amazon, Microsoft, etc.?

Back to the controversial claim that it is a better idea to just store a hash of the data on the blockchain. Given the reasoning above, it is arguably more expensive to store only a hash on the blockchain, since you have to pay to store the data somewhere else as well as pay for the hash on chain, instead of paying for just the data itself. Plus, how can other people make use of the data stored on the blockchain if there is only a hash and not the data itself?
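A small Python sketch of the two approaches shows the difference. It only counts bytes, not prices (fees and storage rates are deliberately left out, since I don't want to guess at them), and the function names are made up for illustration.

```python
import hashlib


def hash_only(data: bytes) -> dict:
    """Hash on chain, data kept somewhere else: both have to be stored and paid for."""
    digest = hashlib.sha256(data).digest()
    return {"on_chain": len(digest), "off_chain": len(data), "commitment": digest}


def data_on_chain(data: bytes) -> dict:
    """Data on chain: a single copy, and its integrity comes with the block."""
    return {"on_chain": len(data), "off_chain": 0}


def check_integrity(data: bytes, commitment: bytes) -> bool:
    """A hash can confirm data you already hold; it cannot give you the data."""
    return hashlib.sha256(data).digest() == commitment


document = b"some document worth keeping" * 100
a = hash_only(document)
b = data_on_chain(document)

print(a["on_chain"] + a["off_chain"], b["on_chain"] + b["off_chain"])
print(check_integrity(document, a["commitment"]))
```

Note that `check_integrity` needs the full document as input. Someone who only sees the 32-byte hash on the blockchain has nothing to work with until a third party hands them the data.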