Blockchain is a method of storing data in a distributed fashion. But what if we could store data structured as documents on the blockchain? Storing individual data points on a blockchain requires a total overhaul of systems across entire business networks and, at present, involves manual input of each data point (e.g. IBM Food Trust). It is therefore simpler to integrate a blockchain that can store documents into enterprise processes, and this “on-chain” document storage provides several advantages.

There’s a lot of discussion around on-chain storage, its benefits and how it can be realized. Insolar has formulated its own approach based on Insolar Blockchain Platform architecture. However, before we get into how Insolar approaches on-chain storage and what makes it possible without sacrificing processing speed, let’s explore a little about the differences between off-chain and on-chain data in the realm of distributed ledger technology.

What is off-chain data?

Off-chain data storage means doing and/or storing something outside the blockchain itself, while referencing the result on the chain through various technical means. Usually, off-chain storage is used for any data that is too heavy to be stored on a blockchain. This is because large datasets take time to process, and a blockchain network that needs to handle transactions quickly can easily be clogged up by having to process large data objects. As a general rule, this data is more complex and larger than simple transactional data (e.g. A transfers x amount to B, followed by a decrease in the balance of A’s account and a corresponding increase in B’s account balance), and may include data that is required by law to be erasable or editable (e.g. in line with data protection laws). Data stored outside the blockchain is also typically unstructured, which makes it difficult to categorize within a DLT.

However, off-chain data can be linked to blockchain ledgers through a hash: the data object is stored elsewhere, but a fixed-length fingerprint computed from its contents (a hash) is recorded on the chain and used to check the validity of the object. On the whole, we can assume that files stored off chain are either too large to be stored on certain blockchain platforms or do not need to be stored there given the function of the blockchain network. In other cases, the owner of the data may choose to store the object off chain so as not to share it within the distributed network, keeping it for themselves in a highly secure centralized database.
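The hash-linking idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: a Python dict stands in for the off-chain store, a list stands in for the ledger, and the function names `anchor_document` and `verify_document` are hypothetical rather than part of any platform's API. SHA-256 is used only as a common choice of hash function.

```python
import hashlib

def document_hash(content: bytes) -> str:
    """Return the SHA-256 digest of a document as a hex string."""
    return hashlib.sha256(content).hexdigest()

# Hypothetical stand-ins: a dict plays the off-chain store,
# a list plays the (append-only) ledger.
off_chain_store = {}
ledger = []

def anchor_document(doc_id: str, content: bytes) -> None:
    """Store the document off chain and record only its hash on the ledger."""
    off_chain_store[doc_id] = content
    ledger.append({"doc_id": doc_id, "sha256": document_hash(content)})

def verify_document(doc_id: str) -> bool:
    """Fetch the document from the off-chain store and compare it
    against the most recent hash anchored on the ledger."""
    anchors = [entry for entry in ledger if entry["doc_id"] == doc_id]
    if not anchors or doc_id not in off_chain_store:
        return False
    return document_hash(off_chain_store[doc_id]) == anchors[-1]["sha256"]

anchor_document("agreement-001", b"Master agreement, version 1")
intact = verify_document("agreement-001")

# Editing the off-chain copy without re-anchoring breaks verification.
off_chain_store["agreement-001"] = b"Master agreement, version 1 (edited)"
tampered_detected = not verify_document("agreement-001")
```

Note that verification always requires a round trip to the off-chain store, and that any legitimate update to the document needs a fresh hash to be anchored before the check passes again.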

This is the approach that many other blockchain platforms advocate. However, it brings some issues of its own.

Off-chain issues

First of all, it brings security problems: access to existing data stores may previously have been granted to parties that are not part of the DLT network, leaving the door open to disclosure of (new) data added by the network participants. Moreover, if an existing store is used, it will need to be altered so that it can hold the cryptographic signatures which prove that a document stored off chain has not been altered since it was put there. If the off-chain storage is updated and modified frequently, recalculating and storing the hashes becomes a laborious task.

Secondly, there are performance issues associated with off-chain storage, in that one cannot expect uniform response times from different data stores that implement different technologies. As such, a consistent set of hardware and software is needed across the network to ensure service levels are met; otherwise, network performance may suffer. Another performance-related issue is scalability and the need for elastic storage network-wide. Where a blockchain system is connected to other data stores, throughput issues on the blockchain platform could in themselves cause delays, since data flowing into the blockchain platform could clog the network.

Thirdly, it can be argued that it is impractical to have several connected systems running at once when there is an option to run a single, unified system in which all data, no matter what size or format, is stored. This is part of the logic behind creating a blockchain platform that can store documents on chain, but the main reason is that on-chain storage provides complete traceability and proofs/guarantees without having to leave the system.

Storing documents on chain: Insolar

As highlighted above, it is common practice for blockchain platforms to store files on an enterprise content management (ECM) system wherever needed, storing only a hash of each file on the blockchain. To check the hash, and therefore the validity of the document, the file contents need to be fetched from the company’s ECM system. In contrast, Insolar Blockchain Platform stores a hash of the document on the main chain, the so-called lifeline of the main business object, with the document linked to it as the justification for the creation of that object. The document itself is stored in a separate shard of the chain. This shard (the so-called sideline) is actually a different blockchain. In effect, files can be efficiently stored within a single DLT network comprising many interlinked blockchains.
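To make the relationship between the two chains concrete, here is a toy model. It is only a sketch under stated assumptions and not Insolar's implementation: the `Chain` class, the `create_object` and `verify_object` functions, and the record layout are all hypothetical, and the document is held as text purely to keep the example short.

```python
import hashlib
import json
from dataclasses import dataclass, field

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

@dataclass
class Chain:
    """Toy append-only chain: each record commits to the previous record's hash."""
    name: str
    records: list = field(default_factory=list)

    def append(self, payload: dict) -> int:
        prev_hash = self.records[-1]["record_hash"] if self.records else ""
        body = json.dumps(payload, sort_keys=True)
        record_hash = sha256_hex((prev_hash + body).encode())
        self.records.append(
            {"prev_hash": prev_hash, "payload": payload, "record_hash": record_hash}
        )
        return len(self.records) - 1  # index of the new record

# Two separate chains within one network (names follow the article's terms).
lifeline = Chain("lifeline")   # holds the business object and the document hash
sideline = Chain("sideline")   # holds the document itself

def create_object(state: dict, document: str) -> int:
    """Put the document on the sideline, then create the business object on
    the lifeline with the document's hash and a link to the sideline record."""
    doc_index = sideline.append({"document": document})
    return lifeline.append({
        "state": state,
        "document_hash": sha256_hex(document.encode()),
        "sideline_index": doc_index,
    })

def verify_object(lifeline_index: int) -> bool:
    """Check the originating document without leaving the network."""
    entry = lifeline.records[lifeline_index]["payload"]
    document = sideline.records[entry["sideline_index"]]["payload"]["document"]
    return sha256_hex(document.encode()) == entry["document_hash"]

idx = create_object({"type": "contract", "status": "active"}, "Master agreement text")
verified = verify_object(idx)
```

The point of the sketch is that both the hash and the document live inside the same network, so the check in `verify_object` never has to call out to an external content management system.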

Insolar Blockchain Platform creates storage space on a different chain via sharding, placing the document on a separate blockchain within the same network. As a result, large files can be processed and stored on a side chain, while the business objects originated by those documents are stored on the main chain or on other side chains, without slowing down the network as a whole.

Advantages of on-chain

Companies that use on-chain storage for registering transactions, but off-chain storage for the data (documents) related to them, force their users to leave the DLT system to verify the validity of a document. Conversely, with on-chain document storage, data is held near the main business object, meaning that it is tied directly to the applicable contracts and stored consistently on the blockchain. In turn, this means that further operations can take place with the document as the basis for a change in object state. Moreover, these operations can take place without delays, without an affinity feature, and without having to exit the system. As such, any operation can be verified directly against its originating documents, so there is a full stack of information within the network. This is beneficial for enterprise: the document can be pulled from the blockchain at any time, and its validity can be checked rather than taken on trust.

Example of on-chain storage

A prime example of where on-chain document storage brings benefits is a master agreement for an OTC contract that needs to be checked every time a payment is made according to the schedule. The master agreement contains clauses that specify banks, currencies, accounts and conditions (events that trigger payments). It is much simpler and more efficient to have a golden copy that is open to all sides and retrievable at all times. On-chain distributed ledgers provide a single, retrievable copy that is visible to all parties.
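The check described above can be sketched as follows. This is an illustrative assumption about how such a check might look, not a description of any real contract: the `MasterAgreement` and `Payment` types, their fields, and the sample values are all hypothetical, and the golden copy is assumed to have already been retrieved from the chain.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MasterAgreement:
    """Hypothetical structured view of the clauses in the golden copy."""
    banks: frozenset
    currencies: frozenset
    accounts: frozenset
    trigger_events: frozenset

@dataclass(frozen=True)
class Payment:
    bank: str
    currency: str
    account: str
    event: str

def violations(agreement: MasterAgreement, payment: Payment) -> list:
    """Return the names of the clauses that the payment does not satisfy."""
    problems = []
    if payment.bank not in agreement.banks:
        problems.append("bank")
    if payment.currency not in agreement.currencies:
        problems.append("currency")
    if payment.account not in agreement.accounts:
        problems.append("account")
    if payment.event not in agreement.trigger_events:
        problems.append("event")
    return problems

# Sample data, invented for the example.
golden_copy = MasterAgreement(
    banks=frozenset({"Bank A", "Bank B"}),
    currencies=frozenset({"USD", "EUR"}),
    accounts=frozenset({"ACC-1", "ACC-2"}),
    trigger_events=frozenset({"quarterly_settlement"}),
)

ok_payment = Payment("Bank A", "USD", "ACC-1", "quarterly_settlement")
bad_payment = Payment("Bank C", "GBP", "ACC-1", "quarterly_settlement")

ok_result = violations(golden_copy, ok_payment)    # no clauses violated
bad_result = violations(golden_copy, bad_payment)  # bank and currency violated
```

Because every party reads the same golden copy, each side running this kind of check reaches the same answer for a given payment.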

On-chain document storage is set to become standard practice in blockchain, because enterprises need data retrieval to be as simple as possible while maintaining a set process for file management; storage across several locations is too complex. Multi-party networks need trust, and this trust can be placed in a DLT network in which all participants can see the documents and data that are relevant to them. Overall, on-chain storage provides greater operational efficiency and additional trust.

Check our GitHub and leave feedback on the code.
