It seems prescient that, back in 2011, the Geneva-based World Economic Forum was already talking about personal data as an emerging asset class. Blockchain was not yet a household name (if it is even now), and only in the past three or so years has the technology really gained a public profile. What the Forum’s report on personal data highlighted most firmly was this: firms, governments, and researchers already collect and analyse data on a massive scale, and that collection will keep growing.

“At its core, personal data represents a post-industrial opportunity. Utilising a ubiquitous communications infrastructure, the personal data opportunity will emerge in a world where nearly everyone and everything are connected in real time.” “That will require a highly reliable, secure and available infrastructure at its core.”

Blockchain meets big data: “personal data represents a post-industrial opportunity”

The report directly foreshadows GDPR, which came into effect in May this year. While it recognises the crucial role data has to play for businesses, it places great emphasis on the central role of the individual. That notion aligns well with a blockchain-based data solution: as data is produced daily on a vast scale, the technology could change who owns data and who has access to it. IOTA is exploring this concept through a Directed Acyclic Graph (DAG) architecture, touted as Blockchain 3.0, and envisions a data marketplace in which users can sell the data they produce on a daily basis. Rather than mining data from the actions users perform on their platforms, companies would instead purchase that data by making micropayments to the individuals concerned. Having the ability to control where your data goes and how it is used fits the essential goal of GDPR: the users producing data should have total sovereignty over their information.

Are we prepared for such vast quantities of data?

Hyperscale data centers are fuelling the high capital spends coming from Google, Amazon, Facebook, etc. amid the ever-growing quantities of data being collected. But there are two inherent problems in relying on hyperscale data centers into the future.

Still hardware-based

These massive data centers depend entirely on hardware, which means an upfront cost for the servers, high energy consumption, and continual maintenance costs, along with a three-year upgrade cycle on servers. Very few companies can afford to build one hyperscale data center, let alone several.

Not a scalable solution

Google’s capital spend figures for 2016 suggest that the company probably sank around US$10 billion into data storage; the figure is probably much, much higher now. This is because the quantity of data collected is ever-increasing and looks set to keep growing exponentially. It does not seem feasible for hardware to scale to the unfathomably vast future world of Big Data, but blockchain technology does not claim to have an answer at this moment in time either. Many startups remain reliant on the traditional Ethereum and Bitcoin blockchains even though their scaling issues are as yet unresolved.

A DAG architecture can scale to the far future

Well, this is what the DAG-based network CyberVein is saying. It is the thinking behind the Chinese startup’s aim to create globally decentralized databases from disk space donated by users’ devices. The idea is that, by using Proof of Contribution as opposed to Bitcoin’s Proof of Work, crypto tokens give nodes an incentive to offer storage. Mining is not needed because the validation of the transaction is built into the transaction itself. Numerous offshoots from the main DAG mean that, unlike in a blockchain, nodes do not have to process the entire network and can instead deal with just a subset of transactions. A DAG architecture could therefore be far more efficient than a blockchain for big data.
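
To make the DAG idea a little more concrete, here is a minimal Python sketch of a ledger in which every new transaction must reference (and so implicitly approve) earlier ones, and in which checking a transaction only means walking its own ancestors rather than the whole network. It is a toy illustration under stated assumptions, not CyberVein’s or IOTA’s actual protocol: the names Transaction, Dag, add, ancestors and tips are hypothetical, and there are no tokens, signatures, or Proof of Contribution here.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """A toy transaction: some payload plus the ids of the transactions it approves."""
    payload: str
    parents: tuple  # ids of earlier transactions (hypothetical field name)

    @property
    def tx_id(self) -> str:
        # The id commits to the payload and to the approved parents.
        data = self.payload + "|" + ",".join(self.parents)
        return hashlib.sha256(data.encode()).hexdigest()


class Dag:
    """A toy in-memory DAG ledger (illustrative only)."""

    def __init__(self):
        genesis = Transaction("genesis", ())
        self.genesis_id = genesis.tx_id
        self.txs = {genesis.tx_id: genesis}

    def add(self, payload, parents):
        # A new transaction is accepted only if it references at least one
        # transaction that is already known. No separate mining step exists.
        if not parents:
            raise ValueError("a transaction must approve at least one parent")
        for parent_id in parents:
            if parent_id not in self.txs:
                raise ValueError("unknown parent: " + parent_id)
        tx = Transaction(payload, tuple(parents))
        self.txs[tx.tx_id] = tx
        return tx.tx_id

    def ancestors(self, tx_id):
        # The subset of the ledger a node has to look at for this transaction:
        # everything reachable by following parent links.
        seen = set()
        stack = list(self.txs[tx_id].parents)
        while stack:
            current = stack.pop()
            if current not in seen:
                seen.add(current)
                stack.extend(self.txs[current].parents)
        return seen

    def tips(self):
        # Transactions that no other transaction has approved yet.
        referenced = {p for tx in self.txs.values() for p in tx.parents}
        return set(self.txs) - referenced


if __name__ == "__main__":
    dag = Dag()
    a = dag.add("reading A", [dag.genesis_id])
    b = dag.add("reading B", [dag.genesis_id])  # a separate offshoot
    c = dag.add("reading C", [a])
    print(b in dag.ancestors(c))  # the offshoot holding B is not needed to check C
    print(len(dag.tips()))
```

In this sketch, checking transaction C touches only A and the genesis entry; the offshoot containing B never has to be processed. That is the property the paragraph above appeals to, although a real network would also need to handle conflicting transactions, incentives, and storage.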

The right DAG implementation could work

The World Economic Forum’s report on personal data predicted a world of Big Data that will require infrastructure that is reliable, secure, and available. An effective DAG-based network would, in theory, provide exactly that. Data landscapes are changing rapidly and an efficient method of handling large volumes of data is sorely needed, while data as an emerging asset class will be increasingly defined by who has control of it. This means a shift from personal data being controlled by organisations towards it being controlled by individuals and, to achieve this, a DAG-based solution may prove to be key.