The challenge of artificial intelligence (AI), just like human intelligence, is that it must be trained. Machines and AI require data to learn, while human intelligence is built from life experience. Simply put, the fundamental stepping stone towards better AI is data. AI startups and researchers have great capability to develop better algorithms and computational programs, but they lack data; meanwhile, multinationals and government organizations around the world generate vast amounts of data through day-to-day activities, but fail to create value from it. The problem is that AI and data are not connecting, and one of the most effective ways to solve this without violating data privacy is to decentralize every data asset.

When data sit in different silos, they are of little use to the companies that hold them. At the same time, few companies around the world take good advantage of AI. For example, getting data from the police to a health service, or vice versa, will probably take many years, let alone connecting those data in a safe and secure way. Among the few companies able to make the most of AI are Google and Facebook, which would not be what they are today without the billions of users who pay for their services with personal data.

Here’s where Ocean Protocol comes in. Built on blockchain technology, Ocean is a decentralized data exchange protocol designed to unlock data for AI. We sat down with Mike Anderson, a Founding Member of Ocean Protocol and CTO of DEX, who shared his experience working with data and his vision for using a blockchain protocol to transform data sharing across different industries in the near future.

Using blockchain to connect data and AI

At Ocean, the ultimate goal is to bring power back to the wider economy: to ensure that data can be shared so that everyone in the data economy can take advantage of AI and of the data assets made available around the world. Ocean was founded by two companies:

DEX, a Singapore-based business-to-business company that creates data marketplaces

BigchainDB, a Berlin-based technology company that builds a blockchain database product used by many enterprises

Coming at the problem from both the data marketplace angle and the blockchain angle, Ocean’s core principle is to democratize and decentralize the capabilities of data and AI, thereby connecting the two. To date, Ocean Protocol is collaborating with the government of Singapore on industry-led sprints across seven sectors: healthcare, finance, AgriTech, utilities, retail, built environment, and mobility.

Traditional data sharing vs. Decentralized future

With a decentralized approach to technology, which will change the way we think about many different business models, applications and utilities, you no longer have to rely on a single centralized authority to let you do things, and that to me is the most exciting aspect of blockchain technology.

- Mike Anderson, a Founding Member of Ocean Protocol & CTO of DEX

There are many limitations to traditional data sharing, one being bespoke, expensive solutions. Protocols have to be built on automation so that data transfers can happen efficiently, rather than taking months or years of engineering.

Another key issue with traditional data sharing is audit and provenance of data assets, especially when it comes to sensitive or extremely valuable data. The ecosystem needs an audit trail recording exactly how the data was generated and how it was used. Decentralizing data assets solves this: the blockchain keeps a record of each transaction, so anyone can publicly verify what actually happened.
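The idea of a publicly verifiable audit trail can be illustrated with a minimal hash-chained log, where each record commits to the one before it. This is only a sketch of the general technique, not Ocean Protocol's actual on-chain data structures; the class and field names are illustrative assumptions.

```python
import hashlib
import json
import time

class ProvenanceLog:
    """A toy hash-chained audit trail (illustrative, not Ocean's real ledger)."""

    def __init__(self):
        self.entries = []

    def record(self, actor, action, asset_id):
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        entry = {
            "actor": actor,
            "action": action,
            "asset_id": asset_id,
            "timestamp": time.time(),
            "prev_hash": prev_hash,
        }
        # Each entry's hash commits to the previous entry, so tampering
        # with any record invalidates every hash that follows it.
        entry["hash"] = hashlib.sha256(
            json.dumps(entry, sort_keys=True).encode()
        ).hexdigest()
        self.entries.append(entry)
        return entry["hash"]

    def verify(self):
        """Recompute every hash; returns False if any record was altered."""
        prev_hash = "0" * 64
        for entry in self.entries:
            if entry["prev_hash"] != prev_hash:
                return False
            body = {k: v for k, v in entry.items() if k != "hash"}
            if hashlib.sha256(
                json.dumps(body, sort_keys=True).encode()
            ).hexdigest() != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True

log = ProvenanceLog()
log.record("hospital-a", "published", "dataset-042")
log.record("researcher-b", "accessed", "dataset-042")
print(log.verify())  # True for an untampered log
```

On a real blockchain the same property comes from the chained blocks themselves, with many independent nodes holding copies of the ledger, which is what makes the record publicly verifiable rather than merely tamper-evident.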

Beyond that, decentralization gives users a way of confirming both the transfer of data and the exchange of value. Say I give you my data set and receive tokens in return: this facilitates market-based exchanges, and different players in the economy can monetize and extract value from their data assets.
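The exchange described above can be sketched as a simple ledger that atomically moves tokens one way and grants data access the other way. This is a hedged, simplified model: the account names, pricing, and access-grant mechanism are assumptions for illustration, not Ocean Protocol's actual token contracts.

```python
class Exchange:
    """Toy token-for-data exchange (illustrative only)."""

    def __init__(self):
        self.balances = {}   # account -> token balance
        self.access = set()  # (buyer, asset_id) pairs granted access

    def fund(self, account, amount):
        self.balances[account] = self.balances.get(account, 0) + amount

    def buy_access(self, buyer, seller, asset_id, price):
        if self.balances.get(buyer, 0) < price:
            raise ValueError("insufficient tokens")
        # Move tokens and grant access in one step, mirroring the
        # on-chain exchange of value for data.
        self.balances[buyer] -= price
        self.balances[seller] = self.balances.get(seller, 0) + price
        self.access.add((buyer, asset_id))

ex = Exchange()
ex.fund("consumer", 100)
ex.buy_access("consumer", "publisher", "weather-data", 30)
print(ex.balances)  # {'consumer': 70, 'publisher': 30}
```

In the real protocol this pairing of payment and access is enforced by the blockchain itself, so neither side has to trust the other to uphold their half of the trade.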

Key ecosystem participants

“We are not trying to build a product here. We are trying to create a standard and a protocol with which others can interact,” said Mike. The ecosystem is made up of several key players.

Keepers and Verifiers: The core of the ecosystem, these are the nodes on the blockchain, sharing information on the public ledger.

Data marketplaces and Service providers: These are the people who provide data storage, computation, and machine learning services. Marketplaces are where the buying and selling of data or machine learning services takes place.

Publishers and Consumers: The world’s access to data assets is primarily controlled by data publishers, who hold valuable data assets and may want to make them available to the world. Access rules vary: open data is accessible to everyone in the world; private data is never shared with anyone (typically sensitive personal information or medical records); and data with privacy controls falls somewhere in between. On the other side, consumers could be researchers who need to get their hands on research data from different countries and bring it together for analysis.

Curators and Referrers: These people ensure the quality of data assets and make sure they are easy to discover.

General public: Most of this data will be about people, and they have important rights that must be protected.

Regulators and Token acquirers: It is very important to work with governments, including regulators and people involved in the token economy. Within the ecosystem, the Ocean Protocol token serves as the utility token that enables value to be exchanged: if I want to sell a data asset, I can sell it in exchange for Ocean tokens. This creates a way for people to be rewarded within the ecosystem, and the right incentives for people to add value.

Ultimately, the goal is to reach fully decentralized services where different people can add value in whatever way makes the most sense, each in a different role: storage providers, multiple marketplaces, or different people providing computational services.

Data sharing without sharing data

The data itself does not become public unless the publisher chooses to do so.

Typically, consumers do not have to see the raw data. They see aggregated data, analysis derived from the data, or predictions produced by an AI model. A common configuration keeps the data within a secure environment, which might be the publisher’s own infrastructure or that of a third-party broker with secure infrastructure. The computation happens inside that environment, and consumers only see the result: clean, sanitized output that can be anonymized and checked for data leakage. Another value the blockchain provides is the ability to decentralize trust in the network: before, we had to trust a centralized technology provider with any information we uploaded.
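The pattern above, where raw data stays inside a secure environment and only aggregate results leave it, can be sketched as follows. The enclave class and the minimum-group-size policy are illustrative assumptions, not Ocean Protocol's actual compute service.

```python
class SecureEnclave:
    """Toy 'compute goes to the data' environment (illustrative only)."""

    def __init__(self, records, min_group_size=5):
        self._records = records            # raw records never leave this object
        self.min_group_size = min_group_size

    def average(self, field):
        values = [r[field] for r in self._records if field in r]
        # Refuse to answer when the group is too small to anonymize,
        # a crude guard against leaking individual records.
        if len(values) < self.min_group_size:
            raise PermissionError("group too small to release safely")
        return sum(values) / len(values)

# The publisher loads its sensitive records inside the enclave; the
# consumer only ever receives the sanitized aggregate.
patients = [{"age": a} for a in (34, 51, 47, 62, 29, 55)]
enclave = SecureEnclave(patients)
print(enclave.average("age"))  # consumer sees only the aggregate
```

Real deployments add much stronger protections (access control, differential privacy, audited compute environments), but the shape is the same: queries go in, sanitized results come out, and the raw data never moves.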

Ocean Protocol’s approach to drive adoption from different players

Because the industry is still immature, getting people to adopt the technology can be a challenge. To facilitate adoption, Ocean is fully open source and free for anyone to use. Users can download the software and customize it however they want, and thanks to the decentralized nature of the technology, no one in the world can stop them. Unlike a centralized data exchange, Ocean does not collect any data sets from its users. What it wants is for other people in the ecosystem to keep the data sets, set up marketplaces, and sell services.

This sort of open ecosystem, with no gatekeeper and where anyone can participate, has a much higher chance of adoption. A good analogy for how Ocean Protocol works is the HTTP protocol that powers the entire web. What made the web so successful is that it was an open protocol: anyone can build a website, and anyone can build a web browser (Internet Explorer, Chrome, Firefox). And because it was a standard, all the different web browsers work with all the different web services, and the ecosystem was open for anyone to participate. The same applies to Ocean Protocol as a data sharing protocol.