Learn how Streamr is building a data economy through IoT.

It's not every day that you talk to a company that raised 30 million Swiss francs (that's over 30 million USD) in a 2017 ICO and is kicking some amazing goals, building not only a new ecosystem for trading IoT data but also the technology framework and solutions that underpin it.

That company is Streamr, which has created a decentralized network for scalable, low-latency, tamper-proof data delivery and persistence. Anyone, or any device, can publish new data to data streams, and others can subscribe to these streams to power Ðapps (decentralized applications), smart contracts, microservices, and intelligent data pipelines. Much of the data is free, but where that’s not the case, the terms of use are stored in Ethereum smart contracts.

I recently sat down with CEO Henri Pihkala to find out about the company's progress. I learned that beneath Streamr's decentralized data network and marketplace is a decentralized solution to messaging and event processing, replacing platforms such as Azure Event Hubs and Azure Stream Analytics. It comes with a suite of tools to help developers create real-time, data-driven, and trustworthy blockchain applications, including a powerful analytics engine and a UI for rapid development of real-time Ðapps.

The infrastructure they create consists of a mighty technology stack that helps connect and incentivize computers in a global peer-to-peer (P2P) network. This is a network that provides low-latency, robust, and secure data delivery and persistence, and all at scale.

Key Components of the Streamr Stack

Streamr Editor is a usability layer and toolkit that enables the rapid development of decentralized, data-driven apps.

Streamr Engine is a high-performance event processing and analytics engine that executes off-chain in a decentralized fashion. It can run on a decentralized computing provider such as Golem.

Streamr Data Market is a universe of shared data streams that anyone can contribute and subscribe to.

Streamr Network is the data transport layer, defining an incentivized peer-to-peer network for messaging in the decentralized data pipeline.

Streamr Smart Contracts enable nodes in the Streamr network to reach consensus, hold stream metadata, handle permissions and integrity checking, and facilitate secure token transfers.

The company recently announced the success of its Corea milestone release, a publish/subscribe (pub/sub) network for real-time data and the first of the three major milestones the company has pledged since receiving its funding.

Messages produced to a stream (sometimes called a topic) get delivered in real time to all subscribers listening on that stream. It’s a bit like an instant messenger for machines, which, in IM lingo, supports “group chats” (many-to-many), “channels” (one-to-many), as well as “private” (one-to-one) messaging patterns.
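To make the pub/sub idea concrete, here is a minimal in-memory sketch in Python. It is not the Streamr client API: the `InMemoryBroker` class, the stream ID, and the subscriber names are all hypothetical, and everything runs in one process with no network, persistence, or permissions. It only illustrates the "channel" (one-to-many) pattern, where one publisher's message reaches every subscriber on a stream.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Handler = Callable[[str, dict], None]


class InMemoryBroker:
    """Toy stand-in for a pub/sub network (hypothetical, single process)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, stream_id: str, handler: Handler) -> None:
        # Register a callback that fires for every message on this stream.
        self._subscribers[stream_id].append(handler)

    def publish(self, stream_id: str, message: dict) -> int:
        """Deliver the message to every subscriber; return the delivery count."""
        handlers = list(self._subscribers.get(stream_id, []))
        for handler in handlers:
            handler(stream_id, message)
        return len(handlers)


broker = InMemoryBroker()
received = []

# "Channel" (one-to-many): one sensor publishes, two dashboards listen.
broker.subscribe("tram/7/location", lambda s, m: received.append(("dashboard-a", m)))
broker.subscribe("tram/7/location", lambda s, m: received.append(("dashboard-b", m)))

delivered = broker.publish("tram/7/location", {"lat": 60.17, "lon": 24.94})
print(delivered, "deliveries:", received)
```

A "group chat" is the same mechanism with several publishers on one stream, and a "private" exchange is a stream with exactly one publisher and one subscriber.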

Henri noted:

"In terms of scale, we follow the number of messages per second that are published to the network to measure usage of the network. This tends to be around 3000-5000 per second; a year ago, it was 1000 messages per second. So we’re seeing a significant increase in the adoption of the network technology that we’re building, so we’re happy to see that and also to validate that people are actually using it. At launch, the network already carries 12.9 billion messages per month."

You can view the number of data points per second currently on the network in real-time and create your own stream.
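As a rough sanity check on the throughput figures quoted above, the per-second rates can be converted into monthly volumes. The sketch below assumes a 30-day month and a constant rate, neither of which comes from the interview, so treat the result as an order-of-magnitude comparison only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def messages_per_month(rate_per_second: float, days: int = 30) -> float:
    """Monthly volume for a constant publish rate (assumed 30-day month)."""
    return rate_per_second * SECONDS_PER_DAY * days


low = messages_per_month(3000)
high = messages_per_month(5000)
print(f"{low / 1e9:.2f} to {high / 1e9:.2f} billion messages per 30 days")
# prints "7.78 to 12.96 billion messages per 30 days"
```

Under these assumptions, the quoted 12.9 billion messages per month corresponds to an average close to the upper end of the 3000-5000 messages per second range.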

The Process of Trial and Error

I was interested to learn more about the kind of process that goes into creating this kind of technology. Henri shared:

"We're talking about massively distributed systems of thousands of nodes (points where a message can be created, received, or transmitted) all around the world. How do you make sure that the system behaves as intended? How do you measure the performance characteristics of the system at scale? These are relatively difficult problems to solve."

Testbeds created a space for trialing the technology using purpose-built tools "that allowed devs to easily add simulated nodes into topology constructions and see how the network gets formed when it goes up to thousands or tens of thousands, and calculate metrics from that network, such as how many hops it takes to travel from one edge of the network to the other." This is an important metric considering factors such as latency, one of the persistent pain points in IoT when real-time data is needed.
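The hop-count metric described above can be illustrated with a small simulation. The sketch below is not Streamr's tooling: the topology rule (a ring plus a few random peer links per node), the function names, and the parameters are all assumptions chosen to keep the example short. It builds a simulated network and uses breadth-first search to find the worst-case number of hops between any two nodes.

```python
import random
from collections import deque
from typing import Dict, Set

Graph = Dict[int, Set[int]]


def build_random_topology(node_count: int, peers_per_node: int, seed: int) -> Graph:
    """Hypothetical topology: a ring for connectivity plus random shortcuts."""
    rng = random.Random(seed)
    graph: Graph = {n: set() for n in range(node_count)}
    for n in range(node_count):
        nxt = (n + 1) % node_count
        graph[n].add(nxt)
        graph[nxt].add(n)
    for n in range(node_count):
        for peer in rng.sample(range(node_count), peers_per_node):
            if peer != n:
                graph[n].add(peer)
                graph[peer].add(n)
    return graph


def hops_from(graph: Graph, source: int) -> Dict[int, int]:
    """Breadth-first search: hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for peer in graph[node]:
            if peer not in dist:
                dist[peer] = dist[node] + 1
                queue.append(peer)
    return dist


def network_diameter(graph: Graph) -> int:
    """Worst-case hop count between any two nodes (one BFS per node)."""
    return max(max(hops_from(graph, n).values()) for n in graph)


for size in (100, 400):
    topology = build_random_topology(size, peers_per_node=3, seed=7)
    print(size, "nodes -> worst-case hops:", network_diameter(topology))
```

Running one search per node is fine for a few hundred simulated nodes; at tens of thousands of nodes, sampling a subset of source nodes would be a more practical way to estimate the same metric.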

Henri further detailed:

"Another layer is network emulation. So when nodes are actually connected over a network, we want to see what happens at scale. Networks have different properties. They have variance in latency between nodes, which is called jitter. Also, links can go down and thus nodes can disappear. So this kind of chaos can be emulated. We can have nodes going down randomly. We can have links disappearing between nodes. We can have links suddenly becoming super slow, and so on."

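The kinds of disruption Henri lists (jitter, nodes going down, links disappearing) can be sketched in a few lines of Python. This is a toy model, not the emulator Streamr used: the function names, the Gaussian jitter model, and the failure rates are all assumptions made for illustration.

```python
import random
import statistics
from typing import Dict, List, Set

Graph = Dict[int, Set[int]]


def sample_link_latencies(base_ms: float, jitter_ms: float,
                          samples: int, rng: random.Random) -> List[float]:
    """Latency samples for one link: a base delay plus random variation."""
    return [max(0.0, rng.gauss(base_ms, jitter_ms)) for _ in range(samples)]


def inject_faults(graph: Graph, node_failure_rate: float,
                  link_failure_rate: float, rng: random.Random) -> Graph:
    """Copy of a symmetric graph with random nodes and links removed."""
    alive = {n for n in graph if rng.random() >= node_failure_rate}
    degraded: Graph = {n: set() for n in alive}
    for n in alive:
        for peer in graph[n]:
            # Visit each undirected link once, from its smaller endpoint.
            if peer in alive and n < peer and rng.random() >= link_failure_rate:
                degraded[n].add(peer)
                degraded[peer].add(n)
    return degraded


def reachable_from(graph: Graph, source: int) -> Set[int]:
    """All nodes still reachable from source after faults are injected."""
    seen = {source}
    stack = [source]
    while stack:
        node = stack.pop()
        for peer in graph[node]:
            if peer not in seen:
                seen.add(peer)
                stack.append(peer)
    return seen


rng = random.Random(42)
ring = {n: {(n - 1) % 20, (n + 1) % 20} for n in range(20)}
degraded = inject_faults(ring, node_failure_rate=0.1, link_failure_rate=0.1, rng=rng)
survivor = next(iter(degraded), None)
if survivor is not None:
    print(len(reachable_from(degraded, survivor)), "of", len(degraded),
          "surviving nodes still reachable")
latencies = sample_link_latencies(50.0, 10.0, 1000, rng)
print("jitter (std dev, ms):", round(statistics.pstdev(latencies), 1))
```

Standard deviation is only one of several ways to quantify jitter; it is used here because it is simple to compute from latency samples.
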
This was achieved through the use of a Core network emulator that could test around 2000 nodes, beyond what could be tested practically or affordably using cloud services. Notably, Streamr had undertaken extensive research before its testbed efforts:

"We could delve into all the existing research that's out there in both the academic literature as well as in real-world things that have been built, like Bitcoin or Matrix or blockchain, and take lessons learned from these areas and then find a direction that was highly probable to work in practice."

The company will be releasing a white paper later this year describing its results, where:

"We can actually show metrics and numbers. What were the results? How did we stress test this network, and how did it behave? So we can offer an interesting read to people who are developing similar systems or are serious users that are seeking to apply this technique in some way."

Reader's note: I've attempted to condense a fairly extensive chat into digestible bites using plain language, so any errors are mine! If you would like to learn more before the white paper is released, I'd suggest Streamr's Medium post describing their testing process in further detail. I'll be following up with a second post taking a deep dive into their progress in data monetization. It's a fascinating topic for anyone interested in IoT, and while some of the blockchain concepts can be a challenge initially, it is well worth the read.

Further Reading

DZone Guide to IoT: Connecting Devices and Data

How Data Monetization Is Creating a New Data Economy for IoT