Building the next generation of distributed data-transfer and analytic tools

The imminent exponential growth of data creation demands a radically new model of data-oriented systems, in areas such as logistics, smart cities and fleet management, to name a few. In these early days of the information age, we will see a transition to a more distributed model in which edge-devices become the computational hubs, both to keep pace with increasing demand and to provide localised analysis that minimises external workload. Collectively, this huge data pool will require additional intelligence to direct Users to real-time issues within an edge-device network.

The success of IoT depends not only on a model’s scalability but also on its capacity to confirm the legitimacy and security of data. The IOTA protocol provides a solution by maintaining a public proof of information exchange, or ledger, held within the Tangle. The ledger is immutable and, given sufficient encryption, private, facilitating unparalleled P2P & M2M communication.

In this article, we will present our vision for the future of secure & distributed data-transfer and analytics systems, including a description of our PoC implementation of such a solution: Lighthouse.

Change is coming

Current data-transfer services rely heavily on Cloud-based computing, which risks single points of failure and concentrates large computational loads on service providers and clients. Distributing the computational power across edge-devices spreads the load and prevents single points of failure.

In parallel, the popularity of distributed blockchain cryptocurrencies such as Bitcoin and Ethereum has shown that the future of information lies in decentralization and immutable transaction ledgers. Traditionally we rely on trusted services to send and store data; however, clouds are not transparent about what happens to your data. In a trustless-fog system, by contrast, there is no need for trust, as every transaction is publicly provable via a public transaction ledger.

We can summarise these two aforementioned paradigm shifts:

Centralized to decentralized computation

Trust to trustless transactions

However, blockchain-based cryptocurrencies require mining to confirm transactions, which is hugely resource-expensive. Furthermore, the owner of each blockchain transaction pays a fee to miners to confirm its validity, making blockchain unscalable and ultimately unfeasible for the IoT future. In contrast, the IOTA Tangle enables distributed & immutable transactions while remaining feeless, at comparatively minimal computational cost.

These two emerging paradigm shifts can be combined into a powerful new vision of information processing, and this is the foundation of Lighthouse.

The light coming through the fog

What is needed is a scalable, immutable & distributed data transfer system without transaction fees. Through this, for example, connected vehicles could send a real-time feed of valuable telematic information and analytics regarding engine wear or fuel levels to a fleet operator without fear of a Cloud being hacked or data being lost. A fleet manager could track, message and monitor her fleet securely using a dashboard to interact with the key features automatically extracted by an additional layer of intelligence. Furthermore, this would completely remove the need to develop a complex cloud infrastructure.

This vision is also shared by the smart-factories of the future, where hundreds of sensors monitor dozens of different processes along a complex product supply chain. A unified system could learn on the fly as issues arise, signalling to maintenance staff when anomalies are detected to ensure quality control, or even ordering replacement parts without human intervention. IOTA-MAM also makes possible the monetization of “interesting” sensor data, selling it to competitors or researchers.

Our solution, Lighthouse, is an edge-device-level implementation of data analytics that uses the Tangle as an immutable database providing secure channel communication.

Why the Tangle with MAM?

Transactions, by default, are not encrypted and so sending data via the Tangle alone means that anyone can read their contents. This is hugely beneficial in some applications, for example, enabling transparent democratic elections. However, to facilitate secure data-transfer from known identities, IOTA developed Masked Authenticated Messaging (MAM) as an additional layer to ensure that messages are encrypted and signed.

We won’t go deep into MAM theory; that is covered elsewhere [2,5]. For the first PoC release, private-mode MAM will be available. This mode is similar to an encrypted police radio, in that only a pre-defined group can access the data. Each message, or data piece, contains the key to the next, so once a User is given access they can follow the chain and stream the continuous data flow. This mode also means that you cannot look back in time along a data stream, which restricts the visibility of information prior to receiving access to the asset. For example, a fleet operator messaging a driver may not want the driver to see (potentially sensitive) information in that communication channel’s history. Implementing restricted-mode MAM will enable administrators to control subscriptions to data streams, revoking access rights after, for example, a change of driver or client.
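To make the “each message contains the key to the next” idea concrete, here is a minimal Python sketch of a forward-only message chain. This is an illustration of the concept only, not the real MAM implementation: the key derivation, the payload layout and the `make_stream`/`follow_stream` helpers are all invented for this example.

```python
import hashlib


def make_stream(readings, channel_key):
    """Build a toy MAM-style chain: each message carries the key that
    unlocks the next one, so a subscriber holding any key can follow
    the stream forwards but never look backwards in time."""
    messages = []
    key = channel_key
    for i, reading in enumerate(readings):
        # Derive the key of the *next* message (toy derivation only).
        next_key = hashlib.sha256(f"{channel_key}:{i + 1}".encode()).hexdigest()
        messages.append({
            "key": key,            # needed to locate this message
            "payload": reading,
            "next_key": next_key,  # pointer handed out by this message
        })
        key = next_key
    return messages


def follow_stream(messages, start_key):
    """Walk the chain starting from the first message whose key we hold."""
    index = {m["key"]: m for m in messages}
    out, key = [], start_key
    while key in index:
        m = index[key]
        out.append(m["payload"])
        key = m["next_key"]
    return out
```

Handing a subscriber the key of the second message yields only the second and later payloads, mirroring how a newly added driver cannot read a channel’s earlier history.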

In IOTA’s current implementation, each full node contains a snapshot representation of the whole Tangle, that is, a ledger of all transactions within it; in essence, this is the database. From this, a single node can recreate the full Tangle. As long as the seed of a data stream isn’t lost, and the whole Tangle isn’t compromised, the data will always be available, unchanged and with its original source known, made possible by the Merkle tree signature scheme.
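As a rough illustration of why a Merkle tree is useful here, the sketch below shows how a single compact root hash can vouch for an entire set of transactions: changing any one leaf changes the root. This is a toy construction, not IOTA’s actual signature scheme (which is considerably more involved); `merkle_root` and `h` are hypothetical helpers for this example.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    """Compute the root of a binary Merkle tree over the given leaves.
    Any change to a single leaf propagates up and changes the root,
    so the root acts as a fingerprint for the whole data set."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:            # duplicate the last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]
```

Verifying that a dataset is unchanged then reduces to recomputing the root and comparing it to the published one.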

The IOTA Lighthouse

We’ve described the vision; let’s now look at our PoC implementation. The source of information begins at the edge-device sensor. We are currently feeding in publicly available air-pollution data from Berlin and public-transport geolocations from Helsinki [3,4]. Every time a sensor generates a reading, it stores it in a message, attaches it to a transaction via MAM and then broadcasts it to the Tangle. This replicates how a real edge-device would feed in sensor data, with the proof of work done locally (for example, on a Raspberry Pi).
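The reading → message → local proof-of-work → broadcast pipeline can be sketched as follows. This is a simulation of the flow, assuming nothing about the real client libraries: `read_sensor`, `do_pow` and `publish` are invented stand-ins, the "tangle" is a plain list, and the proof of work is a toy hash puzzle rather than IOTA's actual PoW.

```python
import hashlib
import json
import time


def read_sensor():
    """Stand-in for a real edge-device sensor (e.g. an air-quality probe)."""
    return {"pm10": 21.4, "timestamp": time.time()}


def do_pow(message: bytes, difficulty: int = 2) -> int:
    """Toy proof of work: find a nonce whose hash starts with
    `difficulty` zero bytes. On a real edge-device this step is the
    local PoW performed before broadcasting a transaction."""
    nonce = 0
    target = b"\x00" * difficulty
    while not hashlib.sha256(
            message + nonce.to_bytes(8, "big")).digest().startswith(target):
        nonce += 1
    return nonce


def publish(tangle: list, reading: dict):
    """Wrap a reading in a message, do local PoW, then 'broadcast' it
    by appending to our stand-in tangle."""
    message = json.dumps(reading, sort_keys=True).encode()
    nonce = do_pow(message)
    tangle.append({"message": message, "nonce": nonce})
```

In the real PoC, the `publish` step would attach the message to a MAM channel and broadcast it to an IOTA node instead of appending to a local list.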

Through this process, the reading becomes amalgamated into the Tangle, and the gossip protocol transmits the new information to all other nodes. The sensor data is now an unchangeable component of the whole Tangle.