In this article we provide an updated roadmap based on the current state of development. More importantly, we share details about the ongoing development of, and the powerful use case for, the APEX Network Data Module, to which we have lately devoted most of our development resources. This module is crucial to the highest-value use case for APEX Network’s public chain. Given the potential we see in this use case, and because first-mover advantage is of the utmost importance for preserving our competitive position, the code will not be open sourced at this point in time. We will, however, share our prototype UI wireframe and explain what the data module is all about. First, we must explain the concept of federated learning.

What is federated learning?

Modern mobile devices store large amounts of data suitable for machine learning, and models trained on it can greatly improve the user experience on the device. For example, language models can improve speech recognition and text input, and image models can automatically select high-quality photos. However, this rich data is often privacy sensitive, large in quantity, or both, which may rule out uploading it to a data center and training there with conventional methods. In 2016 Google proposed an alternative that keeps the training data on the mobile devices and learns a shared model by aggregating locally computed updates. Google calls this decentralized approach federated learning. Google’s proposal has strongly influenced the development of privacy-preserving computation as a whole. With current technology, it is possible to create a federated learning data protocol that complies with the GDPR.
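To make "aggregating locally computed updates" concrete, here is a minimal sketch of federated averaging on a toy linear regression. It is an illustration only, not the APEX Network Data Module: the function names, the learning rate, and the two toy client datasets are all assumptions made for this example. Each client trains on its own data, and only the resulting weights are sent for averaging.

```python
from typing import List, Tuple

Sample = Tuple[float, float]   # (x, y) pair held privately by one client
Weights = List[float]          # [slope, intercept] of the model y = w*x + b


def local_update(weights: Weights, data: List[Sample],
                 lr: float = 0.05, epochs: int = 5) -> Weights:
    """Run a few gradient-descent steps on one client's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Average client weights, weighting each client by its sample count."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


def train(clients: List[List[Sample]], rounds: int = 200) -> Weights:
    """Server loop: broadcast weights, collect local updates, average them."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        sizes = [len(data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Toy data: both clients hold points on the line y = 2x + 1,
# but neither ever sends its raw points to the server.
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]
slope, intercept = train(clients)
print(round(slope, 2), round(intercept, 2))
```

Only the weight vectors cross the network in this sketch. A production system would typically add protections such as secure aggregation on top, which are omitted here.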

Let’s take an example and introduce federated learning in detail

Suppose there are two different companies, A and B. They share a large proportion of their customers, but each holds different data about them. For example, Enterprise A is a large supermarket chain with consumer shopping data; Enterprise B is a video website with customer video consumption data. Under the GDPR, these two companies cannot simply merge their data, because the original providers of the data (their respective users) have not consented to this.

Why federated learning between companies needs blockchain technology

In this section, we introduce some implementation details of a vertical federated learning task (one in which the parties share customers but hold different features about them), so that everyone can better understand why federated learning needs blockchain technology to support and enable it.

Baseline variables for Enterprise A and Enterprise B using federated learning

The approximate execution process for federated learning

If Enterprise A finds that the data owned by Enterprise B significantly improves the model, both parties establish a data pipeline to provide prediction services. The jointly trained model is a tree model (Random Forest, XGBoost and similar; XGBoost has a strong track record on structured data in Kaggle competitions) that includes features owned by Enterprise B, so Enterprise B must supply the relevant value whenever a prediction reaches a node that splits on one of its features. The federated learning model is in the hands of Enterprise A, and Enterprise A determines the decision result (left branch or right branch) from the value returned over the data channel. A single prediction may require hundreds of data requests. If Enterprise B has no data for a new customer, Enterprise A falls back on a model built from its own data.
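The prediction flow described above can be sketched as follows. This is a hedged illustration with a single tree, not the module's actual protocol: the class names, the `go_left` call, the node identifiers, and the feature names `monthly_spend` and `hours_watched` are all hypothetical. The assumption made here is that Enterprise A's copy of the tree records only which party owns each split, while Enterprise B keeps its own features and thresholds and answers with a branch direction.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple, Union


@dataclass
class Leaf:
    value: float  # prediction score stored at a leaf


@dataclass
class Split:
    owner: str                         # "A" or "B": who holds the split feature
    left: "Node"
    right: "Node"
    feature: Optional[str] = None      # set only on A-owned nodes
    threshold: Optional[float] = None  # set only on A-owned nodes
    node_id: Optional[int] = None      # set only on B-owned nodes


Node = Union[Leaf, Split]


class PartyB:
    """Hypothetical service run by Enterprise B; raw values never leave it."""

    def __init__(self, rows: Dict[str, Dict[str, float]],
                 splits: Dict[int, Tuple[str, float]]) -> None:
        self._rows = rows        # customer_id -> B's private features
        self._splits = splits    # node_id -> (feature, threshold)

    def go_left(self, customer_id: str, node_id: int) -> Optional[bool]:
        """Return the branch direction, or None if B does not know the customer."""
        row = self._rows.get(customer_id)
        if row is None:
            return None
        feature, threshold = self._splits[node_id]
        return row[feature] <= threshold


def walk(tree: Node, a_row: Dict[str, float], customer_id: str,
         party_b: Optional[PartyB]) -> Optional[float]:
    """Traverse the tree on A's side, asking B at every B-owned split."""
    node = tree
    while isinstance(node, Split):
        if node.owner == "A":
            go_left = a_row[node.feature] <= node.threshold
        else:
            answer = party_b.go_left(customer_id, node.node_id) if party_b else None
            if answer is None:
                return None      # B cannot answer: caller must fall back
            go_left = answer
        node = node.left if go_left else node.right
    return node.value


def predict(joint_tree: Node, a_only_tree: Node, a_row: Dict[str, float],
            customer_id: str, party_b: PartyB) -> float:
    """Use the joint model when B has data, otherwise A's own model."""
    score = walk(joint_tree, a_row, customer_id, party_b)
    if score is None:
        score = walk(a_only_tree, a_row, customer_id, None)
    return score


joint_tree = Split(owner="A", feature="monthly_spend", threshold=100.0,
                   left=Leaf(0.1),
                   right=Split(owner="B", node_id=7,
                               left=Leaf(0.3), right=Leaf(0.8)))
a_only_tree = Split(owner="A", feature="monthly_spend", threshold=100.0,
                    left=Leaf(0.1), right=Leaf(0.5))
party_b = PartyB(rows={"c1": {"hours_watched": 12.0}},
                 splits={7: ("hours_watched", 5.0)})

print(predict(joint_tree, a_only_tree, {"monthly_spend": 150.0}, "c1", party_b))
print(predict(joint_tree, a_only_tree, {"monthly_spend": 150.0}, "c2", party_b))
```

An ensemble would repeat this walk for every tree, which is one reason a single prediction can involve many requests to Enterprise B.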

So why blockchain?

In the prediction task, a problem can arise from the fact that Enterprise A and Enterprise B may be potential competitors. There may be nothing wrong with the modeling data provided by Enterprise B, yet Enterprise B could supply fake data in the subsequent prediction service. That would degrade the quality of the predictions and thus cause losses to Enterprise A. Even if Enterprise B is honest, the predictive power of the model may decline severely over time. At that point the cooperation should be terminated and the model rebuilt. Dishonest behavior by any member of the alliance should be punished. Therefore, the intermediate data of every prediction (calculation results, not raw customer data) should be uploaded to retain evidence for accountability. Monitoring the state of the model is very important, and for this reason we have added monitoring of model stability and forecast costs to the enterprise-level public blockchain system.

This is in line with GDPR-compliant data sharing: no raw customer data is transmitted, and no raw customer data is stored on the blockchain (which would violate the right to be forgotten). Only references in the form of higher-level calculation results are stored. These calculation results cannot be deciphered, and are stored only to ensure accountability and deter bad actors. Using an immutable ledger (blockchain) in this way guarantees data authenticity and preserves privacy by design, in a way that centralized databases cannot.

On a side note, since the federated learning based data module runs on our public blockchain, enterprises that use the module to improve their AI algorithms naturally need mainnet CPX to pay fees.
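The idea of keeping intermediate results as tamper-evident evidence can be illustrated with a small hash-chained log. This is a sketch under stated assumptions, not the APEX Network protocol: an in-memory list stands in for the ledger, and the `AuditLog` class, its field names, and the record layout are hypothetical. Only a SHA-256 digest of each intermediate result is recorded, and each entry commits to the one before it, so a later change to any entry is detectable.

```python
import hashlib
import json
from typing import Any, Dict, List

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def digest(payload: Dict[str, Any]) -> str:
    """SHA-256 over a canonical JSON encoding of the payload."""
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class AuditLog:
    """Append-only, hash-chained log standing in for an on-chain record."""

    def __init__(self) -> None:
        self.entries: List[Dict[str, str]] = []

    def append(self, party: str, request_id: str, result: Any) -> Dict[str, str]:
        """Record the digest of an intermediate result, never the result itself."""
        prev_hash = self.entries[-1]["entry_hash"] if self.entries else GENESIS
        body = {
            "party": party,
            "request_id": request_id,
            "result_hash": digest({"result": result}),
            "prev_hash": prev_hash,
        }
        entry = dict(body, entry_hash=digest(body))
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and check that each entry links to the previous one."""
        prev_hash = GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if entry["prev_hash"] != prev_hash or digest(body) != entry["entry_hash"]:
                return False
            prev_hash = entry["entry_hash"]
        return True


log = AuditLog()
log.append("B", "req-001", {"node_id": 7, "go_left": False})
log.append("B", "req-002", {"node_id": 7, "go_left": True})
print(log.verify())  # chain is intact

log.entries[0]["result_hash"] = "tampered"
print(log.verify())  # tampering is detected
```

One caveat: a plain hash of a low-entropy value, such as a single left-or-right answer, can be guessed by trying all possibilities. A real design would need a salted or otherwise hiding commitment, which this sketch leaves out.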

Public or private blockchain?

The data module will be a built-in protocol on the public APEX Network blockchain. In our view it does not make sense to move this process to private blockchains, as companies have no incentive to do so. The purpose of the federated learning process is to be open and transparent, which runs counter to the idea of private blockchains.

In addition, consider the following: even if an enterprise builds its own private chain, finding cooperative companies is not enough. The data owned by those companies must also be relevant, and the companies must be able to improve each other’s models. This search process would carry prohibitive costs. The more members the APEX Federated Learning Alliance has, the greater the value of the platform.

To relate this to existing market structures: if you chose to sell things online, would you build your own website or open a store directly on Amazon? In China, platforms such as Taobao and Alipay are called by some the infrastructure of the Internet world.

The approximate analysis interface is as follows

Monitoring of model status