Growing public awareness and concerns over data privacy are pushing tech companies to explore new ways to advance machine learning models without centralizing people’s personal data. One of the world’s largest financial companies, Alibaba subsidiary Ant Financial, recently introduced Shared Machine Learning (SML) as their solution for data privacy.

Ant Financial has spent two years on the research and development of SML as a learning paradigm that can aggregate and process multiparty information while protecting the privacy of individual participants.

Current data protection technologies are based on either Trusted Execution Environment (TEE) or Multiparty Computation (MPC) systems. A TEE is a secure area of a processor that protects the confidentiality and integrity of the code and data loaded into it. Examples include SGX from Intel, SEV from AMD, and TrustZone from ARM.

MPC, on the other hand, is an emerging subfield of cryptography in which several parties jointly compute a function over their data while keeping each party’s inputs private. A recent example is federated learning, a distributed method Google introduced in 2017 to train models on a large corpus of decentralized data.
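To make the idea of jointly computing a function over private inputs concrete, here is a minimal sketch of additive secret sharing, one common MPC building block. It is an illustration only, not Ant Financial's protocol; the modulus, the function names, and the sample values are all assumptions chosen for the example.

```python
import secrets

# Public modulus (an assumption for this sketch); all arithmetic is modulo PRIME.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split value into n additive shares that sum to value modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def secure_sum(private_inputs):
    """Compute the sum of all inputs without any party revealing its own."""
    n = len(private_inputs)
    # Party i splits its input and sends share j to party j.
    distributed = [share(x, n) for x in private_inputs]
    # Party j adds up the shares it received; each partial sum looks random.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # Publishing the partial sums reveals only the total.
    return sum(partial_sums) % PRIME


# Hypothetical private values held by three institutions.
balances = [120, 45, 300]
total = secure_sum(balances)
```

Each individual share is a uniformly random number, so a party learns nothing about another party's input from the shares it receives; only the final total is disclosed.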

Ant Financial’s SML solution combines TEE and MPC for applications in banking, insurance, and commerce.

TEE-based SML

SML uses Intel’s SGX technology in its foundation layer and is compatible with other TEE implementations. The SGX-based SML method supports both online prediction and offline training.

Online prediction models have a higher requirement for stability in load balancing, failover, and dynamic capacity expansion. One of the key technologies for improving stability is clustering, but conventional clustering solutions are not applicable on SGX. This prompted Ant Financial to design a new distributed online service framework as shown below.

Unlike conventional clustering methods, in the SML framework each service registers with the ClusterManager (CM) and maintains a heartbeat connection with it.

This framework is able to:

Solve the problems of load balancing, failover, dynamic expansion and shrinkage, and disaster recovery with the clustering solution.

Solve problems such as code upgrades, grayscale (canary) releases, and release rollbacks with multi-cluster management and an SDK heartbeat mechanism.

Lower the cost of integration for users through ServiceProvider’s built-in technology working together with the SDK.

Provide an easy-to-use development framework so that users do not have to deal with distributed logic themselves.

Provide a provision agent mechanism to ensure that SGX does not need to connect to the external network, which improves system security.
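The registration and heartbeat pattern described above can be sketched as follows. This is a simplified illustration, not Ant Financial's implementation: the class, its methods, the timeout value, and the instance names are hypothetical, and time is passed in explicitly so the behavior is easy to follow.

```python
import itertools


class ClusterManager:
    """Hypothetical registry that tracks service instances by last heartbeat."""

    def __init__(self, timeout_s=3.0):
        self.timeout_s = timeout_s
        self.last_seen = {}  # instance id -> time of the last heartbeat
        self._counter = itertools.count()

    def register(self, instance_id, now):
        """Add a new instance (dynamic expansion)."""
        self.last_seen[instance_id] = now

    def heartbeat(self, instance_id, now):
        """Refresh the liveness timestamp of a registered instance."""
        if instance_id not in self.last_seen:
            raise KeyError(f"unknown instance: {instance_id}")
        self.last_seen[instance_id] = now

    def alive(self, now):
        """Return the instances whose heartbeat is recent enough."""
        return sorted(
            i for i, t in self.last_seen.items() if now - t <= self.timeout_s
        )

    def pick(self, now):
        """Round-robin over live instances; dead ones are skipped (failover)."""
        live = self.alive(now)
        if not live:
            raise RuntimeError("no live instances")
        return live[next(self._counter) % len(live)]


cm = ClusterManager(timeout_s=3.0)
cm.register("enclave-a", now=0.0)
cm.register("enclave-b", now=0.0)
cm.heartbeat("enclave-a", now=2.5)   # enclave-b stops sending heartbeats
live_at_4s = cm.alive(now=4.0)       # only enclave-a is still considered live
```

Because routing decisions are made only from the heartbeat table, an instance that stops reporting simply drops out of rotation, and a newly registered instance joins it without any change on the caller's side.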

The framework supports a variety of commonly used prediction algorithms, including LR, GBDT, and XGBoost, and enables prediction on encrypted data from multiple parties.

In offline training, the SGX-based SML framework runs XGBoost using the Occlum library OS (LibOS) and a home-grown distributed networking system to support data fusion and distributed training. Ant Financial is currently also using this solution to port TensorFlow.

TEE-based shared machine learning on multiparty data works as follows:

Institutional users download encryption tools from Data Lab;

Data is encrypted using an encryption tool that embeds the remote attestation (RA) process to ensure that encrypted information can only be decrypted in the specified Enclave;

Users upload encrypted data to cloud storage;

Users build training tasks on Data Lab’s training platform;

The training platform delivers training tasks to the training engine;

The training engine starts the training-related Enclave and reads the encrypted data from the cloud storage to complete the specified training tasks.
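The central guarantee in the steps above is that data can only be decrypted inside the specified Enclave. The following sketch illustrates that idea with a key that is released only when the requesting enclave reports the expected measurement. It is a heavily simplified model under stated assumptions: real SGX remote attestation relies on hardware-signed quotes, the KeyBroker class and all names are hypothetical, and the XOR keystream cipher is a toy stand-in for real authenticated encryption.

```python
import hashlib
import hmac
import secrets


def measure(enclave_code: bytes) -> str:
    """Stand-in for an enclave measurement: a hash of the code to be loaded."""
    return hashlib.sha256(enclave_code).hexdigest()


def keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """TOY cipher for illustration only (XOR with a SHA-256 keystream)."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ p for b, p in zip(chunk, pad))
    return bytes(out)


class KeyBroker:
    """Hypothetical service that releases the data key only to one enclave."""

    def __init__(self, expected_measurement: str):
        self.expected = expected_measurement
        self.data_key = secrets.token_bytes(32)

    def release_key(self, reported_measurement: str) -> bytes:
        # In real remote attestation the measurement arrives in a signed quote.
        if not hmac.compare_digest(self.expected, reported_measurement):
            raise PermissionError("attestation failed: unexpected enclave")
        return self.data_key


# Data owner side: encrypt for one specific training enclave.
trusted_code = b"training enclave v1"
broker = KeyBroker(measure(trusted_code))
nonce = secrets.token_bytes(16)
ciphertext = keystream_xor(broker.data_key, nonce, b"account=42,balance=100")

# Enclave side: the key is released only after the measurement check passes.
key = broker.release_key(measure(trusted_code))
plaintext = keystream_xor(key, nonce, ciphertext)
```

Any other program, including a modified build of the training code, reports a different measurement and is refused the key, so the uploaded ciphertext stays unreadable outside the intended Enclave.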

MPC-based SML

Ant Financial’s MPC-based SML framework has three layers:

Security technology layer: Provides implementations of basic security technologies such as secret sharing, homomorphic encryption, and garbled circuits, along with additional security features such as differential privacy and the Diffie-Hellman (DH) algorithm;

Basic operator layer: Building on the security technology layer, Ant Financial encapsulates basic operators, including secure multiparty data intersection, matrix addition, matrix multiplication, and the evaluation of functions such as sigmoid and ReLU in multiparty scenarios. The same operator may have multiple implementations to suit different scenarios while keeping a consistent interface;

Secure machine learning algorithm layer: With the basic operators in place, developing secure machine learning algorithms becomes straightforward. The technical difficulty here is reusing existing algorithms and frameworks as much as possible.
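The three layers can be illustrated with a small two-party sketch: secret sharing as the security primitive, addition and multiplication as basic operators, and a dot product (the linear score of a model such as LR, before the sigmoid) as the algorithm. This is a toy model, not Ant Financial's code. It assumes a trusted dealer for the multiplication triples, works on non-negative integers only, and all function names are hypothetical.

```python
import secrets

P = 2**61 - 1  # public modulus, an assumption for this sketch


# --- Security technology layer: two-party additive secret sharing ---
def share(x):
    r = secrets.randbelow(P)
    return (r, (x - r) % P)  # (party 0's share, party 1's share)


def reveal(s):
    return (s[0] + s[1]) % P


# --- Basic operator layer: operators that act on shares ---
def add(x, y):
    # Addition is local: each party adds its own two shares.
    return ((x[0] + y[0]) % P, (x[1] + y[1]) % P)


def beaver_triple():
    """Trusted-dealer stand-in: shares of random a, b and of c = a * b."""
    a, b = secrets.randbelow(P), secrets.randbelow(P)
    return share(a), share(b), share(a * b % P)


def mul(x, y):
    a, b, c = beaver_triple()
    # The parties open only the masked values d = x - a and e = y - b.
    d = reveal(((x[0] - a[0]) % P, (x[1] - a[1]) % P))
    e = reveal(((y[0] - b[0]) % P, (y[1] - b[1]) % P))
    # x * y = d*e + d*b + e*a + c; party 0 alone adds the public d*e term.
    z0 = (c[0] + d * b[0] + e * a[0] + d * e) % P
    z1 = (c[1] + d * b[1] + e * a[1]) % P
    return (z0, z1)


# --- Secure machine learning algorithm layer: built only from operators ---
def secure_dot(weights, features):
    acc = share(0)
    for w, x in zip(weights, features):
        acc = add(acc, mul(w, x))
    return acc


# One party holds the weights, the other the features (hypothetical values).
weights = [share(w) for w in [2, 3, 5]]
features = [share(x) for x in [7, 1, 4]]
score = reveal(secure_dot(weights, features))
```

Note that secure_dot never touches the sharing scheme directly. That separation is what allows an operator to be swapped for a different implementation, for example one based on homomorphic encryption, without changing the algorithm code.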

The MPC-based SML framework supports popular algorithms including LR, GBDT, GNN, etc. Below is the training process:

Users download training services from Data Lab and deploy locally;

Users build training tasks on Data Lab’s training platform;

The training platform sends the training task to the training engine;

The training engine sends the task to the training server on the organization side;

Workers load local data;

Workers interact through multiparty security protocols to complete the training tasks delivered to them.

The specific architecture of the training engine is shown below:

Federated learning vs. shared machine learning

Ant Financial also identified a couple of major differences between federated learning and shared machine learning:

Federated learning only addresses the case in which data does not leave its owner’s domain, which limits the techniques that can be used (only MPC algorithms meet this requirement), while SML can also be built on TEE.

Federated learning requires the participating parties to have similar “identity and status,” while in SML different participants can take on different roles.

More information is available (in Mandarin) in this Ant Financial tech post.