Hello everyone,

In this 15th Dev Letter, we are happy to share a new article that goes deeper into the technology behind the iExec decentralized cloud. This letter focuses on Graphics Processing Unit (GPU) management.

Introduction

In a previous article, we introduced the iExec paradigm of “provider sharing” and focused on virtual machine (VM) and Docker management as first use cases. Within iExec, this paradigm is not limited to applications: providers can also share libraries.

In this article, we first introduce GPUs and TensorFlow. We then show how provider sharing lets us take GPUs into account in our scheduling, so that applications with such a requirement are executed by GPU-enabled computing resources only. Finally, we conclude with a use case detailing how a user can register a GPU application and submit jobs for it.

General Purpose GPU

The use of a Graphics Processing Unit (GPU) together with a CPU accelerates engineering and artificial intelligence applications.

Introduced in the last decade, GPU technology is now widely used on all platforms, from supercomputers to personal computers, for general-purpose computing (GPGPU) [2].

This performance gain comes with good “green efficiency” properties: GPUs show a better performance/cost ratio than CPUs.

The new programming concepts and tools are now well supported by software vendors and by many scientific libraries.

Indeed, software packages and libraries in various application areas, including deep learning, physics simulation and computer vision, can now benefit from this improved speed-up/cost ratio.

TensorFlow for Deep Learning

TensorFlow is an open-source machine learning framework that was originally developed by researchers and engineers within Google’s Machine Intelligence research organisation.

The system is designed to facilitate and democratize research in machine learning and put the power of deep learning into the hands of many developers.

In only a few years, TensorFlow has become one of the leading open-source frameworks for implementing machine learning. Airbnb, Uber, Intel, Dropbox and Snapchat are among the companies using TensorFlow.

At iExec, we are exploring machine learning and deep learning applications that have intense computational requirements and use libraries such as TensorFlow.

The CUDA version of TensorFlow is a promising option to significantly reduce simulation elapsed time.

Image Recognition on NVIDIA CUDA GPUs

To illustrate this, we are building a dapp that solves image recognition problems with TensorFlow. It is based on the advanced tutorial from the TensorFlow documentation [1].

Our dapp is designed to be run on iExec, backed by NVIDIA CUDA GPUs to speed up the simulation. The goal of this tutorial is to build a Convolutional Neural Network (CNN) for recognizing images.

The model used in this CIFAR-10 tutorial is a multi-layer architecture, consisting of alternating convolutions and nonlinearities.
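To make "alternating convolutions and nonlinearities" concrete, here is a minimal sketch in plain JavaScript, running on the CPU. It is not the TensorFlow model from the tutorial: the function names, the tiny 4x4 input and the two kernels are assumptions chosen only for illustration. It applies a convolution, then a ReLU nonlinearity, then repeats the pair once.

```javascript
// Illustrative sketch only: plain JavaScript, not TensorFlow and not GPU code.

// "Valid" (no padding), stride-1 2D convolution of a single-channel image.
// As in most deep learning frameworks, this is technically a cross-correlation.
function conv2d(image, kernel) {
  const kh = kernel.length;
  const kw = kernel[0].length;
  const outH = image.length - kh + 1;
  const outW = image[0].length - kw + 1;
  const out = [];
  for (let i = 0; i < outH; i++) {
    const row = [];
    for (let j = 0; j < outW; j++) {
      let sum = 0;
      for (let di = 0; di < kh; di++) {
        for (let dj = 0; dj < kw; dj++) {
          sum += image[i + di][j + dj] * kernel[di][dj];
        }
      }
      row.push(sum);
    }
    out.push(row);
  }
  return out;
}

// ReLU nonlinearity: negative activations are replaced by zero.
function relu(featureMap) {
  return featureMap.map((row) => row.map((v) => Math.max(0, v)));
}

// Hypothetical input and kernels, chosen to keep the arithmetic easy to follow.
const image = [
  [1, 2, 3, 0],
  [0, 1, 2, 3],
  [3, 0, 1, 2],
  [2, 3, 0, 1],
];
const firstKernel = [
  [1, -1],
  [-1, 1],
];
const secondKernel = [
  [1, 1],
  [1, 1],
];

// Two stacked "convolution + nonlinearity" layers.
const layer1 = relu(conv2d(image, firstKernel));
const layer2 = relu(conv2d(layer1, secondKernel));
console.log(layer1, layer2);
```

In a real CNN the kernels are learned during training and each layer holds many of them. The nested loops above are the kind of regular, independent arithmetic that GPUs execute in parallel.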

Training the neural network to produce an accurate model is computationally intensive, hence the need to train it on multiple GPUs simultaneously.
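One common way to train on several GPUs at once is data parallelism: each device computes gradients on its own slice of the batch, the gradients are averaged, and a single update is applied to the shared parameters. The sketch below simulates that idea on the CPU in plain JavaScript, with a one-parameter linear model standing in for the network. All names, the data and the learning rate are assumptions for illustration; it is not the tutorial's code.

```javascript
// Illustrative simulation of data-parallel training; no real GPU is involved.

// Gradient of the mean squared error for the model y ≈ w * x over one shard.
function shardGradient(w, shard) {
  let grad = 0;
  for (const { x, y } of shard) {
    grad += 2 * (w * x - y) * x;
  }
  return grad / shard.length;
}

// Split the batch into equal shards, one per simulated device.
// Assumes batch.length is divisible by numDevices.
function splitBatch(batch, numDevices) {
  const size = batch.length / numDevices;
  const shards = [];
  for (let d = 0; d < numDevices; d++) {
    shards.push(batch.slice(d * size, (d + 1) * size));
  }
  return shards;
}

// One training step: per-device gradients, averaged, then a single update.
function trainStep(w, batch, numDevices, learningRate) {
  const grads = splitBatch(batch, numDevices).map((shard) => shardGradient(w, shard));
  const averaged = grads.reduce((a, b) => a + b, 0) / grads.length;
  return w - learningRate * averaged;
}

// Hypothetical data generated from y = 3 * x.
const batch = [
  { x: 1, y: 3 },
  { x: 2, y: 6 },
  { x: 3, y: 9 },
  { x: 4, y: 12 },
];

let w = 0;
for (let step = 0; step < 200; step++) {
  w = trainStep(w, batch, 2, 0.01);
}
console.log(w.toFixed(3)); // w converges towards 3
```

With equal-sized shards, the averaged gradient equals the gradient of the full batch, so splitting the work across devices changes how long a step takes but not the result of the update.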

The next step will be to run this use case with the iExec SDK on GPU workers.

Toward running an application on GPU workers

The iExec SDK is designed to offer a single workflow that makes the implemented services and protocols as easy to understand as possible. The SDK lets you specify requirements for any dapp.

For example, a Docker-based dapp to be deployed on iExec must have the Docker application type as shown below (please refer to this article):

app: { type: 'DOCKER', envvars: 'XWDOCKERIMAGE=ikester/blender', }

Writing a GPU-enabled dapp consists of defining another Docker image and simply introducing a new application parameter:

app: { type: 'DOCKER', envvars: 'XWDOCKERIMAGE=nvidia/cuda', neededpackages: 'CUDA', }
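To illustrate how such a parameter can restrict execution to GPU-enabled resources, here is a minimal sketch of the matching step. It is an assumption-laden illustration, not the iExec scheduler: the function names, the `sharedpackages` worker field and the comma-separated list format are all hypothetical. Only the `app` fields come from the snippets above.

```javascript
// Hypothetical sketch: these helpers are not part of the iExec SDK or scheduler.

// Parse a package list. A comma-separated string is an assumed format.
function parsePackages(value) {
  if (!value) return [];
  return value
    .split(',')
    .map((name) => name.trim().toUpperCase())
    .filter((name) => name.length > 0);
}

// A worker is eligible when it shares every package the app needs.
function isEligible(app, worker) {
  const shared = new Set(parsePackages(worker.sharedpackages));
  return parsePackages(app.neededpackages).every((name) => shared.has(name));
}

// Keep only the workers that can run the given app.
function eligibleWorkers(app, workers) {
  return workers.filter((worker) => isEligible(app, worker));
}

// The GPU-enabled app description from the text.
const gpuApp = {
  type: 'DOCKER',
  envvars: 'XWDOCKERIMAGE=nvidia/cuda',
  neededpackages: 'CUDA',
};

// Hypothetical workers: one CPU-only, one declaring that it shares CUDA.
const workers = [
  { name: 'cpu-worker', sharedpackages: '' },
  { name: 'gpu-worker', sharedpackages: 'CUDA' },
];

// Only the worker sharing CUDA qualifies for the GPU app.
console.log(eligibleWorkers(gpuApp, workers).map((w) => w.name));
```

An app without `neededpackages`, such as the Blender example, has no requirement to satisfy, so in this sketch it remains eligible for every worker.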

That’s it for this Dev Letter. Credits to Eric Rodriguez, our GPU and HPC expert as well as Oleg Lodygensky, our CTO. Stay tuned for the next one as we still have exciting news to share with you! 🚀

Limits

We’ve started working on GPU support with early tests. While GPU computing is the main topic of Version 4 (HPC), we hope to have some specific frameworks and applications running on selected GPU providers in the near future. This means GPUs will not be fully supported immediately, as this is challenging work due to the variety of hardware, libraries and VM support.

Sources

[1] Convolutional Neural Networks from the TensorFlow documentation

[2] General-purpose computing on graphics processing units on Wikipedia
