SPOC is a set of tools for GPGPU programming with OCaml.

The SPOC library enables the detection and use of GPGPU devices with OCaml using Cuda and OpenCL. There is also a camlp4 syntax extension to handle external Cuda or OpenCL kernels, as well as a DSL (called Sarek) to express GPGPU kernels from the OCaml code.

This work was part of my PhD thesis (UPMC-LIP6 laboratory, Paris, France) and was partially funded by the OpenGPU project. I continued this project in 2014-2015 in the Verimag laboratory (Grenoble, France) and then from 2015 to 2018 in the LIFO laboratory in Orléans, France. I currently work at Nomadic Labs.

SPOC has been tested on multiple architectures and systems, mostly 64-bit Linux and 64-bit OS X. It should work on Windows too.

To be able to use SPOC, you’ll need a computer capable of running OCaml (obviously) but also compatible with either OpenCL or Cuda. For Cuda you only need a current proprietary NVidia driver while for OpenCL you need to install the correct OpenCL implementation for your system. SPOC should compile anyway as everything is dynamically linked, but you’ll need Cuda/OpenCL eventually to run your programs.

SPOC comes with some examples and I strongly advise anyone interested to look into the slides and papers. For basic tutorials, you should look into our live demos.

Current work

Spoc and Sarek are still in development. Here is a list of features we plan to add (bold ones are currently in development):

Add a performance model to Sarek

Add custom types to Sarek (using Ctypes/Js_of_ocaml) -> example with Cards

Allow recursive functions in Sarek

Enable List handling with Spoc and Sarek

Add interoperability with OpenGL

Docker image (Probably deprecated)

Demos in your browser (experimental)

Using WebCL and js_of_ocaml :

Sadly, WebCL development seems to have stopped; [WebGPU](https://github.com/gpuweb/gpuweb/wiki/Implementation-Status) might be an alternative…

This has been tested with Firefox 26-34 and Nokia’s plugin under Windows 8.1 (32-bit or 64-bit), multiple Linux distributions (32-bit and 64-bit), and Mac OS X Mavericks and Yosemite.

Of course, you’ll also need to have OpenCL on your system.

How to test?

You should install/have :

Firefox 28-34 on your system,

Nokia’s plugin

an OpenCL implementation for your system (AMD’s one should work for most multicore x86 CPUs)

I - How to Build SPOC

1 - Dependencies

Requires :

* ocaml >= 4.01.0 (mainly tested with ocaml 4.02.1)
* camlp4
* ocamlfind
* camlp4-extra (for ubuntu)
* m4

For cuda compilation :

* nvcc
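Before building, it can help to confirm that the tools listed above are actually on your PATH. The snippet below is a minimal sketch, not part of SPOC: `missing_tools` is a hypothetical helper name, and the script assumes each dependency is exposed as a command of the same name. It does not check the OCaml version, and camlp4-extra is an Ubuntu package rather than a command, so it is not checked here.

```shell
#!/bin/sh
# Hypothetical helper (not part of SPOC): print each named tool that is
# not found on the PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        # `command -v` is the POSIX way to test whether a command exists.
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Build tools taken from the dependency list; the command names are
# assumed to match the package names.
required="ocaml ocamlfind camlp4 m4"
missing=$(missing_tools $required)

if [ -n "$missing" ]; then
    echo "Missing build tools:" $missing
else
    echo "All required build tools were found."
fi

# nvcc is reported separately because it is only needed for cuda compilation.
if [ -n "$(missing_tools nvcc)" ]; then
    echo "nvcc not found: it is only needed for cuda compilation."
fi
```

The script only reports what it finds and never aborts, so it is safe to run on a machine without Cuda installed.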

2 - Compilation & Installation

From Opam (Deprecated for now, new packages may be developed in the future)

SPOC and Sarek should be in the opam repository.

For development releases (more up to date but possibly unstable), simply add our repository :

opam repository add spoc_repo https://github.com/mathiasbourgoin/opam_repo_dev.git

then

opam update spoc_repo

opam install spoc

From sources

To compile SPOC:

cd Spoc

make

make install

3 - Build Documentation

From the sources :

make doc

This builds the ocamldoc HTML pages in the Spoc/docs directory.

II - Testing SPOC

The “Samples” directory contains a few programs using SPOC.

To compile those programs:

cd Samples

make

Binaries will be located in the Samples/build folder.

III - SPOCLIBS

The “SpocLibs” directory contains a few libraries based on Spoc.

* Compose allows basic composition over GPGPU kernels
* Cublas gives access to some functions of the Cublas library (Cublas needs the Cuda SDK to compile)
* Sarek is an experimental embedded DSL for OCaml to express kernels from the OCaml program

The Samples directory contains a few samples using those libraries.

Publications

Talks

Book Chapter

M. Bourgoin, E. Chailloux, J.-L. Lamotte : “Experiments with Spoc.”, In Patterns for parallel programming on GPUs, F. Magoules (Ed.), Saxe-Coburg Publications, 2015

Peer Reviewed Papers

General Presentations

M. Bourgoin, E. Chailloux, J.-L. Lamotte : “Efficient Abstractions for GPGPU Programming”, International Journal of Parallel Programming, 2013.

M. Bourgoin, E. Chailloux, J.-L. Lamotte : “High-Performance GPGPU Programming with OCaml”, OCaml 2013, 2013.

M. Bourgoin, E. Chailloux, J.-L. Lamotte : “Efficient Abstractions for GPGPU Programming”, International Symposium on High-level Parallel Programming and Applications (HLPP), 2013.

M. Bourgoin, E. Chailloux, J.-L. Lamotte : “SPOC: GPGPU Programming Through Stream Processing with OCAML”, Parallel Processing Letters, vol. 22 (2), pp. 1-12 (2012)

M. Bourgoin, E. Chailloux, J.-L. Lamotte : “SPOC : GPGPU programming through Stream Processing with OCaml”, HLPGPU2012 workshop, pp. 1-8 (2012)

On Composition and Skeletons

M. Bourgoin, E. Chailloux : “GPGPU Composition with OCaml”, Array 2014, 2014.

M. Bourgoin, E. Chailloux, J.-L. Lamotte : “Experiments with Spoc.”, Workshop OpenGPU, HIPEAC 2012, Paris, France (2012)

On Web Programming with SPOC and Sarek

M. Bourgoin, E. Chailloux : “High-Level Accelerated Array Programming in the Web Browser”, Array 2015, 2015.

M. Bourgoin, E. Chailloux : “High Performance Client-Side Web Programming with SPOC and Js_of_ocaml”, OCaml 2014, 2014.

On Applications

M. Bourgoin, E. Chailloux, J.-L. Lamotte : “Retour d’experience : portage d’une application haute-performance vers un langage de haut niveau”, ComPas/RENPAR 2013, pp. 1-8 (2013)

LIFO - Bâtiment IIIA

Rue Léonard de Vinci

B.P. 6759

F-45067 ORLEANS Cedex 2

France

Mathias.Bourgoin (at) univ-orleans.fr