This was originally given as a PyHEP 2018 talk. It is designed to be interactive and can be run in SWAN if you have a CERN account. If you want to run it manually, just download the repository: github.com/henryiii/pybindings_cc. It is easy to run in Anaconda.

Focus

What Python bindings do

How Python bindings work

What tools are available

Caveats

Will cover C++ and C binding only

Will not cover every tool available

Will not cover cppyy in detail (but see Enric’s talk)

Python 2 is dying, long live Python 3! (But this talk is also Python 2 compatible.)



Overview:

Part one

| Tool(s) | Features | Opinion of Author |
|---|---|---|
| ctypes, CFFI | Pure Python, C only | Great for simple cases |
| CPython | How all bindings work | Too complex for most cases |
| SWIG | Multi-language, automatic | Too automatic for most cases |
| Cython | New language | Can be very verbose |
| Pybind11 | Pure C++11 | Often a great fit |
| CPPYY | From ROOT’s JIT engine | Handles templates! |

Part two

An advanced binding in Pybind11

If you have the original talk from the repository, it is an interactive notebook, and no code will be hidden. Here are the required packages:

    !pip install --user cffi pybind11 numba
    # Other requirements: cython cppyy (SWIG is also needed but not a python module)
    # Using Anaconda recommended for users not using SWAN

If you are not on SWAN, you will want cython and cppyy as well. SWIG is also needed, but it is not a Python module, so be sure to install it separately.

Here are the standard imports. We will also add two variables to help with compiling:

    from __future__ import print_function
    import os
    import sys
    from pybind11 import get_include

    inc = '-I ' + get_include(user=True) + ' -I ' + get_include(user=False)
    plat = '-undefined dynamic_lookup' if 'darwin' in sys.platform else '-fPIC'

What is meant by bindings?

Bindings allow a function (or other functionality) in a library to be accessed from Python.

We will start with this example:

    %%writefile simple.c
    float square(float x) {
        return x * x;
    }

Desired usage in Python:

    y = square(x)

C bindings are very easy. Just compile the code into a shared library, then open it in Python with the built-in ctypes module:

    !cc simple.c -shared -o simple.so

    from ctypes import cdll, c_float

    lib = cdll.LoadLibrary('./simple.so')
    lib.square.argtypes = (c_float,)
    lib.square.restype = c_float
    lib.square(2.0)

4.0

This may be all you need! Example: AmpGen Python interface. In fact, in Pythonista for iOS, we can even use ctypes to access Apple’s public APIs!
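The `argtypes` and `restype` lines in the ctypes example are what tell ctypes how to convert values; without a `restype`, ctypes assumes the function returns a C `int`. The same pattern should work on any shared library, not just one you compiled yourself. Here is a minimal sketch that calls `pow` from the C math library. It assumes a Unix-like system where that library can be located; the fallback to the symbols of the running process is also an assumption and will not work everywhere.

```python
import ctypes
import ctypes.util

# Assumption: a Unix-like system. find_library may return None, in which case
# we fall back to the symbols already loaded into the running process.
libm_path = ctypes.util.find_library('m')
libm = ctypes.CDLL(libm_path) if libm_path else ctypes.CDLL(None)

# Declare the C signature: double pow(double, double)
libm.pow.argtypes = (ctypes.c_double, ctypes.c_double)
libm.pow.restype = ctypes.c_double

result = libm.pow(2.0, 10.0)
print(result)  # 1024.0
```

Leaving out the `restype` line would make ctypes interpret the returned double as an integer, which is why the declarations are worth the two extra lines.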

CFFI: The C Foreign Function Interface for Python

Still C only

Developed for PyPy, but available in CPython too

We start with the same example as before:

    from cffi import FFI

    ffi = FFI()
    ffi.cdef("float square(float);")
    C = ffi.dlopen('./simple.so')
    C.square(2.0)

4.0

Let’s see how bindings work before going into C++ binding tools

This is how CPython itself is implemented

C reminder: static means visible in this file only

    %%writefile pysimple.c
    #include <Python.h>

    float square(float x) {
        return x * x;
    }

    static PyObject* square_wrapper(PyObject* self, PyObject* args) {
        float input, result;
        if (!PyArg_ParseTuple(args, "f", &input)) {
            return NULL;
        }
        result = square(input);
        return PyFloat_FromDouble(result);
    }

    static PyMethodDef pysimple_methods[] = {
        {"square", square_wrapper, METH_VARARGS, "Square function"},
        {NULL, NULL, 0, NULL}
    };

    #if PY_MAJOR_VERSION >= 3
    static struct PyModuleDef pysimple_module = {
        PyModuleDef_HEAD_INIT, "pysimple", NULL, -1, pysimple_methods
    };

    PyMODINIT_FUNC PyInit_pysimple(void) {
        return PyModule_Create(&pysimple_module);
    }
    #else
    DL_EXPORT(void) initpysimple(void) {
        Py_InitModule("pysimple", pysimple_methods);
    }
    #endif

Build:

    !cc {inc} -shared -o pysimple.so pysimple.c {plat}
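In the notebook, IPython substitutes `{inc}` and `{plat}` into any line that starts with `!` before handing it to the shell. As a rough sketch of what the expanded command looks like, the snippet below builds a similar string outside a notebook. It is only an illustration: it uses the standard library's `sysconfig` to find the Python headers instead of `pybind11.get_include`, and the names `python_inc` and `command` are made up for this example.

```python
import sys
import sysconfig

# Assumption: the directory holding Python.h is taken from the standard
# library here, rather than from pybind11.get_include() as in the talk.
python_inc = '-I ' + sysconfig.get_paths()['include']

# Same platform switch as in the imports cell: macOS resolves the Python
# symbols at load time, other platforms want position-independent code.
plat = '-undefined dynamic_lookup' if 'darwin' in sys.platform else '-fPIC'

# Roughly what the build line looks like after substitution.
# Nothing is compiled here; the string is only printed.
command = 'cc {inc} -shared -o pysimple.so pysimple.c {plat}'.format(
    inc=python_inc, plat=plat)
print(command)
```

Printing the command first is a cheap way to debug a failing build, since a missing include path is the usual culprit.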

Run:

    import pysimple
    pysimple.square(2.0)

4.0
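The wrapper in `pysimple.c` does three things: it unpacks and converts the Python arguments, calls the real function, and converts the result back into a Python object. Those steps can be modelled in pure Python. This is only a sketch to show the flow, not how CPython runs the wrapper; the helper `to_c_float` is hypothetical and uses `struct` to mimic the 32-bit `float` that the C function works with.

```python
import struct


def to_c_float(value):
    """Round a Python float (a C double) to the nearest 32-bit C float."""
    return struct.unpack('f', struct.pack('f', value))[0]


def square(x):
    """Stand-in for the C function: float in, float out."""
    return to_c_float(x * x)


def square_wrapper(args):
    """Pure-Python model of the C wrapper; `args` is the argument tuple."""
    # Step 1, like PyArg_ParseTuple(args, "f", &input): unpack and convert.
    if len(args) != 1:
        raise TypeError('square() takes exactly 1 argument')
    value = to_c_float(float(args[0]))
    # Step 2: call the wrapped function.
    result = square(value)
    # Step 3, like PyFloat_FromDouble(result): widen back to a Python float.
    return float(result)


print(square_wrapper((2.0,)))               # 4.0
print(square_wrapper((0.1,)) == 0.1 * 0.1)  # False: single precision was used
```

The second line is a reminder that a binding inherits the types of the wrapped function: because `square` takes a `float`, the answer only has single precision even though Python floats are doubles.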

C++: Why do we need more?

Sometimes simple is enough! And, if we are in C++, we can use extern "C" to export a C interface. But a C++ API can have overloading, classes, memory management, etc. We could manually translate everything using the C API, but there’s a better way…

Solution:

C++ binding tools!

This is our C++ example:

    %%writefile SimpleClass.hpp
    #pragma once

    class Simple {
        int x;

      public:
        Simple(int x) : x(x) {}
        int get() const { return x; }
    };


SWIG: Produces “automatic” bindings

Works with many output languages

Has supporting module built into CMake

Very mature

Downsides:

Can be all or nothing

Hard to customize

Customizations tend to be language specific

Slow development

    %%writefile SimpleSWIG.i
    %module simpleswig
    %{
    /* Includes the header in the wrapper code */
    #include "SimpleClass.hpp"
    %}

    /* Parse the header file to generate wrappers */
    %include "SimpleClass.hpp"


    !swig -python -c++ SimpleSWIG.i

    !c++ -shared SimpleSWIG_wrap.cxx {inc} -o _simpleswig.so {plat}

    import simpleswig
    x = simpleswig.Simple(2)
    x.get()

2

Cython: Built to be a Python+C language for high-performance computations

Competes with Numba in the high-performance computing space

Due to design, also makes binding easy

Easy to customize result

Can write Python 2 or 3, regardless of calling language

Downsides:

Requires learning a new(ish) language

Have to think with three hats

Very verbose

Aside: Speed comparison Python, Cython, Numba

We’ll take a quick minute to look at what Cython (and Numba) was built for: fast from-scratch computing.

If we look at a really stupidly useless example in Python:

    def f(x):
        for _ in range(100000000):
            x = x + 1
        return x

And time it:

    %%time
    f(1)

CPU times: user 6.21 s, sys: 708 µs, total: 6.22 s Wall time: 6.21 s

We’ll see that it takes a long time just to add numbers. Let’s try in Cython:

    %load_ext Cython

    %%cython
    def f(int x):
        for _ in range(100000000):
            x = x + 1
        return x

    %%timeit
    f(23)

64.9 ns ± 3.96 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Wow, that’s so much faster. In fact, if we assume this only consists of one instruction per add, then 100M instructions in 64.9 ns would mean a machine running at about 1.5 PHz! Hopefully, this does not sound right; in fact, the C compiler was smart enough to optimise the loop into a single add!
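The effect of that optimisation can be sketched in plain Python. The two functions below are hypothetical stand-ins: one is the loop with the iteration count turned into a parameter, the other is the single addition it reduces to. This only models the result, not how the compiler arrives at it.

```python
def f_loop(x, n):
    """The loop from above, with the iteration count as a parameter."""
    for _ in range(n):
        x = x + 1
    return x


def f_closed_form(x, n):
    """What an optimising compiler can reduce the loop to: one addition."""
    return x + n


# A small n keeps the pure-Python loop quick.
print(f_loop(23, 1000), f_closed_form(23, 1000))  # 1023 1023
```

The closed form takes the same time for any `n`, which is why the compiled timing no longer depends on the number of iterations.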

Let’s try again in Numba. This time, we don’t need any magics or special compilers, just the numba library:

    import numba

    @numba.jit
    def f(x):
        for _ in range(100000000):
            x = x + 1
        return x

    %time f(41)

CPU times: user 13 µs, sys: 1 µs, total: 14 µs Wall time: 39.1 µs

    %%timeit
    f(41)

213 ns ± 19 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Here, we have a similar result; the first run was “slow”, and the rest of the runs were almost as fast as Cython (and like Cython, this does not depend on the number of iterations; the LLVM compiler has optimised away the loop). There is a tiny bit more overhead in the call, and that’s it.

Binding with Cython

Back to our original problem, making bindings.

    %%writefile simpleclass.pxd
    # distutils: language = c++

    cdef extern from "SimpleClass.hpp":
        cdef cppclass Simple:
            Simple(int x)
            int get()

    %%writefile cythonclass.pyx
    # distutils: language = c++

    from simpleclass cimport Simple as cSimple

    cdef class Simple:
        cdef cSimple *cself

        def __cinit__(self, int x):
            self.cself = new cSimple(x)

        def get(self):
            return self.cself.get()

        def __dealloc__(self):
            del self.cself

    !cythonize cythonclass.pyx

Compiling pybindings_cc/cythonclass.pyx because it changed [1/1] Cythonizing pybindings_cc/cythonclass.py

    !g++ cythonclass.cpp -shared {inc} -o cythonclass.so {plat}

    import cythonclass
    x = cythonclass.Simple(3)
    x.get()

3

Pybind11: Similar to Boost::Python, but easier to build

Pure C++11 (no new language required), no dependencies

Builds remain simple and don’t require preprocessing

Easy to customize result

Great Gitter community

Used in GooFit 2.1+ for CUDA too [CHEP talk]

Downsides:

Still verbose

Development variable

    %%writefile pybindclass.cpp
    #include <pybind11/pybind11.h>
    #include "SimpleClass.hpp"

    namespace py = pybind11;

    PYBIND11_MODULE(pybindclass, m) {
        py::class_<Simple>(m, "Simple")
            .def(py::init<int>())
            .def("get", &Simple::get);
    }


    !c++ -std=c++11 pybindclass.cpp -shared {inc} -o pybindclass.so {plat}

    import pybindclass
    x = pybindclass.Simple(4)
    x.get()

4
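The SWIG, Cython, and Pybind11 modules built so far all expose the same `Simple` class, so one small helper can check whichever of them you managed to build. This is a sketch, not part of the original talk: `check_simple_binding` is a hypothetical name, and a module that was not built is reported as `None` instead of raising.

```python
import importlib


def check_simple_binding(module_name, value=7):
    """Return True/False for a working/broken binding, None if not importable."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    # Every binding should round-trip the integer through the C++ class.
    return module.Simple(value).get() == value


for name in ('simpleswig', 'cythonclass', 'pybindclass'):
    print(name, check_simple_binding(name))
```

Since the Python-side interface is identical, swapping one binding tool for another should not require changes in the calling code.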

cppyy: Born from ROOT bindings

Built on top of Cling

JIT, so can handle templates

See Enric’s talk for more

Downsides:

Header code runs in Cling

Heavy user requirements (Cling)

ROOT vs. pip version

Broken on SWAN due to ROOT version

import cppyy

    cppyy.include('SimpleClass.hpp')
    x = cppyy.gbl.Simple(5)
    x.get()