🧲 Part Two — The Foreign Function Interface

Rather than requiring you to write C within the project directory, the FFI gem makes it possible to call functions inside compiled libraries directly, via the C library libffi.

This approach differs significantly from MKMF:

Since the intention is to call functions directly within existing binaries, there is no need to compile any native extensions (except for the FFI gem itself when it gets installed);

You don’t need to write any C, just FFI’s DSL for invoking C functions and building/accessing structs and data collections;

It’s compatible with non-MRI Rubies like JRuby (if you use MKMF then MRI is your only available Ruby);

There is extra protection from injury: FFI manages the memory it allocates on your behalf, so a buffer or struct created through FFI is released when the Ruby object that owns it gets garbage collected. More on this later.

We’ll discuss FFI usage via examples from the bindings we wrote at Stuart for H3.

But first, let’s explain some more details about H3 so the upcoming code samples make sense in context.

What is H3?

It’s all about the hexagons!

From Uber’s own literature:

The H3 geospatial indexing system is a multi-precision hexagonal tiling of the sphere indexed with hierarchical linear indexes.

In a nutshell, H3 is a way of identifying any part of the surface of the earth with a unique hexagon. Each hexagon has a unique identifier, known as the H3 index, and a resolution level. At the coarsest resolution level (level 0), it’s possible to cover Earth with 110 hexagons and 12 pentagons (think of the way a football is stitched together! ⚽️💨). These top level cells are known as the base cells.

Each of these hexagons can then be broken into seven smaller hexagons. Repeating the process recursively down to the finest supported resolution (level 15) allows indexing to the square-metre level of accuracy. In total, there are over 600 trillion unique H3 indexes on Earth.
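To put numbers on that, here is a small sketch of the arithmetic. It assumes the cell-count formula given in the H3 documentation, 2 + 120 × 7^resolution, where the constants account for the 12 pentagons having only six children each. The method name is ours, purely for illustration.

```ruby
# Number of cells at one resolution, assuming the formula from the H3 docs.
# Resolution 0 gives the 122 base cells (110 hexagons + 12 pentagons).
def h3_cell_count(resolution)
  2 + 120 * 7**resolution
end

# Every index at every supported resolution (0 to 15 inclusive).
def h3_total_cell_count
  (0..15).sum { |resolution| h3_cell_count(resolution) }
end

(0..15).each do |resolution|
  puts format('%2d  %d', resolution, h3_cell_count(resolution))
end
puts "all resolutions: #{h3_total_cell_count}"
```

Summing over all sixteen resolutions is what pushes the total past the 600 trillion mark; the finest resolution on its own comes in a little under that.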

So, of all shapes, why hexagons?

Bees have long been exploiting the unique properties of regular hexagons.

Well, there are only three polygons that tessellate regularly: the equilateral triangle, the square, and the regular hexagon. Hexagons are unique in that each neighbouring hexagon is the same distance away (if you measure from the centre). Triangles and squares don’t have this property.

As shown below, a triangle has 12 neighbours at 3 differing distances, a square’s 4 diagonal neighbours are further away than its 4 remaining neighbours, and a hexagon’s 6 neighbours are all equidistant from its centre.
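If you’d rather check the geometry than take our word for it, this sketch measures centre-to-centre distances on a square grid and a hexagonal grid with unit spacing. The axial (q, r) coordinate scheme and all names here are our own choices for illustration and have nothing to do with how H3 stores cells.

```ruby
# Distinct centre-to-centre distances for a list of [dx, dy] offsets,
# rounded so floating-point noise doesn't create false differences.
def distinct_distances(offsets)
  offsets.map { |dx, dy| Math.hypot(dx, dy).round(6) }.uniq.sort
end

# Square grid: the 8 surrounding cells, diagonals included.
SQUARE_NEIGHBOURS = [-1, 0, 1].product([-1, 0, 1]) - [[0, 0]]

# Hexagonal grid: the 6 neighbours in axial (q, r) coordinates,
# converted to x/y offsets.
HEXAGON_NEIGHBOURS = [[1, 0], [0, 1], [-1, 1], [-1, 0], [0, -1], [1, -1]]
                     .map { |q, r| [q + r / 2.0, r * Math.sqrt(3) / 2] }

p distinct_distances(SQUARE_NEIGHBOURS)  # edge and diagonal neighbours differ
p distinct_distances(HEXAGON_NEIGHBOURS) # a single distance
```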

So why is this useful? Well, Uber uses H3 for surge pricing by tracking rides in real time, bucketing them to their containing hexagons, and dynamically adjusting prices based on supply and demand in those regions.

Hexagons were an important choice because people in a city are often in motion, and hexagons minimize the quantization error introduced when users move through a city.

If you want to learn more, check out this talk by Joseph Gilley:

Really cool stuff — major respect to Uber for open-sourcing this! Read the Uber Engineering H3 page to learn more.

Building the bindings

Ok, enough hexagons! Back to Ruby bindings.

We started out with the MKMF approach when we began writing our Ruby bindings for H3. It worked reasonably well but resulted in quite a lot of boilerplate. The decision was made to move over to FFI and take advantage of the DSL and automatic memory management.

Let’s get our hands dirty with FFI by wrapping a simple function.

The H3 library defines a function called h3ToParent, which takes an H3 index and a resolution, then returns the parent H3 index at the given resolution, i.e. the larger hexagon which contains the given hexagon.

The child hexagon (red) sits one resolution level finer than the parent hexagon (green) that contains it.

🚨 NOTE: These examples are purely demonstrative and may not work when executed in isolation. We encourage you to read the H3 Ruby source code on GitHub for a full working example, boilerplate included.

After telling FFI to load the H3 library, we use attach_function to allow the method to be called from Ruby.

The first argument* gives the name we want to use when calling the method (so snake-case 🐍 instead of camel-case 🐫);

The second argument is the actual name of the function in the C library so FFI can find it;

The third argument is an array of types which informs FFI of the argument types we expect to be passing in (in order);

The last argument is the expected type of the return value.

This kind of approach has a lot of advantages!

It’s easy to read, and we push the burden of validating/converting data types down to FFI itself so we can focus on the details (if we pass a string where it’s expecting an integer, it will raise an error for us, typically a TypeError).

*It’s also possible to call attach_function without this alias, so it can be called with the original camel-case.

Custom Types

FFI also allows us to work with types in a more fine-grained way. There’s a typedef method that functions as an alias e.g.

typedef :ulong_long, :h3_index

Now we can talk about H3 indexes instead of unsigned long longs when calling the attach_function method.

Complex Custom Types

We can also build up more complex types using FFI’s DataConverter module.

Let’s say we want to validate the int argument (which represents a resolution) to ensure the value is within an acceptable range of resolutions i.e. 0–15 inclusive.

Now we can use the Resolution class in our attach_function definitions and we’ll get validation errors if the number is out of range!

Passing structs to C functions

Simple functions with only native types for arguments and return values are pretty straightforward to integrate. But what about functions that expect to receive a pointer to a struct as an argument?

Well, FFI has us covered there, too.

Let’s wrap a function, geoToH3, which expects a GeoCoord struct containing a pair of latitude/longitude coordinates.

In this method, we tell FFI to expect a pointer to a GeoCoord struct. We set one up using the DSL, populate the struct’s fields and then pass it right in.

You’ll notice we wrap the geoToH3 function within a Ruby method, geo_to_h3, rather than giving it a snake-case alias and calling it directly. This allows Ruby calling code to pass a 2-element array of degree coordinates, rather than needing to care about building a struct with radians coordinates.

So that’s passing structs in. How about a function that returns a struct?

The h3ToGeo function does the inverse of geoToH3 and returns a GeoCoord struct with coordinates corresponding to the given H3 index.

There’s something subtle at play here.

The C function is declared to return void i.e. it doesn’t return anything! This is weird territory as a Ruby programmer — we pass in a GeoCoord struct as an argument by reference and the h3ToGeo function updates the struct’s contents rather than returning us a fresh struct.

“I don’t think we’re in Kansas any more, Toto.”

So, why is this?

Well, C libraries are often written in this way, and it’s to make the client code 100% responsible for memory management. If the function returned a pointer to a new struct, then the function would be responsible for allocating its memory on the heap. This means the client code would then be responsible for eventually freeing it later. This half-and-half responsibility can result in memory leaks, and also takes control away from the client code regarding how memory gets allocated in the first place.

Manipulating memory that the client code is responsible for is preferable, so that’s what good C library developers do. If you’re curious, read more about this (and other minimalist C approaches) here.

Pointers, Memory, and Arrays

At this point, let’s familiarise ourselves with a crucial difference between C and Ruby — how memory is allocated and managed.

Ruby is like staying in a hotel room where you needn’t concern yourself with cleaning up or cooking since you have house-keeping and room-service to look after you.

Ice cream? Certainly, sir!

You simply create objects using .new and get on with your life, with no need to worry about allocating memory (just ask for ice cream and it will be brought to you).

Similarly, you don’t (usually) need to care about destroying objects when you’re finished with them. This is because Ruby’s garbage collector keeps an eye on your allocated objects and destroys them for you once nothing references them any more (housekeeping will get rid of that used ice cream dish).

By contrast, C is being stuck home alone. If you don’t clean up, things get messy; if you don’t cook, you don’t eat; if you’re not careful, you’ll get burned.

“I forgot to call free()!”

Using Pointers with FFI

Let’s take a look at a more involved example where we have to concern ourselves with memory allocation. Thankfully, FFI makes this as painless as possible.

The h3ToString function takes an H3 index in numerical form and converts it to the equivalent hexadecimal representation.

Now our Ruby is beginning to resemble C! 😱

We use FFI’s MemoryPointer class to initialise a piece of memory for us. We tell it that we expect the contents to be of type char, and that there will be a maximum of 17 characters (16 hexadecimal digits plus a null terminator character to indicate the end of the string has been reached). This allows FFI to calculate precisely how many bytes of memory it will need to allocate.

Once our memory buffer is ready, we pass the pointer into h3ToString, along with the expected size. C populates the memory for us, then on the Ruby side we use FFI::MemoryPointer#read_string to read all the characters until the null terminator character is encountered.

This approach does add a little overhead to writing client code, particularly when you have nested structs that need initialising, or if C returns a nested struct that you need to iterate over. However, it has a nice benefit in that MemoryPointer objects are automatically garbage collected (taking the allocated C memory with them). This frees us from the obligation to manually release memory so (hopefully 🤞) we don’t get any leaks.

Building nested structs

Finally, let’s consider using pointer arithmetic to build nested structs.

The GeoFence struct contains a pointer to an array of GeoCoord structs plus an integer count of how many structs are in the array. This allows arbitrary-shaped regions to be described.

This struct requires a bit more legwork to build!

The Ruby method build_geofence accepts an array of coordinate pairs. We set num_verts to be the size of this array, and we use FFI::MemoryPointer to initialise enough memory to hold that many GeoCoord structs.

Now the tricky part.

The memory is reserved, but currently empty. We need to fill it with GeoCoord structs and then populate the lat/lon values for each. The work here is done by GeoCoord.new(ptr + i * GeoCoord.size).

In this case, ptr is pointing to the memory we initialised to hold our array of structs. FFI memory pointers support pointer arithmetic, so ptr + x will calculate the memory location that is x bytes further on from the first location.

FFI structs can accept a memory location as an argument when initialised, so we pass it the correct memory location via .new.

In the first loop iteration i is zero, so the first GeoCoord struct is initialised with the memory location at the beginning of the memory region referenced by ptr. In the second iteration i is one, so the offset is 1 * GeoCoord.size. This allows the second GeoCoord struct to slot in right after the first. By the end, we have an array of structs referenced by the coords variable but held in contiguous memory by FFI and referenced by ptr.

Now we iterate over the given list_of_coords and populate the structs’ fields with the lat/lon values, converted to radians.

Finally, we set the GeoFence struct’s verts field to equal the ptr variable, and we’re all set.

Phew! 😅