Exposing Rust objects to C code

When librsvg parses an SVG file, it will encounter elements that generate path-like objects: lines, rectangles, polylines, circles, and actual path definitions. Internally, librsvg translates all of these into path definitions. For example, librsvg will read an element from the SVG that defines a rectangle like

<rect x="20" y="30" width="40" height="50" style="..."></rect>

and translate it into a path definition with the following commands:

move_to (20, 30)
line_to (60, 30)
line_to (60, 80)
line_to (20, 80)
line_to (20, 30)
close_path ()
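To make that translation concrete, here is a minimal sketch of it in Rust. The PathCommand enum and the rect_to_path() function are names made up for this illustration; they are not librsvg's actual internals, and the sketch ignores details such as rounded corners.

```rust
/// One drawing command in a path. Hypothetical type for this sketch;
/// it is not librsvg's actual representation.
#[derive(Debug, Clone, PartialEq)]
pub enum PathCommand {
    MoveTo(f64, f64),
    LineTo(f64, f64),
    ClosePath,
}

/// Translate a rectangle into the equivalent list of path commands,
/// walking its four corners starting from (x, y).
pub fn rect_to_path(x: f64, y: f64, width: f64, height: f64) -> Vec<PathCommand> {
    vec![
        PathCommand::MoveTo(x, y),
        PathCommand::LineTo(x + width, y),
        PathCommand::LineTo(x + width, y + height),
        PathCommand::LineTo(x, y + height),
        PathCommand::LineTo(x, y),
        PathCommand::ClosePath,
    ]
}
```

Calling rect_to_path (20.0, 30.0, 40.0, 50.0) yields the six commands listed above, in the same order.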

But where do those commands live? How are they fed into Cairo to actually draw a rectangle?

Get your Cairo right here

One of librsvg's public API entry points is rsvg_handle_render_cairo():

gboolean rsvg_handle_render_cairo (RsvgHandle *handle, cairo_t *cr);

Your program creates an appropriate Cairo surface (a window, an off-screen image, a PDF surface, whatever), obtains a cairo_t drawing context for the surface, and passes the cairo_t to librsvg using that rsvg_handle_render_cairo() function. It means, "take this parsed SVG (the handle), and render it to this cairo_t drawing context".

SVG files may look like an XML-ization of a tree of graphical objects: here is a group which contains a blue rectangle and a green circle, and here is a closed Bézier curve with a black outline and a red fill. However, SVG is more complicated than that; it allows you to define objects once and recall them later many times, it allows you to use CSS cascading rules for applying styles to objects ("all the objects in this group are green unless they define another color on their own"), to reference other SVG files, etc. The magic of librsvg is that it resolves all of that into drawing commands for Cairo.

Feeding a path into Cairo

This is easy enough: Cairo provides an API for its drawing context with functions like

void cairo_move_to (cairo_t *cr, double x, double y);
void cairo_line_to (cairo_t *cr, double x, double y);
void cairo_close_path (cairo_t *cr);
/* Other commands omitted */

Librsvg doesn't feed paths to Cairo as soon as it parses them from the XML; that is not done until rendering time. In the meantime, librsvg has to keep an intermediate representation of path data.

Librsvg uses an RsvgPathBuilder object to hold on to this path data for as long as needed. The API is simple enough:

pub struct RsvgPathBuilder {
    ...
}

impl RsvgPathBuilder {
    pub fn new () -> RsvgPathBuilder { ... }
    pub fn move_to (&mut self, x: f64, y: f64) { ... }
    pub fn line_to (&mut self, x: f64, y: f64) { ... }
    pub fn curve_to (&mut self, x2: f64, y2: f64, x3: f64, y3: f64, x4: f64, y4: f64) { ... }
    pub fn close_path (&mut self) { ... }
}

This mimics the sub-API of cairo_t to build paths, except that instead of feeding them immediately into the Cairo drawing context, RsvgPathBuilder builds an array of path commands that it will later replay to a given cairo_t. Let's look at the methods of RsvgPathBuilder.

"pub fn new () -> RsvgPathBuilder" - This doesn't take a self parameter; you could call it a static method in languages that support classes. It is just a constructor.

"pub fn move_to (&mut self, x: f64, y: f64)" - This one is a normal method, as it takes a self parameter. It also takes (x, y) double-precision floating point values for the move_to command. Note the "&mut self": this means that you must pass a mutable reference to an RsvgPathBuilder, since the method will change the builder's contents by adding a move_to command. It is a method that changes the state of the object, so it must take a mutable object.

The other methods for path commands are similar to move_to. None of them have return values; if they did, they would have a "-> ReturnType" after the argument list.
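Here is a rough sketch of that "record now, replay later" idea, written without the cairo crate so that it stands on its own. The Segment enum, the PathSink trait (standing in for cairo_t), and the RecordingSink type are all assumptions of this sketch, not part of librsvg; curve_to is left out for brevity.

```rust
/// One recorded path command. Hypothetical type for this sketch; the real
/// builder stores cairo::PathSegment values instead.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Segment {
    MoveTo(f64, f64),
    LineTo(f64, f64),
    ClosePath,
}

/// Anything that can receive path commands. This trait is an assumption of
/// the sketch; it stands in for the cairo_t drawing context.
pub trait PathSink {
    fn move_to(&mut self, x: f64, y: f64);
    fn line_to(&mut self, x: f64, y: f64);
    fn close_path(&mut self);
}

/// Simplified stand-in for RsvgPathBuilder.
pub struct PathBuilder {
    segments: Vec<Segment>,
}

impl PathBuilder {
    /// No self parameter: this is the constructor.
    pub fn new() -> PathBuilder {
        PathBuilder { segments: Vec::new() }
    }

    /// &mut self: each of these methods changes the builder by appending a command.
    pub fn move_to(&mut self, x: f64, y: f64) {
        self.segments.push(Segment::MoveTo(x, y));
    }

    pub fn line_to(&mut self, x: f64, y: f64) {
        self.segments.push(Segment::LineTo(x, y));
    }

    pub fn close_path(&mut self) {
        self.segments.push(Segment::ClosePath);
    }

    /// &self: replaying only reads the stored commands, in recording order.
    pub fn replay<S: PathSink>(&self, sink: &mut S) {
        for segment in &self.segments {
            match *segment {
                Segment::MoveTo(x, y) => sink.move_to(x, y),
                Segment::LineTo(x, y) => sink.line_to(x, y),
                Segment::ClosePath => sink.close_path(),
            }
        }
    }
}

/// A sink that just remembers what it was told to draw.
#[derive(Default)]
pub struct RecordingSink {
    pub received: Vec<Segment>,
}

impl PathSink for RecordingSink {
    fn move_to(&mut self, x: f64, y: f64) {
        self.received.push(Segment::MoveTo(x, y));
    }
    fn line_to(&mut self, x: f64, y: f64) {
        self.received.push(Segment::LineTo(x, y));
    }
    fn close_path(&mut self) {
        self.received.push(Segment::ClosePath);
    }
}
```

Note how replay() takes a plain &self: it only reads the stored commands, so the same builder can be replayed into as many sinks as needed.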

But that RsvgPathBuilder is a Rust object! And it still needs to be called from the C code in librsvg that hasn't been ported over to Rust yet. How do we do that?

Exporting an API from Rust to C

C doesn't know about objects with methods, even though you can fake them pretty well with structs and pointers to functions. Rust doesn't try to export structs with methods in a fancy way; you have to do that by hand. This is no harder than writing a GObject implementation in C, fortunately.

Let's look at the C header file for the RsvgPathBuilder object, which is entirely implemented in Rust. The C header file is rsvg-path-builder.h. Here is part of that file:

typedef struct _RsvgPathBuilder RsvgPathBuilder;

G_GNUC_INTERNAL
void rsvg_path_builder_move_to (RsvgPathBuilder *builder, double x, double y);
G_GNUC_INTERNAL
void rsvg_path_builder_line_to (RsvgPathBuilder *builder, double x, double y);

Nothing special here. RsvgPathBuilder is an opaque struct; we declare it like that just so we can take a pointer to it as in the rsvg_path_builder_move_to() and rsvg_path_builder_line_to() functions.

How about the Rust side of things? This is where it gets more interesting. This is part of path-builder.rs:

extern crate cairo; // 1

pub struct RsvgPathBuilder { // 2
    path_segments: Vec<cairo::PathSegment>,
}

impl RsvgPathBuilder { // 3
    pub fn move_to (&mut self, x: f64, y: f64) { // 4
        self.path_segments.push (cairo::PathSegment::MoveTo ((x, y))); // 5
    }
}

#[no_mangle] // 6
pub extern fn rsvg_path_builder_move_to (raw_builder: *mut RsvgPathBuilder, // 7
                                         x: f64, y: f64) {
    assert! (!raw_builder.is_null ()); // 8
    let builder: &mut RsvgPathBuilder = unsafe { &mut (*raw_builder) }; // 9
    builder.move_to (x, y); // 10
}

Let's look at the numbered lines:

1. We use the cairo crate from the excellent gtk-rs, the Rust binding for GTK+ and Cairo.

2. This is our Rust structure. Its fields are not important for this discussion; they are just what the struct uses to store Cairo path commands.

3. Now we begin implementing methods for that structure. These are Rust-side methods, not visible from C. In 4 and 5 we see the implementation of ::move_to(); it just creates a new cairo::PathSegment and pushes it to the vector of segments.

6. The "#[no_mangle]" line instructs the Rust compiler to put the following function name in the .a library just as it is, without any name mangling. The function name without name mangling looks just like rsvg_path_builder_move_to to the linker, as we expect. A name-mangled Rust function looks like _ZN14rsvg_internals12path_builder15RsvgPathBuilder8curve_to17h1b8f49042ff19daaE; you can explore these with "objdump -x rust/target/debug/librsvg_internals.a".

7. "pub extern fn rsvg_path_builder_move_to (raw_builder: *mut RsvgPathBuilder". This is a public function with an exported symbol in the .a file, not an internal one, as it will be called from the C code. The bare "extern", with no ABI name after it, makes the function use the C calling convention. And the "raw_builder: *mut RsvgPathBuilder" is Rust-ese for "a pointer to an RsvgPathBuilder with mutable contents". If this were only an accessor function, we would use a "*const RsvgPathBuilder" argument type.

8. "assert! (!raw_builder.is_null ());". You can read this as "g_assert (raw_builder != NULL);" if you come from GObject land.

9. "let builder: &mut RsvgPathBuilder = unsafe { &mut (*raw_builder) }". This declares a builder variable, of type &mut RsvgPathBuilder, which is a reference to a mutable path builder. The variable gets initialized with the result of "&mut (*raw_builder)": first we de-reference the raw_builder pointer with the asterisk, and convert that to a mutable reference with the &mut. De-referencing pointers that come from who-knows-where is an unsafe operation in Rust, as the compiler cannot guarantee their validity, and so we must wrap that operation with an unsafe{} block. This is like telling the compiler, "I acknowledge that this is potentially unsafe". Already this is better than life in C, where *every* de-reference is potentially dangerous; in Rust, only those that "bring in" pointers from the outside are potentially dangerous.

10. Now we have a Rust-side reference to an RsvgPathBuilder object, and we can call the builder.move_to() method as in regular Rust code.
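The pattern in the numbered lines can be tried out without any C code at all, since Rust can call its own extern functions. The sketch below is an illustration under several assumptions: PathBuilder is a cut-down stand-in for RsvgPathBuilder, the function names are hypothetical, and the wrappers are declared unsafe (unlike the snippet above) so that Rust callers must also acknowledge the pointer contract. The #[no_mangle] attribute is left out so that the sketch compiles unchanged across Rust editions; a function meant to be found by the C linker would need it.

```rust
/// Cut-down stand-in for RsvgPathBuilder; it only records move_to points.
pub struct PathBuilder {
    points: Vec<(f64, f64)>,
}

impl PathBuilder {
    pub fn new() -> PathBuilder {
        PathBuilder { points: Vec::new() }
    }

    pub fn move_to(&mut self, x: f64, y: f64) {
        self.points.push((x, y));
    }

    pub fn point_count(&self) -> usize {
        self.points.len()
    }
}

/// C-callable mutator (hypothetical name): takes *mut because it changes the builder.
///
/// Safety: raw_builder must point to a live PathBuilder that nothing else
/// is accessing during the call.
pub unsafe extern "C" fn path_builder_move_to(raw_builder: *mut PathBuilder, x: f64, y: f64) {
    assert!(!raw_builder.is_null());
    // Turn the raw pointer into a normal mutable reference.
    let builder: &mut PathBuilder = unsafe { &mut *raw_builder };
    builder.move_to(x, y);
}

/// C-callable accessor (hypothetical name): it only reads, so *const is enough.
///
/// Safety: raw_builder must point to a live PathBuilder.
pub unsafe extern "C" fn path_builder_point_count(raw_builder: *const PathBuilder) -> usize {
    assert!(!raw_builder.is_null());
    let builder: &PathBuilder = unsafe { &*raw_builder };
    builder.point_count()
}
```

A caller that owns a PathBuilder can obtain a raw pointer with "let raw: *mut PathBuilder = &mut builder;" and pass it to both functions inside an unsafe block, much as the C code would pass its RsvgPathBuilder pointer.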

Those are methods. And the constructor/destructor?

Excellent question! We defined an absolutely conventional method, but we haven't created a Rust object and sent it over to the C world yet. And we haven't taken a Rust object from the C world and destroyed it when we are done with it.

Construction

Here is the C prototype for the constructor, exactly as you would expect from a GObject library:

G_GNUC_INTERNAL RsvgPathBuilder *rsvg_path_builder_new (void);

And here is the corresponding implementation in Rust:

#[no_mangle]
pub unsafe extern fn rsvg_path_builder_new () -> *mut RsvgPathBuilder { // 1
    let builder = RsvgPathBuilder::new (); // 2
    let boxed_builder = Box::new (builder); // 3
    Box::into_raw (boxed_builder) // 4
}

1. Again, this is a public function with an exported symbol. This whole function is also marked as unsafe because it hands out a raw pointer, a *mut RsvgPathBuilder. Strictly speaking, Rust does not require unsafe just to create or return a raw pointer; only de-referencing one is unsafe. We use it here as a declaration of intent: "this pointer will be out of your control". With that we acknowledge our responsibility in handling the memory to which the pointer refers.

2. We instantiate an RsvgPathBuilder with normal Rust code...

3. ... and ensure that that object is put in the heap by Boxing it. This is a common operation in garbage-collected languages. Boxing is Rust's primitive for putting data in the program's heap; it allows the object in question to outlive the scope where it got created, i.e. the duration of the rsvg_path_builder_new() function.

4. Finally, we call Box::into_raw() to ask Rust to give us a pointer to the contents of the box, i.e. the actual RsvgPathBuilder struct that lives there. This also consumes the box, so Rust will not free the object when the function returns; freeing it is now our job. This statement doesn't end in a semicolon, so it is the return value for the function.

You could read this as "builder = g_new (...); initialize (builder); return builder;". Allocate something in the heap and initialize it, and return a pointer to it. This is exactly what the Rust code is doing.
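A small self-contained sketch shows that the boxed object really does outlive the constructor. As before, PathBuilder and path_builder_new() are simplified, hypothetical stand-ins, and #[no_mangle] is omitted so the sketch compiles across Rust editions. The function is left as a safe one here, since producing a raw pointer is itself a safe operation.

```rust
/// Cut-down stand-in for RsvgPathBuilder.
pub struct PathBuilder {
    points: Vec<(f64, f64)>,
}

impl PathBuilder {
    pub fn new() -> PathBuilder {
        PathBuilder { points: Vec::new() }
    }

    pub fn point_count(&self) -> usize {
        self.points.len()
    }
}

/// C-callable constructor (hypothetical name). The caller becomes responsible
/// for eventually handing the pointer back so that it can be freed.
pub extern "C" fn path_builder_new() -> *mut PathBuilder {
    let builder = PathBuilder::new(); // a plain value local to this function
    let boxed_builder = Box::new(builder); // moved into a heap allocation
    Box::into_raw(boxed_builder) // the Box is consumed, so nothing is freed here
}
```

After the call returns, the pointer is non-null and can still be de-referenced (inside an unsafe block) to reach the builder; nothing frees it until someone turns it back into a Box.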

Destruction

This is the C prototype for the destructor. This is not a reference-counted GObject; it is just an internal thing in librsvg, which does not need reference counting.

G_GNUC_INTERNAL void rsvg_path_builder_destroy (RsvgPathBuilder *builder);

And this is the implementation in Rust:

#[no_mangle]
pub unsafe extern fn rsvg_path_builder_destroy (raw_builder: *mut RsvgPathBuilder) { // 1
    assert! (!raw_builder.is_null ()); // 2
    let _ = Box::from_raw (raw_builder); // 3
}

1. Same as before; we declare the whole function as public, exported, and unsafe since it takes a pointer from who-knows-where.

2. Same as in the implementation for move_to(), we assert that we got passed a non-null pointer.

3. Let's take this bit by bit. "Box::from_raw (raw_builder)" is the counterpart to Box::into_raw() from above; it takes a pointer and wraps it with a Box, which owns the object again and which Rust knows how to de-reference into the actual object it contains. "let _ =" looks like a variable binding, but _ is a wildcard pattern that doesn't bind anything: we don't want to keep the Box, so it gets dropped as soon as that statement finishes. When a Box is dropped, Rust frees the memory for the RsvgPathBuilder it contains. You can read this idiom as "g_free (builder)".
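To check that the object really gets freed, one can give the type a Drop implementation and watch it run. This is only a sketch: the DROPPED counter and the function names are made up for the illustration, #[no_mangle] is omitted so the sketch compiles across Rust editions, and the explicit drop() call is an alternative spelling of the "let _ =" idiom that makes the intent visible.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many builders have been dropped, so that the free can be observed.
/// This counter exists only for the purposes of the sketch.
pub static DROPPED: AtomicUsize = AtomicUsize::new(0);

/// Cut-down stand-in for RsvgPathBuilder.
pub struct PathBuilder {
    points: Vec<(f64, f64)>,
}

impl Drop for PathBuilder {
    fn drop(&mut self) {
        // Runs just before the memory for the builder is released.
        self.points.clear();
        DROPPED.fetch_add(1, Ordering::SeqCst);
    }
}

/// C-callable constructor (hypothetical name).
pub extern "C" fn path_builder_new() -> *mut PathBuilder {
    Box::into_raw(Box::new(PathBuilder { points: Vec::new() }))
}

/// C-callable destructor (hypothetical name).
///
/// Safety: raw_builder must come from path_builder_new() and must not be
/// used again after this call.
pub unsafe extern "C" fn path_builder_destroy(raw_builder: *mut PathBuilder) {
    assert!(!raw_builder.is_null());
    // Re-wrap the pointer in a Box and drop it right away, which frees the object.
    drop(unsafe { Box::from_raw(raw_builder) });
}
```

The counter stays untouched after path_builder_new() and goes up by one after path_builder_destroy(), which is the Rust-side equivalent of seeing the g_free() happen.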

Recapitulating

Make your object. Box it. Take a pointer to it with Box::into_raw(), and send it off into the wild west. Bring back a pointer to your object. Unbox it with Box::from_raw(). Let it go out of scope if you want the object to be freed. Acknowledge your responsibilities with unsafe and that's all!

Making the functions visible to C

The code we just saw lives in path-builder.rs. By convention, the place where one actually exports the visible API from a Rust library is a file called lib.rs, and here is part of that file's contents in librsvg:

pub use path_builder::{
    rsvg_path_builder_new,
    rsvg_path_builder_destroy,
    rsvg_path_builder_move_to,
    rsvg_path_builder_line_to,
    rsvg_path_builder_curve_to,
    rsvg_path_builder_close_path,
    rsvg_path_builder_arc,
    rsvg_path_builder_add_to_cairo_context
};

mod path_builder;

The mod path_builder indicates that lib.rs will use the path_builder sub-module. The pub use block exports the functions listed in it to the outside world. They will be visible as symbols in the .a file.
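The same layout can be sketched in a single file by writing the sub-module inline instead of in its own path_builder.rs. Everything in this sketch is a simplified, hypothetical stand-in for the real code (shortened names, a cut-down PathBuilder, no #[no_mangle] so that it compiles across Rust editions); it is only meant to show the private module plus the "pub use" re-export at the top level.

```rust
// In a real crate this would be "mod path_builder;" with the contents
// living in path_builder.rs. The module itself is private.
mod path_builder {
    /// Cut-down stand-in for RsvgPathBuilder.
    pub struct PathBuilder {
        points: Vec<(f64, f64)>,
    }

    /// C-callable constructor (hypothetical name).
    pub extern "C" fn path_builder_new() -> *mut PathBuilder {
        Box::into_raw(Box::new(PathBuilder { points: Vec::new() }))
    }

    /// C-callable accessor (hypothetical name).
    ///
    /// Safety: raw must point to a live PathBuilder.
    pub unsafe extern "C" fn path_builder_point_count(raw: *const PathBuilder) -> usize {
        assert!(!raw.is_null());
        let builder: &PathBuilder = unsafe { &*raw };
        builder.points.len()
    }

    /// C-callable destructor (hypothetical name).
    ///
    /// Safety: raw must come from path_builder_new() and not be used afterwards.
    pub unsafe extern "C" fn path_builder_destroy(raw: *mut PathBuilder) {
        assert!(!raw.is_null());
        drop(unsafe { Box::from_raw(raw) });
    }
}

// Re-export the functions at the top level of the crate, as lib.rs does.
pub use path_builder::{path_builder_destroy, path_builder_new, path_builder_point_count};
```

With the re-export in place, code outside the module can call path_builder_new() directly, without spelling out the path_builder:: prefix.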

The Cargo.toml (akin to a toplevel Makefile.am) for my librsvg's little sub-library has this bit:

[lib]
name = "rsvg_internals"
crate-type = ["staticlib"]

This means that the sub-library will be called librsvg_internals.a, and it is a static library. I will link that into my master librsvg.so. If this were a stand-alone shared library entirely implemented in Rust, I would use the "cdylib" crate type instead.

Linking into the main .so

In librsvg/Makefile.am I have a very simplistic scheme for building the librsvg_internals.a library with Rust's tools, and linking the result into the main librsvg.so:

RUST_LIB = rust/target/debug/librsvg_internals.a

.PHONY: rust/target/debug/librsvg_internals.a
rust/target/debug/librsvg_internals.a:
	cd rust && \
	cargo build --verbose

librsvg_@RSVG_API_MAJOR_VERSION@_la_CPPFLAGS = ...

librsvg_@RSVG_API_MAJOR_VERSION@_la_CFLAGS = ...

librsvg_@RSVG_API_MAJOR_VERSION@_la_LDFLAGS = ...

librsvg_@RSVG_API_MAJOR_VERSION@_la_LIBADD = \
	$(LIBRSVG_LIBS) \
	$(LIBM) \
	$(RUST_LIB)

This uses a .PHONY target for librsvg_internals.a, so "cargo build" will always be called on it. Cargo already takes care of dependency tracking; there is no need for make/automake to do that.

I put the filename of my library in a RUST_LIB variable, which I then reference from LIBADD. This gets librsvg_internals.a linked into the final librsvg.so.

When you run "cargo build" just like that, it creates a debug build in a target/debug subdirectory. I haven't looked for a way to make it play together with Automake when one calls "cargo build --release": that one puts things in a different directory, called target/release. Rust's tooling is more integrated that way, while in the Autotools world I'm expected to pass any CFLAGS for compilation by hand, depending on whether I'm doing a debug build or a release build. Any ideas for how to do this cleanly are appreciated.

I don't have any code in configure.ac to actually detect if Rust is present. I'm just assuming that it is for now; fixes are appreciated :)

Using the Rust functions from C

There is no difference from what we had before! This comes from rsvg-shapes.c:

static RsvgPathBuilder *
_rsvg_node_poly_create_builder (const char *value,
                                gboolean close_path)
{
    RsvgPathBuilder *builder;
    ...
    builder = rsvg_path_builder_new ();
    rsvg_path_builder_move_to (builder, pointlist[0], pointlist[1]);
    ...
    return builder;
}

Note that we are calling rsvg_path_builder_new() and rsvg_path_builder_move_to(), and returning a pointer to an RsvgPathBuilder structure as usual. However, all of those are implemented in the Rust code. The C code has no idea!

This is the magic of Rust: it allows you to move your C code bit by bit into a safe language. You don't have to do a whole rewrite in a single step. I don't know any other languages that let you do that.