Why not use FFI? Here's why. [Jun. 2nd, 2010|07:22 pm] djberg96

On the FFI wiki there's a nice list of reasons why you should use FFI. That's the foreign function interface for Ruby. I imagine other languages have some similar facility.



It's all Unicorns and Rainbows in theory. It'll be pure Ruby! It's cross platform! You can use it with JRuby and Rubinius!



Unfortunately, my experience with FFI has not been so rosy. I'm going to lay out a few reasons why you might not want to switch to FFI unless you absolutely, positively have to get your extensions working with JRuby, Rubinius, or IronRuby.



First, the last time I checked (and someone correct me if this has changed), you can't build FFI with anything except the GNU tool chain. That means no support for a Ruby built with the Sun Studio compiler, the HP-UX compiler and, perhaps most importantly, the Microsoft tool chain. The good news for Microsofties is that you have the mingw one-click installer option. If you're using that you'll be ok. Otherwise, tough crap.



Second, the alternative implementations may not give you the low level access you need. For example, my file-temp library does not work with JRuby even though it uses FFI because JRuby cannot deal with low level file descriptors. That kills a lot of low level systems programming right out of the gate.



Third, the up front declarations, combined with cross platform support, are proving to be extremely burdensome. Consider a simple interface for the getpwent() function. You might naively start with something like this on Linux:

    attach_function :getpwent, [], :pointer
    attach_function :setpwent, [], :void
    attach_function :endpwent, [], :void

    class PasswdStruct < FFI::Struct
      layout(
        :pw_name,   :string,
        :pw_passwd, :string,
        :pw_uid,    :uint,
        :pw_gid,    :uint,
        :pw_gecos,  :string,
        :pw_dir,    :string,
        :pw_shell,  :string
      )
    end

This works fine. On Linux. But you immediately run into trouble the moment you try to run this on Solaris. Why? Because the passwd struct on Solaris not only contains additional members, it also holds some of the shared members at a different position.



For those of you who might not be C programmers, I'll elaborate a bit on why the order matters. You see, when you declare a variable in C, you're really reserving memory. With a struct you're essentially reserving a block of contiguous memory. That's the important bit.



With C this doesn't matter. You can just access the memory by name, e.g. pwd->pw_name, pwd->pw_uid, and so on. At worst you'll have to add an #ifdef check before trying to access it. I don't have to worry about the ordering, because it's already been ordered for me by a header file included on the operating system.



With FFI this becomes a major hassle. It's easier to show you why if I show you what the declaration would have to look like on Solaris:

    class PasswdStruct < FFI::Struct
      layout(
        :pw_name,    :string,
        :pw_passwd,  :string,
        :pw_uid,     :uint,
        :pw_gid,     :uint,
        :pw_age,     :string,
        :pw_comment, :string,
        :pw_gecos,   :string,
        :pw_dir,     :string,
        :pw_shell,   :string
      )
    end

There are two things to notice. First, it contains two additional members, pw_age and pw_comment. Second, it also has a pw_gecos field, but it's not in the same position. That's where that contiguous memory comes into play. I can't simply reference :pw_gecos by name on any old Unix platform and call it a day the way I can with C, because it's a different segment of memory. To be more specific, on a 32-bit system where each of these fields takes 4 bytes, :pw_gecos should be held in bytes 16-19 on Linux, while on Solaris it's 24-27.
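To make that arithmetic concrete, here is a rough sketch in plain Ruby (no FFI required) that works out where each member starts. The member_offsets helper and the constants are names I made up for illustration, and the sketch assumes every field is 4 bytes wide with no padding, which is only true on a 32-bit build.

```ruby
# Hypothetical helper, not part of FFI: computes the byte offset of each
# struct member from its position in the declaration order.
# Assumption: every field (pointer or integer) is 4 bytes, so there is no
# padding between members. This matches a 32-bit build only.
FIELD_SIZE = 4

LINUX_MEMBERS = %i[
  pw_name pw_passwd pw_uid pw_gid pw_gecos pw_dir pw_shell
].freeze

SOLARIS_MEMBERS = %i[
  pw_name pw_passwd pw_uid pw_gid pw_age pw_comment pw_gecos pw_dir pw_shell
].freeze

# Returns a hash of member name => starting byte offset.
def member_offsets(members, field_size = FIELD_SIZE)
  members.each_with_index.map { |name, index| [name, index * field_size] }.to_h
end

linux_offsets   = member_offsets(LINUX_MEMBERS)
solaris_offsets = member_offsets(SOLARIS_MEMBERS)

puts "Linux   pw_gecos starts at byte #{linux_offsets[:pw_gecos]}"
puts "Solaris pw_gecos starts at byte #{solaris_offsets[:pw_gecos]}"
```

Under those assumptions it reports byte 16 for Linux and byte 24 for Solaris. On a 64-bit build the numbers change again, which is rather the point.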



So, if you had thoughts of just declaring one massive struct that contains every struct member from every platform you can think of, you're out of luck. You can reference a member by name, but it's probably not going to return the data you think it will, because it's reading the wrong chunk of memory.
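If you want to see the "wrong chunk of memory" problem without a Solaris box handy, here's a small simulation using nothing but Array#pack and String#unpack1. It's a sketch, not real FFI code: the read_member helper is made up, the fields are assumed to be 4-byte little-endian values, and plain integers stand in for the char pointers a real passwd struct would hold.

```ruby
# Simulation only. Assumptions: 4-byte little-endian fields, and small
# integers standing in for the pointers a real struct passwd would contain.
FIELD_BYTES = 4

SOLARIS_ORDER = %i[
  pw_name pw_passwd pw_uid pw_gid pw_age pw_comment pw_gecos pw_dir pw_shell
].freeze

LINUX_ORDER = %i[
  pw_name pw_passwd pw_uid pw_gid pw_gecos pw_dir pw_shell
].freeze

# Give every Solaris member a distinct, recognizable value.
SOLARIS_VALUES = SOLARIS_ORDER.each_with_index.map { |name, i| [name, 1000 + i] }.to_h

# The raw bytes as they would sit in memory on Solaris.
SOLARIS_BUFFER = SOLARIS_ORDER.map { |name| SOLARIS_VALUES[name] }.pack("V*")

# Hypothetical reader: looks a member up by its position in the given layout.
def read_member(buffer, order, name)
  offset = order.index(name) * FIELD_BYTES
  buffer.byteslice(offset, FIELD_BYTES).unpack1("V")
end

right = read_member(SOLARIS_BUFFER, SOLARIS_ORDER, :pw_gecos)
wrong = read_member(SOLARIS_BUFFER, LINUX_ORDER, :pw_gecos)

puts "Solaris layout: pw_gecos = #{right}"
puts "Linux layout:   pw_gecos = #{wrong} (that is actually pw_age)"
```

Reading the Solaris bytes through the Linux layout quietly hands back the pw_age slot. Nothing raises an error, which is what makes this kind of bug so unpleasant to track down.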



So now what do we do?



We could create an array of members first, and adjust it based on platform like this:

    # Danger...
    require 'rbconfig'

    members = [
      :pw_name,   :string,
      :pw_passwd, :string,
      :pw_uid,    :uint,
      :pw_gid,    :uint,
      :pw_gecos,  :string,
      :pw_dir,    :string,
      :pw_shell,  :string
    ]

    # Index 8 is where :pw_gecos sits, so the extra members land just before it.
    if RbConfig::CONFIG['host_os'] =~ /solaris/
      members.insert(8, :pw_age, :string, :pw_comment, :string)
    end

    layout(*members)
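The same idea can be pulled out into a plain method so you can at least exercise the platform logic without FFI or a second operating system. This is only a sketch: passwd_members and BASE_PASSWD_MEMBERS are names I invented, and the result would still need to be handed to layout inside an FFI::Struct subclass.

```ruby
require 'rbconfig'

# Member list shared by the platforms discussed here, as name/type pairs.
BASE_PASSWD_MEMBERS = [
  :pw_name,   :string,
  :pw_passwd, :string,
  :pw_uid,    :uint,
  :pw_gid,    :uint,
  :pw_gecos,  :string,
  :pw_dir,    :string,
  :pw_shell,  :string
].freeze

# Hypothetical helper: returns the flat member list for the given host_os.
# Only Solaris is special-cased; anything else gets the base list.
def passwd_members(host_os = RbConfig::CONFIG['host_os'])
  members = BASE_PASSWD_MEMBERS.dup

  if host_os =~ /solaris/i
    # Look up the position of :pw_gecos rather than hard-coding the index.
    members.insert(members.index(:pw_gecos), :pw_age, :string, :pw_comment, :string)
  end

  members
end

puts passwd_members('solaris2.11').each_slice(2).map(&:first).inspect
puts passwd_members('linux-gnu').each_slice(2).map(&:first).inspect
```

Passing host_os in as an argument means both branches can be checked from a single machine, and looking up :pw_gecos by name avoids the magic number. It does nothing to tell you whether the member list itself is right for a given OS release.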

Unfortunately, there are a host of problems with this approach.



First, it means you now have to eyeball every struct definition on every platform to see what the declaration order is. That means sprinkling your code with a bunch of platform checks. Even then you might get it wrong, because the struct definitions may be different on earlier or later versions of the operating system. For 3rd party libraries, the definitions could change between releases, and you're again relegated to eyeballing the struct declarations.



Second, it wouldn't be so bad, except that 3rd party libraries (and some operating systems) have a habit of declaring their own variable types. Now you not only have to know the struct definition, you have to figure out what the hell a type "foo_int_t" is (or whatever) so that you're sure to reserve the right amount of memory for it.



Third, some struct members are opaque, and you simply can't declare the variable type, because there's no way for you to figure it out. Now you're relegated to using FFI::Pointers and extra work.



Lastly, the Ruby community might be good at local testing, but it has proven to be exceedingly bad when it comes to cross-platform testing. In practice most testing only occurs on Linux and OS X (and in some cases only the latter), with either no thought whatsoever given to other platforms, or simply no ability to access those other platforms. Now you're relying much more heavily on 3rd party patches.



So, what do we do in practice then? Well, you could do what the JRuby guys did and just create separate source files for every single platform where you have this kind of issue. To wit:

    $ find . -name "etc.rb"
    ./ruby/site_ruby/shared/ffi/platform/i386-openbsd/etc.rb
    ./ruby/site_ruby/shared/ffi/platform/powerpc-aix/etc.rb
    ./ruby/site_ruby/shared/ffi/platform/i386-linux/etc.rb
    ./ruby/site_ruby/shared/ffi/platform/sparc-solaris/etc.rb
    ./ruby/site_ruby/shared/ffi/platform/x86_64-darwin/etc.rb
    ./ruby/site_ruby/shared/ffi/platform/x86_64-solaris/etc.rb
    ./ruby/site_ruby/shared/ffi/platform/x86_64-linux/etc.rb
    ./ruby/site_ruby/shared/ffi/platform/powerpc-darwin/etc.rb
    ./ruby/site_ruby/shared/ffi/platform/i386-windows/etc.rb
    ./ruby/site_ruby/shared/ffi/platform/i386-solaris/etc.rb
    ./ruby/site_ruby/shared/ffi/platform/i386-darwin/etc.rb
    ./ruby/site_ruby/shared/ffi/platform/sparcv9-solaris/etc.rb

Wow, that looks like a real joy to maintain, doesn't it?



The other solution is to sprinkle your code with a bunch of platform checks. I also released mkmf-lite just this week to help with this problem, but really it's like putting a band-aid on a fractured arm.



Anyway, the upshot of all this work is that, in my opinion, FFI is actually more difficult to use than a C extension in practice for all but the simplest libraries.



You've been warned.