Partially Applied Functions in C

2013-07-20

There are some functions in the standard C library that take a function pointer to be used as a callback later on. Examples include atexit() and signal(). However, these functions can't receive an arbitrary pointer (which could hold some important program state) in addition to the function pointer, so you're left with pesky global variables:

    /* You have: */
    atexit(foo); /* foo() will have to fetch program state from globals */

    /* Instead of: */
    static struct program_state state;
    atexit(foo, &state); /* foo() now has a pointer to program state */
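For reference, the global-variable route looks roughly like the sketch below. The fields of struct program_state and the install_cleanup() helper are made up for illustration; the article never defines them.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical program state; the fields are assumptions. */
struct program_state {
    int exit_code;
    int cleanup_calls;
};

/* atexit() handlers take no arguments, so the state has to be global. */
static struct program_state g_state;

static void foo(void)
{
    g_state.cleanup_calls++;
    printf("cleanup: exit_code=%d\n", g_state.exit_code);
}

/* Stores the state, then registers foo(). Returns 0 on success, like atexit(). */
static int install_cleanup(int exit_code)
{
    g_state.exit_code = exit_code;
    return atexit(foo);
}
```

Every handler registered this way is tied to g_state, which is exactly the coupling a data pointer argument would avoid.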

Turns out that there's a workaround, but it involves some black magic.

I believe the overall mechanism to be quite interesting; however, I do not recommend its usage. Not only because the implementation wastes a whole memory page for a callback, but also because I don't want to encourage people to perpetuate this kind of take-pointer-to-function-without-argument nonsense.

I'll try to explain how this contraption works by showing the smaller parts first. I'll begin with the template function. The idea is to have a function whose code can be patched up later; the code itself, however, is generated by the compiler:

    #define PARAMETER_CONSTANT 0xFEEDBEEF
    #define FUNCTION_CONSTANT 0xABAD1DEA

    static void partial_template_function(void)
    {
        ((void (*)(void *))FUNCTION_CONSTANT)((void *)PARAMETER_CONSTANT);
    }
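The one-line body is dense, so here is the same call spelled out step by step. This is only a sketch: callback_fn, bump() and call_through_address() are hypothetical names, and real addresses are passed at run time in place of the magic constants. Converting between integers and function pointers is implementation-defined, though it behaves as expected on common platforms.

```c
#include <stdint.h>

/* Hypothetical typedef naming the type the template casts to. */
typedef void (*callback_fn)(void *);

/* Example callback: treats its argument as an int counter. */
static void bump(void *data)
{
    *(int *)data += 1;
}

/* Same shape as the template body, with runtime addresses instead of constants. */
static void call_through_address(uintptr_t func_addr, uintptr_t param_addr)
{
    callback_fn fn = (callback_fn)func_addr; /* integer -> function pointer */
    fn((void *)param_addr);                  /* integer -> data pointer */
}
```

Calling call_through_address((uintptr_t)bump, (uintptr_t)&counter) increments counter once.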

The funky-looking cast basically says "call a function pointer at FUNCTION_CONSTANT with a pointer pointing to PARAMETER_CONSTANT". Of course, if you call this code as is, the program will most likely crash. The idea is that the compiler turns it into this code (IA32 assembly):

    0f00deba <partial_template_function>:
       0:   55                      push   %ebp
       1:   89 e5                   mov    %esp,%ebp
       3:   83 ec 18                sub    $0x18,%esp
       6:   c7 04 24 ef be ed fe    movl   $0xfeedbeef,(%esp)
       d:   b8 ea 1d ad ab          mov    $0xabad1dea,%eax
      12:   ff d0                   call   *%eax
      14:   c9                      leave
      15:   c3                      ret
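In the hex column the constants show up byte-reversed (ef be ed fe for 0xfeedbeef) because IA32 stores immediates least-significant byte first. A small sketch, using a hypothetical helper u32_to_le_bytes(), reproduces that ordering:

```c
#include <stdint.h>

/* Writes v least-significant byte first, the order IA32 uses in memory. */
static void u32_to_le_bytes(uint32_t v, unsigned char out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (unsigned char)(v >> (8 * i));
}
```

For 0xFEEDBEEF this yields ef be ed fe, and for 0xABAD1DEA it yields ea 1d ad ab, matching the operand bytes in the listing above.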

Even if you don't know assembly, if you squint a little bit, you can clearly see the magic constants defined in the C code above. By writing a trivial function to patch these magic values to something useful (such as a real function or some real pointer argument):

    static bool patch_pointer(void *code_addr, size_t code_len,
                              void *look_for, void *patch_with)
    {
        unsigned char *code = code_addr;
        intptr_t look = (intptr_t)look_for;

        /* Stop while a whole pointer still fits, so the scan stays inside the buffer */
        for (; code_len >= sizeof(intptr_t); code++, code_len--) {
            if (*((intptr_t *)code) == look) {
                union {
                    unsigned char octet[sizeof(void *)];
                    void *ptr;
                } patch;

                patch.ptr = patch_with;
                /* Pointers are 4 bytes wide on IA32 */
                code[0] = patch.octet[0];
                code[1] = patch.octet[1];
                code[2] = patch.octet[2];
                code[3] = patch.octet[3];
                return true;
            }
        }
        return false;
    }
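The scan-and-replace step can be tried on an ordinary byte buffer, with no executable memory involved. The variant below is a sketch rather than the article's function: patch_word() is a hypothetical name, and it swaps the pointer dereference and the union for memcmp() and memcpy(), which sidesteps unaligned reads and works for any pointer width.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Replaces the first pointer-sized occurrence of marker in buf with value. */
static bool patch_word(unsigned char *buf, size_t len,
                       uintptr_t marker, uintptr_t value)
{
    /* The bound keeps every comparison fully inside the buffer. */
    for (size_t i = 0; i + sizeof(marker) <= len; i++) {
        if (memcmp(buf + i, &marker, sizeof(marker)) == 0) {
            memcpy(buf + i, &value, sizeof(value));
            return true;
        }
    }
    return false;
}
```

A second call with the same marker returns false, since the only occurrence has already been overwritten.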

And using it to patch the pointers in a page allocated with mmap() (comments and error recovery have been omitted for brevity; full source code is linked below):

    struct Partial *partial_new(void (*func)(void *data), void *data)
    {
        struct Partial *t;

        if (!func)
            return NULL;

        t = calloc(1, sizeof(*t));

        /* partial_template_function must be declared just before partial_new
         * so that caller_len is calculated correctly */
        t->caller_len = (size_t)((intptr_t)partial_new -
                                 (intptr_t)partial_template_function);

        t->caller = mmap(0, t->caller_len, PROT_WRITE | PROT_READ,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

        memcpy(t->caller, partial_template_function, t->caller_len);

        patch_pointer(t->caller, t->caller_len, (void *)FUNCTION_CONSTANT, func);
        patch_pointer(t->caller, t->caller_len, (void *)PARAMETER_CONSTANT, data);

        mprotect(t->caller, t->caller_len, PROT_EXEC | PROT_READ);

        return t;
    }

The end result will be a function that can be called without arguments -- which will magically call another function with a given parameter:

    static void test(void *data)
    {
        printf("Test called with data=%p\n", data);
    }

    int main(void)
    {
        struct Partial *p;

        p = partial_new(test, (void *)0x12341337);
        atexit(partial_to_function(p));
        return 0;
    }

Which, when executed, will print:

    [leandro@navi /tmp]$ ./a.out
    Test called with data=0x12341337
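The test program relies on struct Partial and partial_to_function(), neither of which appears in the listings here. Judging only from how partial_new() uses them, they could look something like the sketch below. The layout and the accessor body are assumptions, and the gist may differ.

```c
#include <stddef.h>

/* Assumed layout: only these two fields are visible in partial_new(). */
struct Partial {
    void *caller;       /* patched copy of the template function */
    size_t caller_len;  /* number of bytes copied from the template */
};

/* Assumed accessor: hands the patched code back as a plain callback. */
static void (*partial_to_function(struct Partial *p))(void)
{
    if (!p || !p->caller)
        return NULL;

    /* Casting an object pointer to a function pointer is not strict ISO C,
     * but POSIX systems support it (dlsym() depends on it). */
    return (void (*)(void))p->caller;
}
```

The returned pointer has exactly the type atexit() expects, which is what makes the atexit(partial_to_function(p)) call in the test program type-check.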

So there you have it, partially applied functions in C. Useful? Hardly. Interesting? I think so. Fun? Yup.

If you'd like to try it, the full source code, with comments and error recovery, is available in this gist.