This article will discuss how arbitrary binding expressions work. Since the template syntax and meta-programming system can be a sticking point for some students, a runtime version will be described first to discuss the fundamentals. This also paves the way for readers interested in implementing this in some language that lacks generics or templates. Following this version will be a simplified template version that will be limited to 3 arguments. Upon conclusion you should be better prepared to understand the source code for boost::bind or std::bind.

Basic Object System

Since the runtime version of arbitrary binding requires a protocol for introspection a bit beyond what the C++ language does for its classes, a basic object system needs to be used and everything that can be an argument in a binder must inherit from a single base class. What follows is a very elementary set of such objects that will be used for further discussion. This can be done in any language–I’ve implemented this in C in the past–but C++ is used here because it’s just plain easier.

// This is basically all that is needed. Every object in the system // must inherit from this base class to allow the parameter system to work // and then RTTI will take care of the rest because it's a polymorphic class. struct Object { virtual ~Object() {} }; // A couple obvious example classes... struct Integer : Object { Integer(int i) : value(i) {} Integer() : value(0) {} Integer& operator = (int i) { value = i; return *this; } int get() const { return value; } private: int value; }; struct Double : Object { Double(double d) : value(d) {} Double() : value(0.) {} Double& operator = (double d) { value = d; return *this; } double get() const { return value; } private: double value; };

This basically provides a dynamic type system for C++. If everything inherits from Object then you can pass around anything as this base type. It’s like void pointers with a touch more predictability. This will be important for our non-templated bind expression system because in order to run arbitrary functions we need to be able to pass arbitrary types of information.

Placeholders and argument lists

The bind system is actually a fairly simple set of constructs and operations that correctly dispatch a call to the requested function with the correct arguments. What we will have is two lists of arguments, a function to call, a lookup operation that will translate arguments into values based on the argument lists and the parameters passed in, and a set of invokers that can be used to translate the two argument lists into a function call and invoke that function. The first argument list is what is provided when you bind to a function and it can contain placeholders. These placeholders point to indices within the second set of arguments, which are the arguments the bind type was invoked with. It is a very simple algorithm, but wrapping your head around it can be difficult without examples and explanation.

The first part of the system then is the argument list. In our case a vector of pointers to Object is quite sufficient. To make the lookup operation for this type then is not all that difficult. We first come up with a test:

BOOST_AUTO_TEST_CASE(lookup) { std::vector<Object*> args{ new Integer(0), new Integer(1), new Integer(2) }; BOOST_CHECK(select_arg(args, a2) == args[1]); // pointer to the second element Integer * i = new Integer(42); BOOST_CHECK(select_arg(args, i) == i); // pointer to the parameter we passed in }

In this case what we are testing is that when we pass in a placeholder–the a2 variable–we want the item from the argument list but when we pass in something else, anything else, we want that very same thing back instead. A placeholder then needs to simply be an index wrapped up in a very specific type:

struct Placeholder : Object { Placeholder(size_t i) : index(i) {} size_t get() const { return index; } private: size_t index; };

We also need access to particular placeholders to make our lives easier and to provide a well understood language for expressing, “I want the nth argument to go here in the invocation of function ‘f’.” In this runtime system we simply declare a few in the header and then later initialize them as constant globals:

// binder.h: extern Placeholder *const a1; extern Placeholder *const a2; extern Placeholder *const a3; // binder.cpp: Placeholder *const a1 = new Placeholder(0); Placeholder *const a2 = new Placeholder(1); Placeholder *const a3 = new Placeholder(2);

Now the lookup function is rather rudimentary:

Object* select_arg(std::vector<Object*> const& args, Object * key) { if (Placeholder * ph = dynamic_cast<Placeholder*>(key)) { return args[ph->get()]; } return key; }

This will later serve as the central point to the bind invocation algorithm set.

Building argument lists

In this system we want to be able to bind to a function of arbitrary parameter count that can then be invoked with as many arguments as the call site provides. This requires variadic functionality and in our runtime system that means C-style variadic functions, the kind with ‘…’ as an argument. This will be done in at least two places: binder creation and binder invocation. Thus it is prudent to make this a separate operation that can be reused. First, the test:

std::vector<Object*> call_build_args(Object * obj, ...) { va_list ap{}; va_start(ap, obj); std::vector<Object*> args = build_args(obj, &ap); va_end(ap); return args; } BOOST_AUTO_TEST_CASE(build_args_test) { std::vector<Object*> args = call_build_args(new Integer(2), new Integer(3), new Integer(5), END_ARGS); BOOST_CHECK(args.size() == 3); BOOST_CHECK(static_cast<Integer*>(args[0])->get() == 2); BOOST_CHECK(static_cast<Integer*>(args[1])->get() == 3); BOOST_CHECK(static_cast<Integer*>(args[2])->get() == 5); for(Object * obj : args) { delete obj; } args = call_build_args(END_ARGS); BOOST_CHECK(args.size() == 0); }

The “call_build_args” function simply makes it easier to call our utility function that will actually require a va_list as its second argument. There’s no way to just build a va_list outside the parameters of a variadic function. You’ll note above the use of another constant called “END_ARGS”. This can be implemented with a special type if desired, but I simply implemented it by making a placeholder I’d never, ever use:

// binder.h: extern Placeholder *const END_ARGS; //binder.cpp: Placeholder *const END_ARGS = new Placeholder(-1);

The “build_args” function now is again quite simple:

std::vector<Object*> build_args(Object * obj, va_list * ap) { std::vector<Object*> args{}; while (obj != END_ARGS) { args.push_back(obj); obj = va_arg(*ap, Object*); } return args; }

All we are doing is iterating the va_list we receive until we reach the end, which we must have a way of finding out and has been shown with the END_ARGS value, and shove the arguments into a vector of Object pointers. Now we have a way to easily construct our argument lists from any variadic function.

Invokers

The next major piece of the puzzle is an operation that can take a function, two lists of parameters, convert the two lists into a list of actual arguments and call the target function with them. This operation is actually going to be done here with the use of a dispatch table based on the number of parameters the target function takes. We know how many the target function takes because it’s the same amount that was provided to the binder creation.

To begin with we need a way to hand an arbitrary function type around to the invokers. This requires some type casting and by C++ standard it is type casting of the worst kind: reinterpret_cast. Used correctly though this construct’s use of reinterpret_cast won’t pose a major problem.

I did not write a unit test for this aspect of the system because it’s quite hidden within and is rather trivial. So here it is in full:

// binder.h typedef void (*FUNPTR)(); // binder.cpp typedef Object* (*INVOKER_T)(FUNPTR, std::vector<Object*> const&, std::vector<Object*> const&); namespace { Object* invoker0(FUNPTR fun, std::vector<Object*> const&, std::vector<Object*> const&) { return reinterpret_cast<Object*(*)()>(fun)(); } Object* invoker1(FUNPTR fun, std::vector<Object*> const& a, std::vector<Object*> const& b) { return reinterpret_cast<Object*(*)(Object*)>(fun)(select_arg(b, a[0])); } Object* invoker2(FUNPTR fun, std::vector<Object*> const& a, std::vector<Object*> const& b) { return reinterpret_cast<Object*(*)(Object*,Object*)>(fun)(select_arg(b, a[0]), select_arg(b, a[1])); } Object* invoker3(FUNPTR fun, std::vector<Object*> const& a, std::vector<Object*> const& b) { return reinterpret_cast<Object*(*)(Object*,Object*,Object*)>(fun)(select_arg(b, a[0]), select_arg(b, a[1]), select_arg(b, a[2])); } std::vector<INVOKER_T> invokers{invoker0, invoker1, invoker2, invoker3 }; }

Our system is thus limited to functions that can take 3 or less parameters, though we are not bound in any way by how many arguments we can provide when we invoke it…which you will see in the next section.

Note how each invoker that invokes a function that has arguments is asking to select the argument from the second list based on the element in the first at the relative position in the argument list. This is the important step that does all the work. If, for example, the list of arguments we created the bind with were “0, 2, 3” then we want f called with those exact elements. However, if we say instead something like “a2, 2, 3”, then what we’re really creating is a binder that expects at least two arguments and that second argument will form the FIRST in the actual function call.

The last piece

The last piece of the puzzle is the actual binder object that can be created with a bind expression and then later invoked. The test first:

Object* add_two(Object * obj1, Object * obj2) { Integer *int1 = dynamic_cast<Integer*>(obj1); Integer *int2 = dynamic_cast<Integer*>(obj2); return new Integer(int1->get() + int2->get()); } BOOST_AUTO_TEST_CASE(final_binder_check) { Integer i1{42}; Integer i2{5}; binder b1{reinterpret_cast<FUNPTR>(add_two), &i1, &i2, END_ARGS}; Integer * result = static_cast<Integer*>(b1(END_ARGS)); BOOST_CHECK(result->get() == 47); delete result; result = static_cast<Integer*>(b1(&i2, END_ARGS)); BOOST_CHECK(result->get() == 47); delete result; binder b2(reinterpret_cast<FUNPTR>(add_two), &i1, a2, END_ARGS); result = static_cast<Integer*>(b2(0, &i1, END_ARGS)); BOOST_CHECK(result->get() == 84); delete result; }

Here we make sure that various combinations of placeholder vs. true argument are applied correctly. For example, in the last test for the b2 binder we are making sure that i1 is taken from the creation site, i1 again from the call site, and used together to call “add_two” to get the appropriate result of i1 + i1.

This then gives us the interface we must create:

struct binder : Object { binder(FUNPTR, Object * obj, ...); Object* operator()(Object * obj, ...) const; private: FUNPTR fun; std::vector<Object*> args; };

The first step then is to create the first set of arguments from the creation site and store them for later:

binder::binder(FUNPTR f, Object * obj, ...) : fun(f), args() { va_list ap{}; va_start(ap, obj); args = build_args(obj, &ap); va_end(ap); }

We then simply use our dispatch table to find the invoker that calls functions with the correct set of arguments (based on what was supplied to us at the creation site) and call it:

Object * binder::operator() (Object * obj, ...) const { va_list ap{}; va_start(ap, obj); std::vector<Object*> args_b = build_args(obj, &ap); Object * rval = invokers[args.size()](fun, args, args_b); va_end(ap); return rval; }

And that is that. We are done!

Conclusion

So as you can see, the algorithm for arbitrary binders is actually quite basic. Take two lists of arguments, one given at the bind creation site and the second at the point of invokation; use the first list to decide which elements in the second to use and in what order, and then invoke the bound function with a thus combined set of arguments. Any time you want to create a bind type of expression this is how it is done. In future installment(s) I will show how to take this exact same algorithm and apply it with templates to a type-safe bind expression that is much faster, and also to template meta-programming to recreate the boost MPL lambda construct. You will find that these concepts apply directly in all three cases though the application is of course quite different.

I have left many bugs in this system. The most obvious of course is that this thing doesn’t take any care about memory leaks at all. If I ran my unit tests in valgrind I’d be yelled at a lot. You will need to account for this somehow but the directions you can go here are so numerous and which you use completely beside the point of this exercise that I just left it out both for my own sanity and to keep things here quite simple. When I wrote arbitrary binding expressions in C I used an Objective-C like system of autorelease pools in all objects so the memory issues are fairly easy to deal with. There are other ways however and assuming they’re consistent you should have no problem at all applying this concept to your own object system to get good results.

Share this: Twitter

Facebook

Like this: Like Loading... Related