Generalized Function Pointers

The function facility, recently adopted by the C++ standards committee, provides a generalized way of working with arbitrary functions when all you know (or need to know) is their signature. In fact, as it turns out, you don't even need to know the target function's exact signature -- any target function with a compatible signature, meaning one where the parameter and return types are appropriately convertible, will work just fine.

Last time [1], I gave an overview of tuple types, one of the first two library extensions that were approved in October 2002 for the standard library extensions technical report (the "Library TR"). In this article, I'll cover the other: Doug Gregor's proposal for polymorphic function object wrappers [2]. Note that both of these facilities come directly from the Boost project [3], an important (and free) collection of C++ libraries available for you to download and try out today.

In brief, the function facility provides a generalized way of working with arbitrary functions when all you know (or need to know) is their signature. In fact, as it turns out, you don't even need to know the target function's exact signature - any target function with a compatible signature, meaning one where the parameter and return types are appropriately convertible, will work just fine.

A Motivating Example

Functions and functionlike things are everywhere: C++ has member functions, static member functions, and nonmember (namespace scope) functions, and on top of that it has functors (objects having an overloaded operator()) that can be invoked as though they were functions. (I'm going to use the term "functors" for the latter, instead of the alternative "function objects," to avoid confusion with "function objects" which means objects of the function library we're discussing.)

Calling a plain old function is easy:

    // Example 1(a): Calling a nonmember function
    //
    string AppendA( string s ) { return s + 'a'; }

    string test = "test";

    // Making an immediate call is easy:
    //
    cout << AppendA( test ) << endl;    // immediate call

When it comes to using a function, however, it turns out that what and when to call aren't always decided at the same time. That is, a program may determine early on exactly which function needs to be called, long before it is ready to actually make the call; as noted in [2], a prime and common example is registering a callback function. In cases like that, we routinely know to squirrel away a reference to the function, typically by using a function pointer:

    // Example 1(a), continued
    //
    // Making a delayed call involves storing a pointer
    // (or reference) to the function:
    //
    typedef string (*F)( string );
    F f = &AppendA;                     // select function...

    // ... later...
    cout << f( test ) << endl;          // ... make delayed call

This is handy and pretty flexible, in that f can be made to point to any nonmember function that takes and returns a string.

But of course the above method doesn't work for all functions and functionlike things. For example, what if we instead wanted to do this with a member function? "That's easy," someone might say, "we just have to store a member function pointer instead of a free function pointer, and then use it appropriately." That's true, and it means we would write code that looks something like this:

    // Example 1(b): Calling a member function
    //
    class B {
    public:
      virtual string AppendB( string s ) { return s + 'b'; }
    };

    class Derived : public B {
      string AppendB( string s ) { return s + "bDerived"; }
    };

    string test = "test";
    B b;
    Derived der;

    // Making an immediate call is easy:
    //
    cout << b.AppendB( test ) << endl;  // immediate call

    // Making a delayed call involves storing a pointer
    // (or reference) to the member function, and requires
    // some appropriate object to call it on:
    //
    typedef string (B::*F)( string );
    F f = &B::AppendB;                  // select function...

    // ... later...
    cout << (b.*f)( test ) << endl;     // ... make delayed call

    cout << (der.*f)( test ) << endl;   // ... another delayed call,
                                        //  virtual dispatch works

That's fine, and this works, but note that it's already considerably less flexible than in Example 1(a). Why? Because the f in Example 1(b) can only be made to point to member functions of class B that take and return a string; this f can't be used to point to a member function of any other class, even if it otherwise has the right signature. While we're at it, note one more limitation: This f also doesn't carry around with it the knowledge of which B object to invoke. We can get rid of this last limitation, remembering the object on which the call should be made, by using the standard function binders:

    // Example 1(b), continued: Alternative, using binders
    //
    typedef binder1st<mem_fun1_t<string,B,string> > F2;
    F2 f2 = bind1st( mem_fun( &B::AppendB ), &b );

    // ... later...
    cout << f2( test ) << endl;         // ... make delayed call

    f2 = bind1st( mem_fun( &B::AppendB ), &der );

    // ... later...
    cout << f2( test ) << endl;         // ... another delayed call,
                                        //  virtual dispatch works

The type of F2 is unfortunately ugly, and it still is limited to binding to member functions of class B having an exact signature.

Of course, if the thing we wanted to call was actually a functor object, then none of the above options would work, this despite the fact that for immediate calls the functor can indeed be used seamlessly as though it were a function:

    // Example 1(c): Calling a functor
    //
    class C {
    public:
      string operator()( string s ) { return s + 'c'; }
    };

    string test = "test";
    C c;

    // Making an immediate call is easy:
    //
    cout << c( test ) << endl;          // immediate call

    // Making a delayed call is trickier. There's no easy way without
    // fixing the type of the object as a C, such as by taking a C&...:
    //
    typedef C& F;
    F f = c;                            // select functor...

    // ... later...
    cout << f( test ) << endl;          // ... make delayed call

    // ...or by arranging for C to inherit from some common base class
    // with a virtual operator() and then taking a pointer or reference
    // to that base.

That covers nonmember functions, member functions, and functors. Finally, on a note that applies to all of these, what if you have a function (or functor) that you could call directly just fine, but the parameter types, although compatible, aren't quite identical? Consider a variant of Example 1(a):

    // Example 1(d): 1(a) with slightly different types
    //
    // A little class just to get some conversions happening
    class Name {
    public:
      Name( string s ) : s_( s ) { }
      string Get() const { return s_; }
    private:
      string s_;
    };

    // Stubbed-in function to demonstrate varying
    // parameter and return types:
    const char* AppendD( const Name& s ) {
      static string x;
      x = s.Get() + 'd';
      return x.c_str();
    }

    string test = "test";

    // Making an immediate call is still easy:
    //
    cout << AppendD( test ) << endl;    // immediate call

    // But the typedef from Example 1(a) no longer works,
    // even though the rest of the call is unchanged:
    //
    typedef string (*F)( string );
    F f = &AppendD;                     // error: type mismatch

    // ... later (in our dreams)...
    cout << f( test ) << endl;          // ... never get here

Enter function

What if we had a facility that let you form a "function pointer" to any function or functionlike thing? That would be genuinely useful. Enough people think so, in fact, that many have gone off and written their own using C++'s powerful templates, and today several versions of such facilities exist; see [2] for some references. The function facility in the draft standard library technical report is just such a facility. "That's great, these already exist," one might think. "Why write yet another one for the library technical report?" Because the existing facilities are interesting (good), useful (even better), and not always portable or compatible (oops).

When something is genuinely and widely useful, and people keep reinventing the same wheel in slightly different variations, we have a prime candidate for standardization. After all, just imagine what the world would be like if, instead of one standard std::string, every vendor supplied their own incompatible string (or CString, or RWString, or blah_string) type with different names and different interfaces and features to learn and relearn on every system. Those of you who've been around C++ since about 1995 or earlier don't need to imagine the horror of lots of incompatible string types; you've lived it. Even today with a standard string, we still have some of those variant vendor-specific string types floating around, but at least there are far fewer of them than there used to be. Similarly now with generalized function pointers, the standards committee felt it was time we had a single general-purpose function wrapper that we can rely on portably, and the one that the committee has chosen to bless is function.

The function template takes a function signature as its only required template parameter. (Following the practice of much of the rest of the standard library, it also takes an optional allocator template parameter, which I'm going to ignore henceforth.) Its declaration looks like this:

    template<typename Function, typename Allocator = std::allocator<void> >
    class function;

How do you pass a function signature as a template parameter? The syntax may seem slightly odd at first, but once you've seen it it's just what you'd think:

    function< string (string) > f;  // can bind to anything that
                                    // takes and returns a string
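To see why a signature can serve as a template argument at all, it may help to look at how a template can take one apart. The following is only a minimal sketch, not how function itself is implemented: it assumes a C++11 or later compiler (variadic templates postdate this article), and the name signature_traits is hypothetical, introduced here purely for illustration.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical illustration, not part of any library.
// Primary template: declared but never defined, so that
// only function types (handled below) can be used.
template<typename Signature>
struct signature_traits;

// Partial specialization: matches any function type R(Args...)
// and exposes its return type and parameter count.
template<typename R, typename... Args>
struct signature_traits<R(Args...)> {
    using result_type = R;
    static constexpr std::size_t arity = sizeof...(Args);
};

// "string (string)" is an ordinary type: a function taking
// and returning a string.
static_assert(signature_traits<std::string(std::string)>::arity == 1,
              "one parameter");
static_assert(std::is_same<
                  signature_traits<std::string(std::string)>::result_type,
                  std::string>::value,
              "returns a string");
```

The point is simply that string (string) names a function type, and a partial specialization can pattern-match on it to recover the return and parameter types.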

Any function or functor that is callable with the specified signature can be bound to the function object. In particular, the function object above can bind to all of the functions shown in the earlier examples, as demonstrated in Example 1(e):

    // Example 1(e): One function<> to bring them all,
    //               and flexibly to bind them
    //
    // This code uses the same class and function definitions
    // from Examples 1(a) through (d)
    //
    string test = "test";
    B b;
    Derived der;
    C c;

    function<string (string)> f;    // can bind to anything that
                                    // takes and returns a string

    f = &AppendA;                   // bind to a free function
    cout << f( test ) << endl;

    f = bind1st( mem_fun( &B::AppendB ), &b );
    cout << f( test ) << endl;      // bind to a member function

    f = bind1st( mem_fun( &B::AppendB ), &der );
    cout << f( test ) << endl;      // with virtual dispatch

    f = c;                          // bind to a functor
    cout << f( test ) << endl;

    f = &AppendD;                   // bind to a function with a
    cout << f( test ) << endl;      // different but compatible signature

"Wow," you say? Wow indeed. Note in particular the implicit conversions going on in the last case, for which we didn't even have legal code in Example 1(d) - AppendD's actual parameter and return types are not string, but can be converted from and to a string, respectively, and that's good enough for function< string (string) >.
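For readers on a newer toolchain, here is a hedged sketch of the same idea. It assumes a C++11 or later compiler, where the facility is available as std::function in <functional>, and it swaps the bind1st/mem_fun binders (since removed from the standard in C++17) for a lambda. The helper BindAppendB is hypothetical and exists only for this illustration.

```cpp
#include <functional>
#include <string>

std::string AppendA(std::string s) { return s + 'a'; }

class B {
public:
    virtual ~B() = default;
    virtual std::string AppendB(std::string s) { return s + 'b'; }
};

class Derived : public B {
public:
    std::string AppendB(std::string s) override { return s + "bDerived"; }
};

class C {
public:
    std::string operator()(std::string s) { return s + 'c'; }
};

// A little class just to get some conversions happening
class Name {
public:
    Name(std::string s) : s_(s) {}
    std::string Get() const { return s_; }
private:
    std::string s_;
};

const char* AppendD(const Name& s) {
    static std::string x;
    x = s.Get() + 'd';
    return x.c_str();
}

using StringFn = std::function<std::string(std::string)>;

// Hypothetical helper: bind a member call to one particular object.
// The lambda captures obj by reference, so obj must outlive the result.
StringFn BindAppendB(B& obj) {
    return [&obj](std::string s) { return obj.AppendB(s); };
}
```

A single StringFn variable can then be assigned &AppendA, BindAppendB(b), BindAppendB(der), a C object, or &AppendD in turn, with the last relying on the same implicit conversions discussed above.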

Another Motivating Example: Higher-Order Functions

Let's say you want to write an arithmetic_operation function whose purpose in life is to return a functor that can perform some appropriate operation on two ints, so that the calling code can store it and later invoke it at the appropriate time. The catch: It has to decide at runtime what kind of operation will be needed; perhaps you want to add, or perhaps you want to subtract, or multiply, or divide, and so on. The arithmetic_operation function should decide which is needed, create an appropriate functor (e.g., std::plus<>, std::minus<>, etc.), and return it -- but then how do you write the return type of arithmetic_operation, when the functors being returned all have different types? What you need is a return type that can be constructed from any of those types of objects -- in this case, a function<int (int, int)>:

    // Example 2: Higher-order functions
    // (reproduced from [2], section Ib)
    //
    function<int (int x, int y)> arithmetic_operation(char k)
    {
      switch (k) {
        case '+': return plus<int>();
        case '-': return minus<int>();
        case '*': return multiplies<int>();
        case '/': return divides<int>();
        case '%': return modulus<int>();
        default: assert(0);
      }
    }

Note what's going on here: At runtime, the arithmetic_operation function decides what kind of operation will be needed. Since each type can be bound to a function<int (int, int)>, we're done; no fuss, no muss.
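To make the calling side concrete, here is a self-contained variant of Example 2 written as a sketch against std::function, assuming a C++11 or later compiler. One deliberate departure from the code in [2]: the default case throws std::invalid_argument instead of asserting, which is my own choice so that every path either returns or throws.

```cpp
#include <functional>
#include <stdexcept>

// Choose an operation at runtime and hand it back as a single type.
std::function<int(int, int)> arithmetic_operation(char k) {
    switch (k) {
        case '+': return std::plus<int>();
        case '-': return std::minus<int>();
        case '*': return std::multiplies<int>();
        case '/': return std::divides<int>();
        case '%': return std::modulus<int>();
        default:  throw std::invalid_argument("unknown operator");
    }
}

// Caller: select the operation now, invoke it later.
int apply_later(char k, int x, int y) {
    std::function<int(int, int)> op = arithmetic_operation(k);  // select...
    return op(x, y);                                            // ...call
}
```

The caller never names std::plus<int> or any of the other functor types; it only ever sees function<int (int, int)>.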

More To Come: Observer, Comparisons, and Multicast, Oh My!

Comparison: The only significant limitation of function is that function objects can't be compared, not even for equality. As I will argue in depth in the upcoming companion column, there are compelling reasons why function should provide at least equality comparison (== and !=). This is possible and usefully implementable, albeit with a ripple effect into the standard function binders (e.g., std::mem_fun).
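As a rough illustration of this limitation, the sketch below uses std::function as it was later standardized in C++11 (an assumption relative to the draft described here). There, a function object can be compared only against nullptr to test for emptiness; two function objects cannot be compared with each other. The helper wraps_function is hypothetical and shows one partial workaround using the target<>() member, which works only when you already know the exact type of the wrapped target.

```cpp
#include <functional>
#include <string>

std::string AppendA(std::string s) { return s + 'a'; }
std::string AppendA2(std::string s) { return s + "a2"; }

using StringFn = std::function<std::string(std::string)>;

// Hypothetical helper: true only if f currently wraps exactly the
// plain function fp. target<T>() returns a null pointer when the
// stored target is not of type T, so functors and lambdas never match.
bool wraps_function(const StringFn& f, std::string (*fp)(std::string)) {
    auto* stored = f.target<std::string (*)(std::string)>();
    return stored != nullptr && *stored == fp;
}

// Emptiness is the only built-in comparison.
bool is_empty(const StringFn& f) {
    return f == nullptr;
}

// Note: "f1 == f2" for two StringFn objects does not compile.
```

This is much weaker than real equality comparison: it cannot say whether two function objects wrap equivalent functors or bound member functions.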

Multicast: Another, but fortunately less significant, limitation of function is that a function object can only be bound to one thing at a time. The reason this is not too significant a limitation is that there's a perfectly good workaround: It's possible to build a multi_function on top of function pretty easily and effectively, and I will motivate and demonstrate that as well in the companion column, including a sample (i.e., working, but rough) implementation.
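To give a feel for the workaround, here is a deliberately simplified sketch, assuming a C++11 or later compiler and std::function. It is not the implementation from the companion column: the interface (add, size, operator()) is hypothetical, and it handles only void-returning targets, which sidesteps the question of how to combine several return values.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical multicast wrapper: holds several targets that all
// accept the same arguments, and calls each of them in order.
template<typename... Args>
class multi_function {
public:
    // Register one more target; anything callable with Args... works.
    void add(std::function<void(Args...)> f) {
        targets_.push_back(std::move(f));
    }

    // Invoke every registered target, in registration order.
    void operator()(Args... args) const {
        for (const auto& target : targets_) {
            target(args...);
        }
    }

    std::size_t size() const { return targets_.size(); }

private:
    std::vector<std::function<void(Args...)>> targets_;
};
```

Note that removing a previously added target would need some way to identify it, which is exactly where the lack of equality comparison starts to matter.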

Summary

For all of its power, function isn't all that expensive to use. Gregor notes: "A boost::function<> object is 3 pointers large (it could be optimized to 2 in most cases), and requires one extra indirect call (through a function pointer) for each invocation of the boost::function<> object. If one is careful, you can make sure that no big arguments are copied by the bridge to further reduce the penalty for large objects, although I haven't implemented this. (It's a simple application of, e.g., boost::call_traits to the parameter list)." [6]

Further, it's again a testament to the power of C++ that a function that works with all functions and functors can be implemented purely as a library facility, without any extensions to the existing standard C++ core language. Don't try this at home in less-flexible languages.

In the next The New C++ column, I'll give a trip report on the April 2003 C++ standards meeting, held in Oxford, UK. The big news: Whereas in the October 2002 meeting the first two extensions were adopted for the standard library extensions technical report, in April no fewer than ten more were added. What were they, and which ones might you particularly care about? Stay tuned.

References

[1] H. Sutter. "Tuples" (C/C++ Users Journal, 21(6), June 2003).

[2] D. Gregor. "A Proposal to add a Polymorphic Function Object Wrapper to the Standard Library," ISO/ANSI C++ standards committee paper (ISO/IEC JTC1/SC22/WG21 paper N1402, ANSI/NCITS J16 paper 02-0060).

[3] www.boost.org

[4] H. Sutter. More Exceptional C++ (Addison-Wesley, 2002). An earlier version of this material is available online at www.gotw.ca/gotw/057.htm.

[5] H. Sutter. "Generalizing Observer" (C/C++ Users Journal, 21(9), September 2003). [should be published within 2-3 weeks of this article]

[6] D. Gregor, private communication
