In this post I want to describe a problem my colleagues have faced a couple of times recently, and show how it can be solved in C++. Here is the goal: we want to provide a function (or a set of overloaded functions) that will ‘do the right job’ for ‘practically any type’, or for ‘as many types as possible’. As an example of such a ‘job’, consider std::hash : what we want to avoid is the situation where you want to use some type X as a key in the standard hash map, but are refused because std::hash does not ‘work’ for X . In order to minimize the disappointment, the Standard Library makes sure std::hash works with any reasonable built-in or standard-library type. For all the other types, which the Standard Library cannot know in advance, it offers a way to ‘customize’ std::hash so that they too can be made to work with hash maps.

For another popular example, consider the Boost.Serialization library. Its goal is that almost any type should be serializable through the same interface: the library knows how to serialize built-in, std, and popular boost types, and it offers a way to teach it to serialize new types.

We are going to see a number of ways in which such a customizable framework can be implemented. We will be using information from the previous post: “Overload resolution”.

The task

As an example, I have chosen a task that is quite simple, but should serve my goal of illustrating the practical problems developers face along the way. We want to be able to tell how much memory a given object occupies: both on the stack and on the heap. Let me give you an example; suppose we illustrate an std::vector as follows:

We can see the blue part, representing the “handle”: the three pointers that give us access to the rest of the data; the size of this part can be measured with operator sizeof . The green color represents the allocated heap memory (let’s forget for the moment that there are different allocators). This part cannot be measured with sizeof and needs to be computed manually for each type.

Now, we want to provide a ‘framework’ that can already compute the memory usage for many popular and built-in types, and offers a way to compute the usage for new types.

Conceptually, the plan is as follows. We are going to have a function called something like mem_usage . Using it on a given type X should have the following effects:

1. For scalar types, it just uses operator sizeof . In fact, this can be generalized to all trivially-copyable types.
2. We provide a custom definition for a number of common types that we know in advance (e.g., std::vector , boost::optional ).
3. For other types, by default, a compile-time error should be issued.
4. We offer a way for the users to customize our framework for their own types.

Required member function — a non-solution

The first thing that usually comes to mind in this case (a sort of habit) is to think: let’s require of every type that participates in our framework that it provide a member function mem_usage ; then we just use it.

But this will not work. While you can force your colleagues on the team to add a member function mem_usage to all their classes, you cannot force the built-in scalar types to have a member, and you cannot force std types to have a member of your liking. In fact, this requirement is impossible to satisfy for any third-party library you may need to work with.

A perhaps even more hardcore idea is to require that, if you want some type X to be usable with our framework, it should derive from a polymorphic interface class MemUsageAble . Not only does this not solve the problem (of the framework working with any type), but it also unnecessarily increases the size occupied by the objects (they need to store a pointer to the vtable ); and in the case of our task this affects the very measurement. Also imagine that we need to use two frameworks: one forces the types to inherit from one polymorphic interface, the other requires them to inherit from another. This becomes unbearable.

Therefore, rather than expecting a member function of all the types, we had better define functions outside the type: this works for built-in scalar types, third-party types, and your own types alike.

Function overloads

From the previous post we already know we cannot use function template specializations. Thus, for our first attempt, we will use function (template) overloading and rely on ADL (argument-dependent lookup).

We can implement requirements 1 and 3 above with one function template. In order to test whether a given type is trivially copyable we can use the type trait std::is_trivially_copyable . However, because that trait is not available in GCC until version 5.0, I decided to use another one, std::is_trivial , so that the examples work on more compilers.

We will use the enable_if trick to conditionally remove our function from the overload set:

```cpp
namespace framework {

template <typename T>
typename std::enable_if<std::is_trivial<T>::value, size_t>::type
mem_usage1(const T& v)
{
  return sizeof v;
}

}
```

The type trait std::is_trivial is only available since C++11, but otherwise all the things we will be discussing here apply to C++03 (and you can use Boost libraries, such as Boost.TypeTraits or Boost.StaticAssert, to emulate some of the missing features). In the remaining examples, I will use the C++14 alias template std::enable_if_t : this is to make the examples shorter, but I still claim something similar is achievable in C++03, with a slightly longer syntax.

Now, how do you compute the memory usage of std::vector ? Assuming the default allocator, it is the size of the handle, plus the recursive memory usage of each vector element, plus the remaining capacity. But before we proceed to the implementation, we have to face a technical question: in which namespace should we define the function?

It is reasonable to assume that our framework will also come with a number of ‘algorithms’: function templates that make use of mem_usage1 , for instance:

```cpp
namespace framework {
namespace algo {

template <typename T>
size_t score(const T& v)
{
  // do some more things
  return mem_usage1(v);
}

}
}
```

In the previous post we concluded that, in order for the overload resolution to be immune to header inclusion order, we have to declare our overload in the namespace enclosing the type for which we are overloading. But this would mean declaring an overload of mem_usage1 in namespace std , and this in turn triggers undefined behavior. Quoting the Standard ([namespace.std]/1):

The behavior of a C++ program is undefined if it adds declarations or definitions to namespace std or to a namespace within namespace std unless otherwise specified.

Luckily, because std is almost part of the language, and every piece of the program knows about it and its contents, we can define our mem_usage1 overload inside namespace framework , just below the primary overload and prior to any other function that may need to use it:

```cpp
namespace framework {

// primary overload:
template <typename T>
std::enable_if_t<std::is_trivial<T>::value, size_t>
mem_usage1(const T& v)
{
  return sizeof v;
}

// overload for std::vector:
template <typename T>
size_t mem_usage1(const std::vector<T>& v)
{
  size_t ans = sizeof(v);
  for (const T& e : v) ans += mem_usage1(e);
  ans += (v.capacity() - v.size()) * sizeof(T);
  return ans;
}

}
```

We can get away with this only because std is so special: any other library in the world knows about std and can include it.

There is a price we have to pay for this trick, though. Our framework now unconditionally includes the header <vector> . Even users that never need to use it with vectors now transitively include the standard header.

But here comes the first problem. Suppose we also want to provide an overload for std::pair . We could define the overloads in the following order:

```cpp
namespace framework {

// primary overload:
template <typename T>
std::enable_if_t<std::is_trivial<T>::value, size_t>
mem_usage1(const T& v)
{
  return sizeof v;
}

// overload for std::vector:
template <typename T>
size_t mem_usage1(const std::vector<T>& v)
{
  size_t ans = sizeof(v);
  for (const T& e : v) ans += mem_usage1(e);
  ans += (v.capacity() - v.size()) * sizeof(T);
  return ans;
}

// overload for std::pair:
template <typename T, typename U>
size_t mem_usage1(const std::pair<T, U>& v)
{
  return mem_usage1(v.first) + mem_usage1(v.second);
}

}
```

But if we want to use these definitions with type std::vector<std::pair<int, int>> :

```cpp
int main()
{
  std::vector<std::pair<int, int>> vp;
  framework::mem_usage1(vp);
}
```

We get a compile-time error. This is because of the lookup rules in templates:

- For overloads defined in the namespaces of the types they operate on, we can see all of them.
- For overloads defined in the namespace of the function template we are parsing, we only see the overloads declared prior to our template.

In our case, we first select and parse the overload for std::vector , and this works; but inside it, we need to find an overload for std::pair . However, we are in namespace framework , so we see only the previous declarations, and the overload for std::pair is only declared later.

If we reversed the overload declarations, we would fix our particular problem, but we would introduce a similar one for type std::pair<std::vector<int>, std::vector<int>> .

The way to solve it is to use forward declarations:

```cpp
namespace framework {

// primary overload:
template <typename T>
std::enable_if_t<std::is_trivial<T>::value, size_t>
mem_usage1(const T& v)
{
  return sizeof v;
}

// forward declare overload for std::pair:
template <typename T, typename U>
size_t mem_usage1(const std::pair<T, U>& v);

// overload for std::vector:
template <typename T>
size_t mem_usage1(const std::vector<T>& v)
{
  size_t ans = sizeof(v);
  for (const T& e : v) ans += mem_usage1(e);
  ans += (v.capacity() - v.size()) * sizeof(T);
  return ans;
}

// overload for std::pair:
template <typename T, typename U>
size_t mem_usage1(const std::pair<T, U>& v)
{
  return mem_usage1(v.first) + mem_usage1(v.second);
}

}
```

Now, suppose we want to provide an overload for type boost::optional . This task is somewhat easier, because namespace boost is not special in any way, and we are allowed to add declarations to it:

```cpp
namespace boost {

template <typename T>
size_t mem_usage1(const optional<T>& v)
{
  using framework::mem_usage1;
  size_t ans = sizeof(v);
  if (v) ans += mem_usage1(*v) - sizeof(*v);
  return ans;
}

}
```

The memory occupied by boost::optional is its sizeof (the initialized-flag plus the storage for T ) plus, if the optional contains a value, the memory usage of the remote parts (the handle of T is already included in the sizeof ).

Now, because we define this overload in the same namespace as the argument type, we can put this declaration after any template that may be using it: it will be picked up by ADL in the second phase of overload resolution. However, we have to make sure that this overload is defined after the overload for std::vector , because otherwise the former will not see the latter if we use the framework with the type boost::optional<std::vector<int>> . By now it looks convoluted from the framework implementer’s perspective; but for the users we allow a flexible header inclusion model. That is, the following two orders will both work:

```cpp
#include <framework.hpp>
#include <boost/optional.hpp>
#include "glue_between_framework_and_optional.hpp"
```

```cpp
#include <boost/optional.hpp>
#include <framework.hpp>
#include "glue_between_framework_and_optional.hpp"
```

Also note that in the implementation of the last overload, I used a using -declaration. This is in order for the overload resolution to consider both namespace framework and the argument-dependent namespaces. If I forgot it, I would get a compile-time error. Similarly, if I just called framework::mem_usage1() , I would have disabled ADL, and would get a compile-time error in other cases.

Now, anyone who is going to use our overloads of mem_usage1 will have to do the same: put a using -declaration, and then call without namespace qualification. In order to spare the users this trouble, we can provide a convenience function that already does this:

```cpp
namespace framework {

template <typename T>
size_t mem_usage(const T& v)
{
  return mem_usage1(v);
}

}
```

Because I am declaring it also in namespace framework , I can skip the using -declaration: it is implied. But the users can now call it qualified:

```cpp
int main()
{
  boost::optional<std::vector<int>> ov;
  framework::mem_usage(ov); // works!
}
```

Going back to the overload for boost::optional , it works because optional in the current Boost version (1.59) is declared directly in namespace boost :

```cpp
namespace boost {
  template <typename T> class optional;
}
```

If it were changed to:

```cpp
namespace boost {
  namespace optional_ns {
    template <typename T> class optional;
  }
  using namespace optional_ns;
}
```

My overload would stop working, even though boost::optional itself would still work. (And there are good reasons to add such an additional namespace optional_ns ; at some point it might in fact happen.) I do not know how to prepare this framework solution for such a namespace change.

Another drawback of this overload-based solution is that it is easy to misspell the name of one of the overloads. The compiler will not protest at the point where you define your framework; it will only protest when the users try to use it.

This framework design was chosen for std::swap (and boost::swap is the equivalent of the wrapper framework::mem_usage from our example). Our example differs from std::swap , though, for two reasons. First, we cannot afford to define our framework in namespace std . Second, we do not provide a default implementation that works for any type T before the user provides her customization. This way we avoid a class of ODR-violation problems that std::swap comes with.

For a full working example of this framework design, see here.

Function overloads with ADL tag

A lot of the complications in the previous design come from the fact that we cannot declare overloads in namespace std . Declaring overloads in foreign namespaces (like boost ) works, but is susceptible to the injection of in-between namespaces (like boost::optional_ns in the above example); it also looks a bit inelegant and confusing: why do we want to declare something in somebody else’s namespace?

These problems can be avoided with a clever trick. We change the interface of our function (template) so that it takes a second argument, which does not change across the overloads and is declared in our namespace framework :

```cpp
namespace framework {

struct adl_tag {}; // empty class

// primary overload:
template <typename T>
std::enable_if_t<std::is_trivial<T>::value, size_t>
mem_usage2(const T& v, adl_tag)
{
  return sizeof v;
}

}
```

Can you see what it buys us?

If we now define an overload for std::vector in namespace framework and call it without scope qualification:

```cpp
namespace framework {

template <typename T>
size_t mem_usage2(const std::vector<T>& v, adl_tag tag)
{
  size_t ans = sizeof(v);
  for (const T& e : v) ans += mem_usage2(e, tag); // pass the tag down
  ans += (v.capacity() - v.size()) * sizeof(T);
  return ans;
}

}

int main()
{
  std::vector<int> v;
  mem_usage2(v, framework::adl_tag{});
}
```

It just works! It works because now we have two function arguments: one from namespace std , the other from namespace framework . The second phase of overload resolution in templates (as well as overload resolution outside templates) performs an argument-dependent lookup, and because we have two arguments, two namespaces are searched. This way we force ADL to search namespace framework regardless of the namespace in which the first argument’s type is defined; and because in the second phase we consider even the overloads declared after the template that calls them, we do not have to be concerned about the order of the overload declarations.

To some extent, this is the approach taken by the Boost.Serialization library: it expects that one of the arguments in the overloads is always a serialization ‘archive’, which corresponds to our ADL tag; but because the ‘archive’ has meaningful state, the solution does not look that tricky.

We can conceal the confusing tag from the users by defining a wrapper function:

```cpp
namespace framework {

template <typename T>
size_t mem_usage(const T& v)
{
  return mem_usage2(v, adl_tag{});
}

}
```

For a full working example of this framework design see here.

Class template specializations

As we have seen in the previous post, the natural choice for such framework customizations, function template specializations, does not work, because one is not allowed to partially specialize a function template. However, this restriction does not apply to class templates, so we might as well use those. It is going to be a bit artificial, as we do not really need any class with state, but it happens to work. We will be specializing and instantiating classes only to call one static member from their scope.

So, the first task: how do we make a generic function that returns sizeof(x) for a trivially copyable type (or a trivial one, due to the GCC shortcoming), fails to compile for other types, and is implemented entirely with classes?

```cpp
namespace framework {

template <typename T>
struct mem_usage3
{
  static_assert(std::is_trivial<T>::value, "customize!");

  static size_t get(const T& v) { return sizeof v; }
};

}
```

Our master template (unless specialized) binds to any type; except that for non-trivial types it fires a compile-time error with a static_assert . The usage is a bit clumsy:

```cpp
int main()
{
  int i = 0;
  framework::mem_usage3<int>::get(i);
}
```

But again, we can wrap it into a convenience function:

```cpp
namespace framework {

template <typename T>
size_t mem_usage(const T& v)
{
  return framework::mem_usage3<T>::get(v);
}

}

int main()
{
  int i = 0;
  framework::mem_usage(i);
}
```

We customize the framework by declaring (partial or full) class template specializations. Here is an example for std::pair :

```cpp
namespace framework {

template <typename T, typename U>
struct mem_usage3<std::pair<T, U>>
{
  static size_t get(const std::pair<T, U>& v)
  {
    return mem_usage3<T>::get(v.first) + mem_usage3<U>::get(v.second);
  }
};

}
```

Specializing for other types is quite simple: you do it in exactly the same namespace as the master template. An additional safety feature that comes with this technique is that if you make a spelling mistake while customizing the framework, the compiler will immediately issue an error, because you are specializing a nonexistent class template.

Another prominent difference from the overload-based solutions is that a class specialization for X does not automatically work for types publicly derived from X . Let me explain. If you have two types related by inheritance:

```cpp
namespace ns_x { struct X {}; }
namespace ns_y { struct Y : ns_x::X {}; }
```

And you define a function (overload) for X reachable via ADL:

```cpp
namespace ns_x {
  size_t mem_usage1(const X&) { return 1; }
}
```

It becomes immediately reachable via ADL for Y , even if the two types reside in unrelated namespaces:

```cpp
int main()
{
  ns_y::Y y;
  mem_usage1(y); // works
}
```

This may be a desired or an adverse feature, depending on the types X and Y and the logic of the overloaded function; but the point is: you get it with the overload-based techniques, and you do not get it with the class template specialization technique.

For a full working example of this framework implementation, see here. This technique was chosen for std::hash , although it may not be visible at first, because in the case of std::hash a non-static member function is used (the function call operator), which requires creating a temporary object:

```cpp
int main()
{
  int i = 0;
  std::hash<int>{}(i);
}
```

But the idea stays the same.

This technique becomes an attractive choice when a framework requires two or more operations to be available on the type. The class scope becomes a convenient way of bundling the operations together.

Conclusion

In all three techniques there is one common aspect: the customization points (named mem_usage1 , mem_usage2 and mem_usage3 ) are separate from the exposed interface: function mem_usage . This is a particular application of a good general practice: separate implementation from customization points. This applies not only to templates and overloads; for another application see the article “Virtuality” by Herb Sutter.

While std::swap can be used as both the customization point and the interface, there are many problems with it (forgetting to include some overloads being one of them). There are attempts at providing a separate interface and customization points that nevertheless share the same name, as proposed by Eric Niebler in N4381.

I am very grateful to Tomasz Kamiński for sharing his insights on the subject with me, and helping improve this post.

References