Recently I came across an interesting gotcha with the Boost.Pointer Container library in my project. Making incorrect assumptions about what the library does can cause a bug.

What would you use boost::ptr_vector for? Why would you need to have a vector of pointers, which you want to delete yourself? Is it because:

1. You want the objects to remain at the same address even if you re-allocate the array under the vector?
2. You want to inter-operate with a library that already deals with owning pointers?
3. You want it to be faster than if you were storing values in std::vector?
4. You want the “polymorphic behavior” of your objects?

If your reason is (1) or (2) and you are not too concerned with performance, you are probably doing the right thing.

If your reason is (3), it is likely that you would be picking the slower solution. But do not trust me on that: measure the two solutions and check if ptr_vector is really faster.
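
One way to run such a measurement is sketched below. It is only a rough harness under stated assumptions: Point, make_values, make_pointers, sum and time_us are names I made up for illustration, and std::vector<std::unique_ptr<Point>> stands in for the pointer-based layout so that the sketch compiles without Boost. To measure ptr_vector itself, swap it in for the second container (its iterators yield references rather than pointers, so the loop body changes accordingly).

```cpp
#include <chrono>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical element type used only for this measurement.
struct Point { double x, y; };

// Elements stored by value, contiguously.
std::vector<Point> make_values(std::size_t n) {
    std::vector<Point> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i) v.push_back(Point{double(i), 1.0});
    return v;
}

// Elements allocated one by one on the heap; the vector stores owning pointers.
std::vector<std::unique_ptr<Point>> make_pointers(std::size_t n) {
    std::vector<std::unique_ptr<Point>> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        v.push_back(std::unique_ptr<Point>(new Point{double(i), 1.0}));
    return v;
}

double sum(const std::vector<Point>& v) {
    double s = 0;
    for (const Point& p : v) s += p.x + p.y;
    return s;
}

double sum(const std::vector<std::unique_ptr<Point>>& v) {
    double s = 0;
    for (const auto& p : v) s += p->x + p->y;
    return s;
}

// Runs f once and returns the elapsed time in microseconds.
template <class F>
long long time_us(F f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

When timing, make sure the result of sum is actually used (print it, for instance), or the optimizer may remove the loop; and repeat the measurement several times on data sizes that resemble your real workload.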

If your reason is (4) and your familiarity with ptr_vector is superficial (as was mine when writing this post), it is likely that you would be implementing a bug. In this post we will be exploring this use case.

Suppose we have the following base class representing an interface (in OO sense):

    struct Consumer
    {
      virtual void consumeData(const Data&) {} // no-op by default
      virtual void consumeTime(const time&) {} // no-op by default
      virtual ~Consumer() {}
    };

It can be passed to any piece of work and do either of the two things:

1. Collect points in time, e.g. for measuring time, velocity, etc.
2. Collect any arbitrary data, e.g. for logging.

Any implementation of the interface can do both these things or only one. This is why we provide the default no-op implementation. For instance, if we want to implement a simple logger that adheres to this interface, we can do it like this:

    #include <iostream>

    struct CoutLogger : Consumer
    {
      void consumeData(const Data& d) override { std::cout << d; }
    };

There is no need to override consumeTime; the default no-op implementation will be used.

If, for some reason, we need to store these Consumers in a collection, we could use a ptr_vector:

    #include <boost/ptr_container/ptr_vector.hpp>

    boost::ptr_vector<Consumer> make_consumers()
    {
      boost::ptr_vector<Consumer> ans;
      ans.push_back(new CoutLogger); // leak-safe, even on realloc
      // push_back more...
      return ans;
    }

Returning it by value works fine and, as we know, it does no (or at least does not have to do any) copying. Now, suppose that at some point we need to copy such a vector. For instance, because we want to use it in multiple threads, and we want each thread to have its own copy in order to avoid any data races or locking problems. What happens if we copy a ptr_vector?

The library knows that you expect elements to appear as though they were stored by value, so in order to fulfill this expectation it will attempt to make a deep copy. But because it has no means of telling what the most derived type of the object behind a pointer to Consumer is, it will copy the elements assuming that their real type is Consumer. In other words, it will slice the objects. Thus the resulting copy will only store pointers to plain Consumer objects, with the trivial implementation of both member functions! The worst thing about this is that such a copy will compile, and the program will be doing something that may even look correct. We will likely learn about the problem only from the users!
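
The slicing itself can be reproduced without Boost. The sketch below is only an illustration under stated assumptions: naive_clone is a made-up function that copies an element knowing nothing but its static type, which is the situation described above; CountingLogger is a hypothetical implementation that counts calls instead of printing, so that the effect is observable; and Data and Timestamp are stand-ins for the types used in this post.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for the post's Data and time types.
struct Data { std::string text; };
struct Timestamp { long ticks; };

struct Consumer {
    virtual void consumeData(const Data&) {}       // no-op by default
    virtual void consumeTime(const Timestamp&) {}  // no-op by default
    virtual ~Consumer() {}
};

// Hypothetical implementation: counts every piece of data it receives.
struct CountingLogger : Consumer {
    int* counter;
    explicit CountingLogger(int* c) : counter(c) {}
    void consumeData(const Data&) override { ++*counter; }
};

// Copies using only the static type: the CountingLogger part is sliced off.
Consumer* naive_clone(const Consumer& c) { return new Consumer(c); }

// Feeds one Data to the original and one to the copy,
// then reports how many of the two calls were counted.
int calls_seen_through_original_and_copy() {
    int calls = 0;
    CountingLogger original(&calls);
    std::unique_ptr<Consumer> copy(naive_clone(original));

    original.consumeData(Data{"a"});  // counted
    copy->consumeData(Data{"b"});     // silently does nothing
    assert(calls == 1);
    return calls;
}
```

Everything compiles, nothing crashes, and the copy simply ignores its input.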

This is perhaps one of the reasons why we should strive to make our interfaces abstract or at least non-copyable. My interface with default implementations may look fishy, but even if I want to preserve this idea I could have made the class abstract:

    struct Consumer
    {
      virtual void consumeData(const Data&) {}
      virtual void consumeTime(const time&) {}
      virtual ~Consumer() = 0; // pure virtual destructor
    };

    inline Consumer::~Consumer() {} // default implementation!

This exploits an interesting feature of C++: pure virtual functions can still have a body. One other thing I could have done with my interface is to derive it from boost::noncopyable.
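
The effect of the pure virtual destructor can be checked at compile time with standard type traits. This is a small sketch, not part of the original code: NullConsumer is a hypothetical implementation added only to show that derived classes remain concrete, and Data and Timestamp are stand-ins for the types used in this post.

```cpp
#include <type_traits>

// Hypothetical stand-ins for the post's Data and time types.
struct Data {};
struct Timestamp {};

struct Consumer {
    virtual void consumeData(const Data&) {}
    virtual void consumeTime(const Timestamp&) {}
    virtual ~Consumer() = 0;  // pure virtual: makes the class abstract
};

inline Consumer::~Consumer() {}  // the body is still required

// Hypothetical implementation: its implicit destructor overrides the
// pure virtual one, so the class is concrete again.
struct NullConsumer : Consumer {};

static_assert(std::is_abstract<Consumer>::value,
              "the interface cannot be instantiated on its own");
static_assert(!std::is_copy_constructible<Consumer>::value,
              "so 'new Consumer(c)' no longer compiles");
static_assert(!std::is_abstract<NullConsumer>::value,
              "implementations are unaffected");
```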

Had I done either of these, the attempt to copy a ptr_vector<Consumer> would result in a compilation failure. That would be an improvement, but the ideal situation would be to have the copying do the right thing. It is doable, but it requires some effort from us. The procedure is described in detail in the library documentation. In short, you would have to augment the interface with a member function like clone:

    struct Consumer
    {
      virtual void consumeData(const Data&) {}
      virtual void consumeTime(const time&) {}
      virtual Consumer* clone() const = 0; // welcome to Java
      virtual ~Consumer() {}
    };

This forces every implementation to provide it as well. Then, you would have to define one free function in order to teach the library how objects in your hierarchy are cloned:

    inline Consumer* new_clone(const Consumer& c)
    {
      return c.clone();
    }
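
To see what an implementation then has to provide, here is a minimal sketch that calls new_clone directly, the way the container would for each element when it is copied. The sketch avoids Boost so that it stays self-contained. CountingLogger is a hypothetical implementation that counts calls so that the result is observable, and Data and Timestamp are stand-ins for the types used in this post.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins for the post's Data and time types.
struct Data { std::string text; };
struct Timestamp { long ticks; };

struct Consumer {
    virtual void consumeData(const Data&) {}
    virtual void consumeTime(const Timestamp&) {}
    virtual Consumer* clone() const = 0;
    virtual ~Consumer() {}
};

// The customization point: dispatches to the most derived type.
inline Consumer* new_clone(const Consumer& c) { return c.clone(); }

// Hypothetical implementation: counts every piece of data it receives.
struct CountingLogger : Consumer {
    int* counter;
    explicit CountingLogger(int* c) : counter(c) {}
    void consumeData(const Data&) override { ++*counter; }
    // Covariant return type; copies the full CountingLogger.
    CountingLogger* clone() const override { return new CountingLogger(*this); }
};

// Feeds one Data to the original and one to the clone,
// then reports how many of the two calls were counted.
int calls_seen_through_original_and_clone() {
    int calls = 0;
    CountingLogger original(&calls);
    std::unique_ptr<Consumer> copy(new_clone(original));

    original.consumeData(Data{"a"});  // counted
    copy->consumeData(Data{"b"});     // counted too: the clone kept its type
    assert(calls == 2);
    return calls;
}
```

The price is that every class in the hierarchy must remember to override clone; a class that forgets to do so while deriving from another concrete implementation will be sliced down to that implementation.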

But frankly, if you need to do this, I would consider using value-semantic polymorphism instead, as described in this post.