I was investigating a crash in my application recently. Despite working in the production environment, I was able to launch the debug version in a debugger (yay). Soon enough, I got a promising SIGSEGV in the expression implementing the erase-remove idiom:

vec. erase ( std :: remove_if ( vec. begin ( ) , vec. end ( ) , [ & orig ] ( auto const & msg ) { return orig. id == msg. id ; } ) ) ; vec.erase(std::remove_if(vec.begin(), vec.end(), [&orig](auto const& msg){ return orig.id == msg.id; }));

Can you spot the bug already?



If not: worry not, apparently this part of the standard library wasn’t designed to be readable or error-resilient.

First of all: remove rearranges the passed-to sequence, moving the elements fitting the predicate at the end of said sequence. The return value is the new end iterator for this range — past the last item that did not fit the removal criteria. An example:

vector < int > foo = { 1 , 2 , 3 , 4 , 5 } ; // remove even numbers auto it = remove_if ( foo. begin ( ) , foo. end ( ) , [ ] ( int a ) { return a % 2 == 0 ; } ) ; vector<int> foo = {1,2,3,4,5}; // remove even numbers auto it = remove_if(foo.begin(), foo.end(), [](int a){ return a%2 == 0; });

foo can now contain {1,3,5,2,4} (actually, elements past the new logical end of the container may have unspecified values, as allowed by MoveAssignable concept). it points to an element past the last valid one (3); so, in this example, *it == 4. If no element has been removed, nothing is changed and the actual end iterator is returned.

That’s about it for remove/remove_if. Now, the problem with std::vector‘s or most others standard containers’ erase is that it has two overloads:

iterator erase ( const_iterator pos ) ; // 1) iterator erase ( const_iterator first, const_iterator last ) ; // 2) iterator erase(const_iterator pos); // 1) iterator erase(const_iterator first, const_iterator last); // 2)

1) removes element pointed-to by the iterator. 2) removes a sequence of elements minus last (as is usual in the standard library).

Do you see the bug now?

The original code calls the first overload of erase, erasing a single element. Unfortunately, if no element matched remove_if‘s predicate, the original end is returned and the code is equivalent to:

vec. erase ( vec. end ( ) ) ; vec.erase(vec.end());

It is undefined behaviour.

Edit: as pointed out by a commenter, in the opposite case of more than one element being “removed” by the remove function, only one of them will be actually removed from the container. In my case, id’s were supposed to be unique, so that problem slipped my mind.

The fix is simple: either check if remove returns the end iterator or use the second overload of the function.

After my fix, my function looked more or less like the following:

auto removeIt = std :: remove_if ( vec. begin ( ) , vec. end ( ) , [ & orig ] ( auto const & msg ) { return orig. id == msg. id ; } ) ; vec. erase ( removeIt, vec. end ( ) ) ; auto removeIt = std::remove_if(vec.begin(), vec.end(), [&orig](auto const& msg){ return orig.id == msg.id; }); vec.erase(removeIt, vec.end());

This prompted me to write a helper function template to prevent this from ever happening again:

template < typename Container, typename Predicate > void erase_if ( Container & c, Predicate p ) { using std :: begin ; using std :: end ; using std :: remove_if ; auto realEnd = end ( c ) ; auto removedIt = remove_if ( begin ( c ) , realEnd, p ) ; c. erase ( removedIt, realEnd ) ; } template<typename Container, typename Predicate> void erase_if(Container& c, Predicate p) { using std::begin; using std::end; using std::remove_if; auto realEnd = end(c); auto removedIt = remove_if(begin(c), realEnd, p); c.erase(removedIt, realEnd); }

That being said, the real problem here is that it’s pretty easy to make such a mistake, and not as easy to spot it (especially for a newbie). As a C++ programmer, I’ve come to rely on my compiler (and occasional static analysis) to point out my mistakes and refuse to compile erroneous code and all those tools have failed me here.

Hopefully, this will change in C++17 and we’ll finally be able to use single function to easily remove elements from a range.