Recently, I was hit by a C++11 gotcha. It is funny: I know about it, I have blogged about it, and I still fell into the trap.

Do you remember my other post on efficient optional values? I am using the tool at work, and I tried to define an empty-state policy for std::string. Which value of std::string can be “spared” to represent the non-value? I am pretty sure it cannot be the empty string. Empty strings are used too often for various purposes, and I can easily imagine that in many applications one may want to distinguish between an empty string and not-a-string. Fortunately, there exist better candidates. For instance, in my programs I never need to use the character '^', and even if other people use it, they most likely never need the control character of numeric value 2. Or a string composed of three characters of numeric value 0. (Remember, std::string can contain embedded zeros.) I decided to give the users a choice: my policy is a template, and one can specify which special character to use, and how many times it is to be repeated:

    #include <algorithm>
    #include <cstddef>
    #include <string>

    template <char CH, size_t SIZE>
    struct empty_string_policy
    {
        static std::string empty_value() // may allocate
        {
            return std::string{SIZE, CH};
        }

        static bool is_empty_value(const std::string& v) // no alloc
        {
            return v.size() == SIZE
                && std::all_of(v.begin(), v.end(), [](char c){ return c == CH; });
        }
    };

Suppose you want to represent the not-a-string value as three null characters. You just define an alias that reflects that:

using Null3Policy = empty_string_policy<'\0', 3>;

When I used it in my program, I observed that it had a bug. After a while of investigation, the problem boiled down to the following assertion:

    Null3Policy p;
    assert (p.is_empty_value(p.empty_value()));

Apparently, function is_empty_value checks for something other than what function empty_value creates. But which one is wrong, and how?

Note that I used braces to initialize the returned string. As indicated in this post, brace initialization is intended to be a superior alternative to the old-style function-call-like syntax. We are also aware (or are we?) of the container initialization gotcha related to this feature: namely, the sequence constructor (the one taking std::initializer_list) could be inadvertently selected. But that is not our case. We are passing an object of type size_t as the first argument. While size_t is convertible to char, it is definitely a narrowing conversion, and as we know, triggering a narrowing conversion in brace-initialization would result in a compile-time failure (and our test compiles fine). Here is the relevant quote from the C++11 standard, 8.5.4/3:

List-initialization of an object or reference of type T is defined as follows: […] if T is a class type, constructors are considered. The applicable constructors are enumerated and the best one is chosen through overload resolution. If a narrowing conversion is required to convert any of the arguments, the program is ill-formed.

So, empty_value looks fine; maybe, then, the problem is in is_empty_value. The natural way (for me) to check which one it is, is to inspect the value after creation with the debugger. But the debugger displays an empty string. No wonder: it displays it as a C-string, and I wanted all zeros. But even when I change my policy to empty_string_policy<'^', 3> I get the same bug, and the debugger still renders an empty string. Only when I inspect the underlying array in binary mode do I observe that the first character is '\3'. So it looks like it might have been the sequence constructor after all. I check the string size: it is 2. It is the sequence constructor, then; but if so, how did it survive the narrowing conversion?
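The observation can be reproduced in isolation. What follows is only a minimal sketch: the names braced, parenthesized and check_constructors, as well as the namespace-scope constants SIZE and CH, are made up for illustration. The constants are constant expressions, just like the template parameters of the policy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Illustrative constants (assumed names). Like the template parameters
// of the policy, both are constant expressions.
constexpr std::size_t SIZE = 3;
constexpr char CH = '^';

// Braces: the constructor taking std::initializer_list<char> is selected,
// so the result is the two-character string { char(3), '^' }.
std::string braced() { return std::string{SIZE, CH}; }

// Parentheses: the (count, character) constructor is selected,
// so the result is three copies of '^'.
std::string parenthesized() { return std::string(SIZE, CH); }

// Hypothetical helper that spells out the difference.
void check_constructors()
{
    assert(braced().size() == 2);
    assert(braced()[0] == '\3');
    assert(braced()[1] == '^');
    assert(parenthesized().size() == 3);
    assert(parenthesized() == "^^^");
}
```

The two functions differ only in the kind of brackets, yet they return strings of different lengths.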

The answer lies in another surprising C++ behavior. There is no narrowing conversion in our example! The standard says, in 8.5.4/7:

A narrowing conversion is an implicit conversion […] from an integer type or unscoped enumeration type to an integer type that cannot represent all the values of the original type, except where the source is a constant expression and the actual value after conversion will fit into the target type and will produce the original value when converted back to the original type.

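This means that whether a conversion is narrowing or not depends not only on the source and target types, but also on the converted value! In a way this makes sense. As long as the compiler knows the value, it can see that there is no risk of losing the information about the value after the conversion, so it can safely allow it. In our example SIZE is a template parameter, hence a constant expression, and its value 3 fits into char. The conversion is therefore not narrowing, the sequence constructor is viable, and in brace-initialization it takes precedence over the constructor taking a size and a character.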
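The rule can be illustrated with a plain char, without involving std::string at all. This is a sketch under the assumption of a platform with 8-bit char; the function names are hypothetical. The two ill-formed cases are left in a comment so that the example still compiles.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical function: the source is a constant expression whose value (3)
// fits in char and converts back unchanged, so this is NOT a narrowing conversion.
char from_small_constant()
{
    constexpr std::size_t n = 3;
    char c{n};
    return c;
}

// Both of the following are ill-formed and must be diagnosed
// (a compiler may report the second one as a warning rather than an error):
//
//   constexpr std::size_t big = 1000;
//   char c1{big};       // constant expression, but 1000 does not fit in an 8-bit char
//
//   std::size_t runtime = 3;
//   char c2{runtime};   // the value would fit, but the source is not a constant expression

// Hypothetical helper that checks the well-formed case.
void check_narrowing()
{
    assert(from_small_constant() == '\3');
}
```

The source and target types are the same in all three cases. Only the value, and whether the compiler knows it, decides if the conversion counts as narrowing.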

My takeaway from this experience is this. I still maintain that the containers’ constructor where you specify the size is bug-prone (as indicated in this post). If you have to use it (I had to, in my example), never, ever, ever use braces. Even when you know it is absolutely safe. Do you think you know C++?
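For completeness, here is a sketch of how the policy might look with that advice applied. The only intended change to the policy is the parentheses in empty_value; the helper check_policy is hypothetical and exists only to show that the assertion from the beginning of the post now holds.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

template <char CH, std::size_t SIZE>
struct empty_string_policy
{
    static std::string empty_value() // may allocate
    {
        // Parentheses select the (count, character) constructor:
        // SIZE copies of CH.
        return std::string(SIZE, CH);
    }

    static bool is_empty_value(const std::string& v) // no alloc
    {
        return v.size() == SIZE
            && std::all_of(v.begin(), v.end(), [](char c){ return c == CH; });
    }
};

using Null3Policy = empty_string_policy<'\0', 3>;

// Hypothetical helper: the two functions of the policy now agree.
void check_policy()
{
    assert(Null3Policy::empty_value().size() == 3);
    assert(Null3Policy::is_empty_value(Null3Policy::empty_value()));
    assert(!Null3Policy::is_empty_value(std::string()));
}
```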