C++17 added support for non-member std::size, std::empty and std::data. They are little gems for generic programming. Such functions have the same purpose of std::begin and the rest of the family: not only can’t you call functions on C-arrays (e.g. arr.begin() or arr.size()), but also free-functions allow for more generic programming because they can be added afterwards on classes you cannot modify.

This post is just a note about using std::size and std::empty on static C-strings (statically sized). Maybe it’s a stupid thing but I found more than one person others than me falling into such “trap”. I think it’s worth sharing.

To make it short, some time ago I was working on a generic function to compare strings under a certain logic that is not important to know. In an ideal world I would have used std::string_view, but I couldn’t mainly for backwards-compatibility. I could, instead, put a couple of template parameters. Imagine this simplified signature:

template<typename T1, typename T2> bool compare(const T1& str1, const T2& str2);

Internally, I was using std::size, std::empty and std::data to implement my logic. To be fair, such functions were just custom implementations of the standard ones (exhibiting exactly the same behavior) – because at that time C++17 was not available yet on my compiler and we have had such functions for a long time into our company’s C++ library.

compare could work on std::string, std::string_view (if available) and static C-strings (e.g. “hello”). While setting up some unit tests, I found something I was not expecting. Suppose that compare on two equal strings results true, as a normal string comparison:

EXPECT_TRUE(compare(string("hello"), "hello"));

This was not passing at runtime.

Internally, at some point, compare was using std::size. The following is true:

std::size(string("hello")) != std::size("hello");

The reason is trivial: “hello” is just a statically-sized array of 6 characters. 5 + the null terminator. When called in this case, std::size just gives back the real size of such array, which clearly includes the null terminator.

As expected, std::empty follows std::size:

EXPECT_TRUE(std::empty("")); // ko EXPECT_TRUE(std::empty(string(""))); // ok EXPECT_TRUE(std::empty(string_view(""))); // ok

Don’t get me wrong, I’m not fueling an argument: the standard is correct. I’m just saying we have to be pragmatic and handle this subtlety. I just care about traps me and my colleagues can fall into. All the people I showed the failing expectations above just got confused. They worried about consistency.

If std::size is the “vocabulary function” to get the length of anything, I think it should be easy and special-case-free. We use std::size because we want to be generic and handling special cases is the first enemy of genericity. I think we all agree that std::size on null-terminated strings (any kind) should behave as strlen.

Anyway, it’s even possible that we don’t want to get back the length of the null-terminated string (e.g. suppose we have an empty string buffer and we want to know how many chars are available), so the most correct and generic implementation of std::size is the standard one.

Back to compare function I had two options:

Work around this special case locally (or just don’t care), Use something else (possibly on top of std::size and std::empty).

Option 1 is “local”: we only handle that subtley for this particular case (e.g. compare function). Alas, next usage of std::size/empty possibly comes with the same trap.

Option 2 is quite intrusive although it can be implemented succinctly:

namespace mylib { using std::size; // "publish" ordinary std::size // on char arrays template<size_t N> constexpr auto size(const char(&)[N]) noexcept { return N-1; } // other overloads...(e.g. wchar_t) }

You can even overload on const char* by wrapping strlen (or such). This implementation is not constexpr, though. As I said before: we cannot generally assume that the size of an array of N chars is N – 1, even if it’s null-terminated.

mylib::empty is similar.

EXPECT_EQ(5, mylib::size("hello")); // uses overload EXPECT_EQ(5, mylib::size(string("hello")); // use std::size EXPECT_EQ(3, (mylib::size(vector<int>{1,2,3})); // use std::size

Clearly, string_view would solves most of the issues (and it has constexpr support too), but I think you have understood my point.

[Edit] Many people did not get my point. In particular, some have fixated on the example itself instead of getting the sense of the post. They just suggested string_view for solving this particular problem. I said that string_view would help a lot here, however I wrote a few times throughout this post that string_view was not viable.

My point is just be aware of the null-terminator when using generic functions like std::size, std::empty, std::begin etc because the null-terminator is an extra information that such functions don’t know about. That’s it. Just take actions as you need.

Another simple example consists in converting a sequence into a vector of its underlying type. We don’t want to store the null-terminator for char arrays. In this example we don’t even need to use std::size but just std::begin and std::end (thanks to C++17 template class deduction):

template<typename T> auto to_vector(const T& seq) { return vector(begin(seq), end(seq)); }

Clearly, this exhibits the same issue discussed before, requiring extra logic/specialization for char arrays.

I stop here, my intent was just to let you know about this fact. Use this information as you like.

Conclusions

TL;DR: Just know how std::size and std::empty work on static C-strings.

static C-strings are null-terminated arrays of characters (size = number of chars + 1),

std::size and std::empty on arrays simply give the total number of elements,

be aware of the information above when using std::size and std::empty on static C-strings,

it’s quite easy to wrap std::size and std::empty for handling strings differently,

string_view could be helpful.