A reader asked:

In C++17, for std::string::data(), is the returned buffer valid for the range [data(), data() + size()), or is it valid for [data(), data + capacity())? The latter seems more intuitive and what I think most people would expect reserve() to create given the non-const version of data() since C++17.

… and then helpfully included the answer, but in fairness clearly they were wondering whether cppreference.com was correct:

Relevant quote from cppreference.com: … “Returns a pointer to the underlying array serving as character storage. The pointer is such that the range [data(); data() + size()) is valid and the values in it correspond to the values stored in the string.”

Yes, cppreference.com is correct. Here’s the quote from the current draft standard:

2 A specialization of basic_string is a contiguous container (22.2.1).

3 In all cases, [data(), data() + size()] is a valid range, data() + size() points at an object with value charT() (a “null terminator”), and size() <= capacity() is true.

Regarding this potential alternative:

or is it valid for [data(), data + capacity())?

No, that would be strange, because it would mean intentionally supporting reading uninitialized characters in any extra raw memory at the end of the string’s current memory block.

Note that the first part of the above quote from the standard hints at the consistency issue: A string is a container, and we want containers to be consistent. We certainly wouldn’t want vector<widget>::data() to behave that way and let callers see raw memory with unconstructed objects.

The latter [… is …] what I think most people would expect reserve() to create

c/reserve/resize/ and I’ll agree :)

Any container’s size()/resize() is about the data you stored in it and that it’s holding for you. Any container’s capacity()/reserve() is about the underlying raw memory buffer just to let you help the container optimize its raw memory management, but it isn’t intended to give you access to the allocated-but-unused memory.