This post is about compacting storage for optional<T>, and how we can’t do it perfectly.

In general, implementing storage for optional<T> is straightforward, and falls out directly from its requirements. We need a T, we need to know whether or not we have a T, and we need to avoid constructing a T when we don’t have one, particularly if T isn’t default-constructible. Put all those requirements together and we end up with:

    template <typename T>
    struct optional_storage {
        struct empty { };
        union {
            empty _;
            T value;
        };
        bool has_value;
    };

No surprises there. Really, the most complicated part of implementing storage for optional<T> is the need to make it conditionally trivially copyable and trivially destructible (a situation that Casey Carter and I are hoping to improve). But that’s not what this post is about; this post is just about the actual storage. How much overhead are we adding on top of a T to be able to make it an optional<T>?

It depends on the alignment of T. In the best case, sizeof(optional<T>) == sizeof(T) + 1 (e.g. when T is char), but in the worst case the one-byte flag gets padded out to the alignment of T and sizeof(optional<T>) == 2 * sizeof(T) (e.g. when T is int)! But we need some overhead to store the flag, right? This is just the price of the extra semantics that optional gives us? Surely you can’t do better for optional<int> than having an int and a bool.
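To make those numbers concrete, here is a minimal sketch that repeats the optional_storage layout and checks both cases at compile time. It assumes a typical platform where sizeof(bool) == 1 and int is aligned to its own size; on an unusual ABI the second assertion could fail.

```cpp
// Same layout as the optional_storage sketch in the text: a union for the value plus a flag.
template <typename T>
struct optional_storage {
    struct empty { };
    union {
        empty _;
        T value;
    };
    bool has_value;
};

// char has alignment 1, so the flag costs exactly one extra byte.
static_assert(sizeof(optional_storage<char>) == sizeof(char) + 1,
              "best case: one byte of overhead");

// Assumption: int is aligned to sizeof(int), so the one-byte flag is followed
// by padding and the whole struct doubles in size.
static_assert(sizeof(optional_storage<int>) == 2 * sizeof(int),
              "worst case: the flag costs a full sizeof(T)");
```

The flag itself is always one byte; the rest of the overhead is padding that keeps arrays of the storage correctly aligned.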

Let’s start with something that the standard library doesn’t currently support: optional<T&>. (And it really should.) How do we provide storage for such a type? Do we need a union with a T& and an extra bool member? No! We can just do:

    template <typename T>
    struct optional_storage<T&> {
        T* value;
    };

This is because we can’t have null references in C++, so there is a bit pattern in the value representation that does not represent a valid value: the null pointer value. A T* can completely encapsulate every possible value of T& along with a value that indicates that we don’t have a value. All with zero size overhead. Wonderful.
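As a rough illustration of the saving, the sketch below compares a naive pointer-plus-flag layout against the pointer-only layout. The names flagged_ref_storage and compact_ref_storage are made up for this comparison and don’t appear anywhere else.

```cpp
// Hypothetical naive layout: a pointer plus a separate "engaged" flag.
template <typename T>
struct flagged_ref_storage {
    T* value;
    bool has_value;
};

// The compact layout from the text: nullptr doubles as the "empty" state.
template <typename T>
struct compact_ref_storage {
    T* value;
};

// The compact version is exactly as big as a raw pointer
// (true on mainstream ABIs, which add no padding around a lone pointer)...
static_assert(sizeof(compact_ref_storage<int>) == sizeof(int*),
              "no overhead over a raw pointer");

// ...while the flagged version has to find room for the bool somewhere.
static_assert(sizeof(flagged_ref_storage<int>) > sizeof(int*),
              "the separate flag costs extra bytes");
```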

What other types have bit patterns in their object representation that don’t correspond to any valid value? For lots of enums, there’s some value that is outside the set of values intended to be used. So we could provide some customization point for optional<E> through which the user nominates an invalid value in E’s range that we could use as a sentinel. That’s a pretty easy win, even if we have to ask the user to pick such a sentinel value for us.
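One way such a customization point could look is a trait that the user specializes for their enum. This is only a sketch under my own assumptions: optional_sentinel, compact_enum_storage and the color enum are hypothetical names, not part of any existing library.

```cpp
// Hypothetical customization point: specialize it to nominate an unused value of E.
template <typename E>
struct optional_sentinel;  // deliberately undefined for types that don't opt in

// Example enum (an assumption for illustration); only 0, 1 and 2 are meaningful.
enum class color : unsigned char { red, green, blue };

template <>
struct optional_sentinel<color> {
    // 0xFF fits the underlying type but names no enumerator.
    static constexpr color value() { return static_cast<color>(0xFF); }
};

template <typename E>
struct compact_enum_storage {
    E storage;

    constexpr compact_enum_storage() : storage(optional_sentinel<E>::value()) { }
    constexpr compact_enum_storage(E e) : storage(e) { }

    constexpr explicit operator bool() const {
        return storage != optional_sentinel<E>::value();
    }

    // Precondition: engaged. These are genuine references, since a real E is stored.
    E& lvalue() { return storage; }
    constexpr E const& lvalue() const { return storage; }
};

static_assert(sizeof(compact_enum_storage<color>) == sizeof(color), "no size overhead");
static_assert(!compact_enum_storage<color>(), "default-constructed storage is disengaged");

constexpr compact_enum_storage<color> engaged(color::green);
static_assert(engaged && engaged.lvalue() == color::green, "engaged storage holds its value");
```

The catch is that the sentinel is now off limits as a real value: constructing the storage from 0xFF would silently produce a disengaged optional, which is why the user has to be the one to pick it.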

A more complicated example of this kind of type is bool. A bool takes up one byte, which has 256 possible values (token pedantic parenthetical about CHAR_BIT here), but a bool only has two possible values: true and false. There’s no sentinel value that we could use for bool in the same way we could for null references and non-exhaustive enums. But if we store the value as a char instead, we could use 0 and 1 to indicate an engaged optional with that particular value, and that still leaves us with 254 other values that we can use as a sentinel to indicate that we don’t have a bool. For simplicity, let’s say 2 indicates a disengaged optional. Hence:

    template <>
    struct optional_storage<bool> {
        char storage;
    };

We’re done, right? Ship it, put it in production, we have sizeof(optional<bool>) == 1 and the silly standard library’s implementation has a size of 2! Well, let’s actually try to implement the rest of the functions we might need to make this storage work. Namely, at least a constructor, an operator bool, and a getter:

    #include <cassert>
    #include <new>  // placement new

    // for reference types, as a comparison
    template <typename T>
    struct optional_storage<T&> {
        T* storage;

        optional_storage() : storage(nullptr) { }
        optional_storage(T& r) : storage(&r) { }

        explicit operator bool() const { return storage; }
        T& lvalue() const { assert(*this); return *storage; }
    };

    template <>
    struct optional_storage<bool> {
        char storage;

        // 2 is the sentinel for "disengaged"
        optional_storage() : storage(2) { }
        // construct an actual bool object in the char's storage
        optional_storage(bool b) { new (&storage) bool(b); }

        explicit operator bool() const { return storage != 2; }
        bool& lvalue() { assert(*this); return *reinterpret_cast<bool*>(&storage); }
        bool const& lvalue() const { assert(*this); return *reinterpret_cast<bool const*>(&storage); }
    };

So, the bool storage is… a little ugly compared to the reference storage. But that’s fine, we’re not going for beauty here, we’re going for compact storage. The placement new and reinterpret_cast are necessary in order to have lvalue() return a bool&: we need to have a bool object there in order to perform that cast, and [basic.lval] allows us to still read the storage as a char for the purposes of checking to see if we have a value. I’m fairly sure everything in the above example is conforming.

But there’s one thing missing here as compared to the standard library: constexpr. No problem, right? Just go back and add constexpr in front of every function? That works fine for the reference storage, but we’re violating two restrictions in the boolean storage implementation: we have a new-expression and we have a reinterpret_cast. We need both to have a compact optional<bool> that gives us a bool&. In order to make it constexpr-friendly, we’d have to drop both of those:

    template <>
    struct optional_storage<bool> {
        char storage;

        constexpr optional_storage() : storage(2) { }
        constexpr optional_storage(bool b) : storage(b) { }

        constexpr explicit operator bool() const { return storage != 2; }
        constexpr bool lvalue() const { assert(*this); return storage; }
    };

The above works great, but we just get a bool instead of a bool&. As a result, we’re not consistent with our other optional specializations, which isn’t great for generic code. This arguably isn’t as bad as vector<bool>, but it’s in the same vein of decision making and isn’t a habit we want to get into.
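Here is a small sketch of both halves of that tradeoff. It restates the constexpr versions of the two specializations (with the assert dropped so that it also compiles as C++11) and then checks what each accessor actually returns. The primary template is left undefined purely to keep the example short.

```cpp
#include <type_traits>
#include <utility>  // std::declval

template <typename T>
struct optional_storage;  // primary template omitted in this sketch

// Reference storage: constexpr throughout, and lvalue() really is an lvalue.
template <typename T>
struct optional_storage<T&> {
    T* storage;

    constexpr optional_storage() : storage(nullptr) { }
    constexpr optional_storage(T& r) : storage(&r) { }

    constexpr explicit operator bool() const { return storage != nullptr; }
    constexpr T& lvalue() const { return *storage; }
};

// constexpr-friendly bool storage: one byte, but lvalue() returns by value.
template <>
struct optional_storage<bool> {
    char storage;

    constexpr optional_storage() : storage(2) { }
    constexpr optional_storage(bool b) : storage(b) { }

    constexpr explicit operator bool() const { return storage != 2; }
    constexpr bool lvalue() const { return storage; }
};

// The good news: one byte, and everything works in constant expressions.
static_assert(sizeof(optional_storage<bool>) == 1, "compact");
static_assert(!optional_storage<bool>(), "default-constructed is disengaged");
static_assert(optional_storage<bool>(true).lvalue(), "engaged and true");

// The bad news: the accessor's type differs between specializations.
static_assert(std::is_lvalue_reference<
                  decltype(std::declval<optional_storage<int&>&>().lvalue())>::value,
              "reference storage hands out a real reference");
static_assert(!std::is_lvalue_reference<
                  decltype(std::declval<optional_storage<bool>&>().lvalue())>::value,
              "bool storage only hands out a copy");
```

Generic code that writes through the accessor, say s.lvalue() = v, compiles for the reference storage but is rejected for the bool storage, because you can’t assign to a prvalue of built-in type.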

This demonstrates the unfortunate choice we have to make. Do we:

1. Have a compact optional<bool> that isn’t constexpr-friendly?

2. Have a compact optional<bool> that doesn’t have reference access?

3. Have an optional<bool> that is two bytes?

And this choice sucks. Truly. Certainly, compactness is important and can have a big impact on performance due to its knock-on effects on cache locality. Runtime is, after all, when problems get solved. But constexpr programming is important too, so not being able to have it at all for just this one type is for many people a non-starter. And if your code is more functionally oriented, you might not even need reference access in that context. Others still might consider having specializations behave differently a complete non-starter, one that’s worth spending the extra byte to avoid.

None of these opinions is wrong. None of these choices is great. Making tradeoffs kind of sucks. I want to have my cake and eat it too. Because cake is delicious. I just hope that as support for constexpr continues to expand, we may eventually not have to make this choice at all: we’d simply be able to implement std::optional<bool> to fit in one byte.

Until then, std::optional takes the third option, and there are several compact optional implementations out there that take the first one as a distinct type (e.g. foonathan’s fantastic type_safe library provides an optional<bool> that is two bytes and a compact_optional<bool> that is one byte).