In this post we will explore what I consider the most significant language change in C++17. I call it the most significant because it changes the way you design your resource-managing types and how you think about initialization. It is often called “guaranteed copy elision”, but I will not use that name (except for this single time) because it does not reflect what this feature is. C++ has completely changed the meaning of rvalue (actually, prvalue).

In C++98 the meaning of the following definition

X f() { return X(0); }

is that we are creating, inside the function, a temporary object of type X. Next, we use this object to copy-initialize another temporary object that will outlive the function. Once the function has finished, the remaining temporary object can be used to copy-initialize the destination object. The compiler is allowed to elide the copying and behave as if the three mentioned objects were actually one and the same. But conceptually the temporaries and the copying are there, as we can observe by declaring the copy constructor as private.

C++11’s move constructor changes the game here, because you can have something copy-like enough to transfer the guts from a temporary to another object without cloning the state of the object: it alters the source temporary so that it is left in a no-resource state. But this solution comes with costs.

First, we are still dealing with temporaries and while moves can be elided, conceptually they are still there, and you can see it when the move constructor is declared as deleted: returning by value will not compile.

Second, while moving is often faster than copying, it still takes time to form the no-resource state in the source object, and sometimes the move cannot be elided.

Finally, moving requires the existence of the moved-from state (or the no-resource state), which weakens the class invariants, as described in this post. An object of a type without move semantics can always represent a session with an acquired resource; an object of a type with move semantics represents either a session with an acquired resource or a no-session state, and we may have to check which one it is.

To some extent we can work around it with hacks. For instance, in C++11 there is a way to, sort of, return a mutex by value from a function. std::mutex is a non-movable type: the most relevant part of a mutex is its address, so we cannot change it with a potential move. But we can do this:

    std::mutex make_mutex()
    {
      return {};
    }

    std::mutex&& m = make_mutex();

In the return statement we do not have a return with an object, but a return with an initializer. The syntax {} does not designate a temporary. It only says how the temporary object outside the function will be initialized. Then, in the declaration of m, we do not initialize an object, but an rvalue reference: it can bind to a temporary, but is itself an lvalue. The reference also extends the lifetime of the temporary, so it is almost as if we were declaring an object.

But it is still a hack, with its limitations.

C++17 turns this into an elegant solution. Now a definition like the one from the initial example:

X f() { return X(0); }

has a different meaning: an object will be created, with 0 as argument, but it is not yet clear what object. The function does not return a temporary. It does not create a temporary inside. It simply returns a “recipe” telling how the final object should be constructed. And if we call it like this:

X x = f();

It will be object x that will be created using this recipe. It is the only object of type X that will be created. It is equivalent to calling:

X x(0);

There are no temporaries involved. The type does not have to be movable. Similarly, we can return a mutex like this:

    std::mutex make_mutex()
    {
      return std::mutex{};
    }

    std::mutex m = make_mutex();
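To see that really only one object comes into existence, here is a minimal sketch. The type Pinned and its static counter are hypothetical names, introduced only for this illustration; the type is neither copyable nor movable, and each constructor call bumps the counter.

```cpp
#include <cassert>

// Hypothetical non-movable type that counts how many objects get constructed.
struct Pinned {
    static inline int constructed = 0;  // bumped by every constructor call
    int value;

    explicit Pinned(int v) : value(v) { ++constructed; }
    Pinned(Pinned const&) = delete;     // not copyable
    Pinned(Pinned&&) = delete;          // not movable
};

// Legal since C++17: the return statement holds a prvalue, not an object.
Pinned make_pinned() { return Pinned(7); }

// Returns the number of Pinned objects created by one initialization.
int count_constructions() {
    Pinned::constructed = 0;
    Pinned p = make_pinned();           // p is initialized directly from the "recipe"
    assert(p.value == 7);
    return Pinned::constructed;
}
```

Calling count_constructions() should report a single construction, under the assumption that the code is compiled as C++17 or later; in earlier standards the function make_pinned() does not compile at all.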

Thus, we can return by value objects that are not movable. Or conversely, we can make our objects non-movable, and still retain the possibility to return them by value in some cases. Why would we want to do that? To make the invariants of our resource-managing classes stronger. Going back to the example from the other post, if a type is movable, it will look more or less like this:

    #include <sys/socket.h> // Linux header
    #include <unistd.h>     // Linux header
    #include <stdexcept>
    #include <utility>      // std::exchange

    class Socket
    {
      int socket_id;

    public:
      explicit Socket()
        : socket_id{ socket(AF_INET, SOCK_STREAM, 0) }
      {
        if (socket_id < 0)
          throw std::runtime_error{translate(socket_id)}; // translate(): helper not shown here
      }

      Socket(Socket&& r) noexcept
        : socket_id{ std::exchange(r.socket_id, -1) }
      {}

      bool is_valid() const { return socket_id != -1; }
      // class invariant: !is_valid() || id() >= 0

      ~Socket()
      {
        if (is_valid())
          close(socket_id);
      }

      int id() const { return socket_id; }
      // precondition: is_valid()
      // postcondition: return >= 0

      Socket(Socket const&) = delete;
    };

In the move constructor, since we are stealing the resource, we have to give something in exchange so that the source object knows it no longer represents a session with a resource: we set the value -1. Now objects of type Socket may or may not represent a session, so we have to add an observer function, is_valid(), that tells us which state we are in. The class invariant is weak: the object’s lifetime is not identical with the duration of the session. Now all the member functions need to take into consideration what the object should do if it does not represent a session. We can see this in the destructor and in id(). In the destructor we have an if-statement. In function id() we have a precondition: the function trusts us never to call it on a no-session object, but there is a risk that we might.

In contrast, if we drop the support for moving, the class design is simpler and less bug-prone:

    class Socket
    {
      int socket_id;

    public:
      explicit Socket()
        : socket_id{ socket(AF_INET, SOCK_STREAM, 0) }
      {
        if (socket_id < 0)
          throw std::runtime_error{translate(socket_id)};
      }

      Socket(Socket&& r) = delete;
      // class invariant: id() >= 0

      ~Socket()
      {
        close(socket_id);
      }

      int id() const { return socket_id; }
      // postcondition: return >= 0
    };

No move constructor: no way to get the value -1. The invariant is strong: if you have access to the object, the session with the socket is in progress; always. There is no need to check it in the destructor, no precondition on function id(), and you just cannot call it on an object not bound to a session. Hardly anyone designed their resource-managing classes like this, because until C++17 such objects could not be returned from functions by value. Now we can do it!

This does not make non-movable types work for every factory function, though. We can initialize and return our Socket instance like this:

Socket make_socket() { return Socket{}; }

But we cannot do this:

    Socket make_socket()
    {
      Socket s {};
      prepare_socket(s);
      return s; // error: s is an lvalue, not a prvalue
    }

This is because returning a non-movable type requires a prvalue in the return statement. A prvalue, informally, is an rvalue in the C++03 sense: usually either a literal, or a call to a function returning by value, or a type name followed by parentheses or braces with arguments: a recipe specifying how some future object will be initialized. A const recipe is the same as a non-const recipe; therefore initializing a const object from a non-const prvalue, or vice versa, works fine, like here:

    Socket make_socket()
    {
      return Socket{};
    }

    const Socket new_socket()
    {
      return make_socket(); // a function call returning by value
    }                       // is still a prvalue

    Socket s = new_socket();

You can return more than one recipe from a function:

    const Socket select_socket(bool cond)
    {
      if (cond)
        return Socket{};
      return make_socket();
    }

In case you are wondering how this can be implemented by a compiler: function select_socket(), when called, is passed an additional pointer that indicates the location where the destination object is going to be created, and it initializes the object in that place using the recipe from the prvalue. Whoever calls select_socket() to initialize their object passes the address of this to-be-object to function select_socket().

A recipe can be transferred up, like inside function select_socket(), but ultimately some object will be initialized with it. If you do not designate one, like here:
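One way to observe the effect of that hidden destination pointer is to let the object record its own address during construction. The following is only a sketch: Tracked and built_in_place() are hypothetical names made up for this illustration, and the functions mirror make_socket() and select_socket() from above.

```cpp
#include <cassert>

// Hypothetical type that remembers the address it was constructed at.
struct Tracked {
    Tracked const* born_at;
    Tracked() : born_at(this) {}
    Tracked(Tracked&&) = delete;  // non-movable, so no copy can be sneaked in
};

Tracked make_tracked() { return Tracked{}; }

// Two return statements, each passing a prvalue up to the caller.
Tracked select_tracked(bool cond) {
    if (cond)
        return Tracked{};
    return make_tracked();
}

// True when the object was constructed directly in its final location.
bool built_in_place(bool cond) {
    Tracked t = select_tracked(cond);
    return t.born_at == &t;
}
```

Whichever return statement is taken, the address seen by the constructor is the address of t in the caller, even though the constructor call is written two function calls deeper.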

int main() { make_socket(); }

a temporary object will be created. Similarly here:

int main() { return make_socket().id(); }
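The temporary in such an expression is still a single object, and it lives only until the end of the full-expression. A small sketch that shows this, with a hypothetical type Probe that counts how many objects were created and how many are still alive:

```cpp
#include <cassert>

// Hypothetical non-movable type with construction and lifetime counters.
struct Probe {
    static inline int alive = 0;    // objects currently alive
    static inline int created = 0;  // objects ever constructed

    Probe() { ++alive; ++created; }
    Probe(Probe&&) = delete;
    ~Probe() { --alive; }

    int id() const { return 42; }
};

Probe make_probe() { return Probe{}; }

int use_temporary() {
    // No object is named here, so a temporary is created for the member call.
    int result = make_probe().id();
    // The temporary is destroyed at the end of the full-expression above.
    assert(Probe::alive == 0);
    return result;
}
```

After use_temporary() returns, the counters should indicate that exactly one Probe was ever constructed and that none is alive.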

More than just return by value

This feature, which could be called “prvalues without temporaries”, can be used to solve another problem: conditional initialization. In order to illustrate it, we first need to change our Socket class once more. Because we can now afford to return by value without any move constructor, instead of providing a public constructor, we will only allow instances to be created through factory functions:

    class Socket
    {
    private:
      explicit Socket(int AddressFamily);
      Socket() = delete;
      Socket(Socket&&) = delete;

    public:
      static Socket make_inet() { return Socket{AF_INET}; }
      static Socket make_unix() { return Socket{AF_UNIX}; }
      // ...
    };

This is superior to constructors, because now we can have two functions with an identical set of parameters (an empty set in our case) that perform different initialization. Now, suppose we want to use our Socket inside class Client:

    class Client
    {
      Socket _socket;

    public:
      explicit Client (Params params);
    };

Params contains data member isUnixDomain. Based on its value, we want to use one factory function or the other. We can do it like this:

    Client::Client(Params params)
      : _socket(params.isUnixDomain ? Socket::make_unix()
                                    : Socket::make_inet())
    {}

And this just works. No move constructor is needed, because only one object is initialized: _socket. This syntax was already valid before C++17, but back then it required a move constructor.
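Since the Socket class depends on Linux headers, here is a self-contained sketch of the same pattern that can be tried anywhere. Channel is a hypothetical stand-in for Socket: it is non-movable, it can only be created through its two factory functions, and it merely remembers which factory produced it. The names Channel, kind() and channel_kind() are assumptions made for this illustration.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for Socket: non-movable, created only by factories.
class Channel {
    std::string kind_;
    explicit Channel(std::string kind) : kind_(std::move(kind)) {}

public:
    Channel(Channel&&) = delete;

    static Channel make_inet() { return Channel{"inet"}; }
    static Channel make_unix() { return Channel{"unix"}; }

    std::string const& kind() const { return kind_; }
};

struct Params { bool isUnixDomain; };

class Client {
    Channel channel_;

public:
    // Both operands of ?: are prvalues of type Channel, so the selected
    // one initializes channel_ directly; no move constructor is involved.
    explicit Client(Params params)
        : channel_(params.isUnixDomain ? Channel::make_unix()
                                       : Channel::make_inet()) {}

    std::string const& channel_kind() const { return channel_.kind(); }
};
```

A Client built with isUnixDomain set to true should end up holding the "unix" channel, and one built with false the "inet" channel.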

Unfortunately, while it works for initializing member subobjects, the Standard is not clear whether the same thing should work for initializing base classes and for delegating constructors. GCC does implement prvalues without temporaries in constructor delegation, but this may turn out to be non-portable.

What if we wanted to emplace our Socket in a std::vector? This would not work, because adding a new element might cause the vector to grow, and this requires moving the elements around. But what if we wanted to emplace a Socket in a container that doesn’t grow in this way? Let’s try to implement our own: a simplified version of std::optional. We provide raw storage for an object of type T. By default no object is created, and then later we can emplace data inside the storage:

    #include <cassert>
    #include <new>         // placement new
    #include <type_traits> // std::aligned_storage_t
    #include <utility>     // std::forward

    template <typename T>
    class Opt
    {
      std::aligned_storage_t<sizeof(T), alignof(T)> _storage;
      bool _initialized = false;

      void* address () { return &_storage; }
      T* pointer() { return static_cast<T*>(address()); }

    public:
      Opt() = default;
      Opt(Opt&&) = delete;
      ~Opt() { if (_initialized) pointer()->T::~T(); }

      template <typename... Args>
      void emplace(Args&&... args)
      {
        assert (!_initialized);
        new (address()) T(std::forward<Args>(args)...); // in-place construction
        _initialized = true;
      }
    };

For the purpose of our discussion we make Opt non-movable, because we intend to store a non-movable T. The placement new expression in emplace() also works without creating temporaries when passed a prvalue. However, function emplace() takes its arguments by reference, so a temporary needs to be created for the reference to bind to, and constructing T from that reference needs a move. So, the following will not work:

    Opt<Socket> os;
    os.emplace(Socket::make_inet()); // error
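The reason can be checked with a type trait. By the time the argument reaches the placement new inside emplace(), it is no longer a prvalue but a reference, so what is really being asked for is a construction of T from T&&. The sketch below uses a hypothetical non-movable type Fixed; the variable names are made up for this illustration.

```cpp
#include <type_traits>

// Hypothetical non-movable type.
struct Fixed {
    explicit Fixed(int) {}
    Fixed(Fixed&&) = delete;
};

// What emplace(Args&&...) effectively needs when handed a Fixed prvalue:
// constructing a Fixed from a Fixed&& reference. This requires a move.
constexpr bool forwarding_works = std::is_constructible_v<Fixed, Fixed&&>;

// Constructing from the real constructor arguments remains possible.
constexpr bool direct_args_work = std::is_constructible_v<Fixed, int>;

static_assert(!forwarding_works, "a forwarded Fixed would have to be moved");
static_assert(direct_args_work, "Fixed can still be built from an int");
```

For our Socket this second route is closed as well, because its constructor is private: only the factory functions can produce one.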

But we can work around this by creating a temporary of a type different from Socket, one with a conversion operator to Socket that will create a prvalue directly in the in-place construction. Here is how we can implement it:

    template <typename F>
    class rvalue
    {
      F fun;

    public:
      using T = std::invoke_result_t<F>;

      explicit rvalue(F f) : fun(std::move(f)) {}
      operator T () { return fun(); }
    };

Metafunction invoke_result_t is the C++17 replacement for std::result_of. Construct invoke_result_t<F> means the result type of invoking a function-like object of type F with no arguments. With this tool in place, we can emplace a Socket in our container like this:

Opt<Socket> os; os.emplace(rvalue{&Socket::make_inet});

Let me explain. We are creating an object of type rvalue<F> . F is deduced from the argument. This is another feature of C++17 called class template argument deduction. The initialization only stores a pointer to a function. We can create temporaries of this type as they are cheap and movable. But in the in-place initialization that takes place inside emplace() an object of this type is converted to Socket . Only inside this conversion do we call the factory function and produce a prvalue that is only used to initialize the object in the raw storage of the optional object.
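The deferred call can be observed in isolation. In the sketch below the rvalue template is repeated from above so that the example is self-contained; the factory make_name(), the counter factory_calls and the use of std::string instead of Socket are assumptions made only for this illustration.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// The wrapper from the text, repeated to keep the sketch self-contained.
template <typename F>
class rvalue {
    F fun;

public:
    using T = std::invoke_result_t<F>;

    explicit rvalue(F f) : fun(std::move(f)) {}
    operator T() { return fun(); }
};

inline int factory_calls = 0;  // hypothetical counter, for illustration only

std::string make_name() {
    ++factory_calls;
    return "socket";
}

// Returns how many times the factory had run before the conversion.
int calls_before_conversion() {
    factory_calls = 0;
    rvalue wrapper{&make_name};  // F is deduced as std::string (*)()
    int before = factory_calls;  // nothing called yet: only a pointer is stored
    std::string name = wrapper;  // the conversion operator calls make_name()
    assert(name == "socket");
    return before;
}
```

Creating the wrapper does not run the factory; it runs exactly once, at the point where the wrapper is converted to the target type.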

We can get away with passing only a pointer because the function does not take additional parameters. In general, rather than passing a pointer we would pass a closure object:

Opt<Socket> os; os.emplace(rvalue{[&]{ return Socket::make_inet(); }});

But there is more. We have said that it is impossible to emplace into a vector because it might move elements around while growing. Moving elements around would not require a move constructor, though, if we had a destructive move. With C++17’s prvalues, we can implement the library part of the destructive move.

In order to do this, we require every type T that wants to be destructively movable to provide a function that can be found through ADL:

T destructive_move(T& old) noexcept;

(This is somewhat similar to how swap is used: if you want your type to be swapped efficiently, provide an overload for swap for your type.)

The semantics of destructive_move are: once this function is called on a piece of storage representing an object of type T, the object is considered destroyed: no destructor must be called for it, and another prvalue (“recipe for creating an object”) is returned.

Whereas a move constructor for some types may need to throw exceptions, a destructive move operation never needs to. We therefore require that it never throws exceptions.

The requirement on not calling the destructor only makes sense for container-like types that manage the life-time of objects manually. This is the case for our type Opt . Let’s add member function eject to its interface. It will return the contained object by value, and leave the optional object valueless:

    template <typename T>
    class Opt
    {
      // ...
    public:
      // ...
      T eject()
      {
        assert (_initialized);
        _initialized = false;
        return destructive_move(*pointer());
      }
    };

Function eject() returns a prvalue by value, that is, it returns a recipe. It marks the optional object as not containing a value. T’s destructor is not called. It is assumed that destructive_move() does anything that is required to consider the object destroyed. From now on, the lifetime of the contained object is finished.

What might the implementation of destructive_move() for our Socket look like? Let’s see the rewritten class Socket. We will then explain what is going on:

    class Socket
    {
      int socket_id;
      // class invariant: id() >= 0
      struct destructive_t {}; // for tagging a special ctor

      explicit Socket(int AddressFamily)
        : socket_id{ socket(AddressFamily, SOCK_STREAM, 0) }
      {
        if (socket_id < 0)
          throw std::runtime_error{""};
      }

      explicit Socket(Socket& s, destructive_t)
        : socket_id{std::exchange(s.socket_id, -1)}
      {
        s.Socket::~Socket();
      }

    public:
      Socket(Socket&& r) = delete;

      ~Socket()
      {
        if (BOOST_LIKELY(socket_id != -1))
          close(socket_id);
      }

      int id() const { return socket_id; }
      // postcondition: return >= 0

      static Socket make_inet() { return Socket{AF_INET}; }
      static Socket make_unix() { return Socket{AF_UNIX}; }

      friend Socket destructive_move(Socket& s) noexcept
      {
        return Socket{s, destructive_t{}};
      }
    };

Empty class destructive_t is a tag that we will use for tagging a new constructor.

The new “destructive” constructor takes another Socket by lvalue reference. It is to some degree similar to a move constructor, but it goes further. It steals the contents from s (in this case the contents are only socket_id), it puts a not-a-socket id in their place (much like a move constructor would do), and immediately calls the destructor of s, which ends its lifetime. The destructor again needs to check for the special not-a-socket value before it calls close(). This looks like we are back to the type with a moved-from state, but this time it is different. The not-a-socket state can only be set by the special “destructive” constructor, which is private, and the next thing it does is to destroy the object with the not-a-socket value. So apart from the destructor no-one will observe this state. This safe-to-destroy state offers weaker guarantees than the moved-from “valid but unspecified” state.

We annotate the check with BOOST_LIKELY (this is a macro over GCC’s and clang’s __builtin_expect), which hints to the compiler that, unless there are other indications, it should assume that the condition will evaluate to true. A similar annotation [[likely]] is a likely (pun intended) addition to the future revisions of C++ (see here).

In the destructive-move case the check will be optimized out by the compiler, as it is performed a couple of instructions after the socket id is set to -1. Our invariant is still declared as strong, although technically this is incorrect, because sometimes it will not hold when the destructor starts. This would have been cleaner if the language offered native support for destructive moves. In that case invoking a “destructive” constructor would be recognized as ending the lifetime of the object, and we would not need to call the destructor manually, and would not have to set the special value -1.

Our friend function destructive_move uses the “destructive” constructor in its returned prvalue. The contract of function destructive_move is: after it has been called, no attempt will be made to destroy the object referred to by the argument reference.

This is all we have to do to be able to eject a non-movable type from our optional:

    Opt<Socket> os;
    os.emplace(rvalue{&Socket::make_inet});
    Socket s = os.eject();

And we can also emplace the ejected socket:

    Opt<Socket> os, ot;
    os.emplace(rvalue{&Socket::make_inet});
    ot.emplace(rvalue( [&]{ return os.eject(); } ));

This shows how we can move around (to some extent) a non-movable object with a strong invariant. A similar technique could be used in stl2::vector.
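For readers who want to try the eject mechanics without sockets and without Boost, below is a compact sketch under a few assumptions. Token is a hypothetical resource type whose static counter open_count stands in for resources held. Slot is a cut-down variation of Opt, not the class shown earlier: its emplace_with() takes the factory itself and calls it inside the placement new, which replaces the rvalue wrapper in this sketch.

```cpp
#include <cassert>
#include <new>
#include <utility>

// Hypothetical resource type; open_count stands in for resources held.
class Token {
    int id_;
    struct destructive_t {};  // tag for the private "destructive" constructor

    explicit Token(int id) : id_(id) { ++open_count; }
    Token(Token& old, destructive_t) : id_(std::exchange(old.id_, -1)) {
        old.~Token();  // ends the lifetime of old; it sees -1 and releases nothing
    }

public:
    static inline int open_count = 0;

    Token(Token&&) = delete;
    ~Token() { if (id_ != -1) --open_count; }

    int id() const { return id_; }
    static Token make(int id) { return Token{id}; }

    friend Token destructive_move(Token& old) noexcept {
        return Token{old, destructive_t{}};
    }
};

// Cut-down variation of Opt; emplace_with() is an assumption of this sketch.
template <typename T>
class Slot {
    alignas(T) unsigned char storage_[sizeof(T)];
    bool initialized_ = false;

    T* pointer() { return std::launder(reinterpret_cast<T*>(storage_)); }

public:
    Slot() = default;
    Slot(Slot&&) = delete;
    ~Slot() { if (initialized_) pointer()->~T(); }

    template <typename F>
    void emplace_with(F&& factory) {
        assert(!initialized_);
        // The factory call is a prvalue of type T: it initializes the storage directly.
        ::new (static_cast<void*>(storage_)) T(std::forward<F>(factory)());
        initialized_ = true;
    }

    T eject() {
        assert(initialized_);
        initialized_ = false;  // the destructor of T will not be run by Slot
        return destructive_move(*pointer());
    }
};

int run_eject_demo() {
    Slot<Token> slot;
    slot.emplace_with([] { return Token::make(5); });
    Token t = slot.eject();  // the resource now belongs to t
    assert(Token::open_count == 1);
    return t.id();
}
```

While t is alive exactly one resource is held, and once run_eject_demo() returns the count should be back to zero: the resource was released once, by t, and never by the slot.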

And that’s it for today. I would like to thank Tomasz Kamiński for explaining to me the significance and the potential of “prvalues without temporaries” feature.