Moving and Rvalue References

The C++ type system helps the compiler figure out whether to move or copy an object. How does it do it?

Last week, I said that the C++ type system helps the compiler figure out whether to move or copy an object. I'd like to continue by saying more about how it does so.

In general, C++ doesn't treat whether an expression is an lvalue as being part of its type. So, for example, if we define a variable:

int n;

then the type of n is int. Moreover, the type of 3 is also int. Nevertheless, it is not always possible to use n and 3 interchangeably, even though they have the same type. Although we can write

int& r = n;

we cannot write

int& s = 3;

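because of the rule that says that when we define a plain reference to a non-const type, we must initialize it from an lvalue. (A reference to const, such as const int&, is the exception: it may be initialized from an rvalue.) In this case, n is an lvalue and 3 is not, so we cannot use 3 to initialize s. This rule makes it possible for compilers to catch errors such as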

std::cin >> 3;

This call is equivalent to

std::cin.operator>>(3);

In other words, it is calling a function named operator>> that is a member of the object std::cin, and giving that function 3 as an argument. That function, in turn, is overloaded, but although one of its overloaded versions has an int& parameter, none of them has a plain int parameter or a parameter of any type that will accept an int rvalue argument. Therefore, the compiler will determine that it is not possible to call operator>> with an int rvalue as its argument.
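As a small illustration of the int& overload at work, here is a sketch that reads from a std::istringstream rather than std::cin so that it does not depend on keyboard input. The helper name parse_int and its fallback parameter are my own inventions for this example, not anything from the standard library.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not a standard function): reads one int from text
// and returns fallback when the text does not begin with a number.
int parse_int(const std::string& text, int fallback) {
    std::istringstream in(text);
    int n = fallback;      // n is an lvalue, so operator>>(int&) can bind to it
    // in >> 3;            // rejected at compile time: 3 is an rvalue
    if (in >> n) {
        return n;          // extraction succeeded and stored its result in n
    }
    return fallback;       // extraction failed, so ignore whatever is in n
}
```

Uncommenting the line that extracts into 3 should make the compiler reject the program, for the reason just described.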

C++11 extends the notion of references to include rvalue references. An rvalue reference is still a kind of reference, but unlike a plain reference, it can be bound only to an rvalue. We write an rvalue reference this way:

int&& t = 3;

A plain reference to a non-const type can be bound only to an lvalue; an rvalue reference can be bound only to an rvalue. So, for example, if we had written

int&& t = n; // Error: n is an lvalue

the compiler would reject it because n is an lvalue.
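To see both binding rules side by side, here is a minimal sketch. The function name binding_demo is hypothetical, chosen only for this illustration, and the two commented-out definitions are ones that I would expect a conforming compiler to reject.

```cpp
#include <cassert>

// Hypothetical demonstration function; the name is not from any library.
int binding_demo() {
    int n = 10;
    int& r = n;             // plain reference bound to the lvalue n
    int&& t = n + 1;        // rvalue reference bound to the rvalue n + 1 (a temporary holding 11)
    // int& bad1 = n + 1;   // error: n + 1 is an rvalue
    // int&& bad2 = n;      // error: n is an lvalue
    r = 20;                 // writes through r, so n is now 20
    assert(n == 20);
    return n + t;           // 20 + 11
}
```

Note that t keeps the value 11 even after n changes, because it refers to the temporary result of n + 1 and not to n itself.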

An rvalue reference is an interesting kind of hybrid, because although you can bind it only to an rvalue, you use it as an lvalue. The point is that because you must have bound it to an rvalue, it is no longer possible to use that value outside your code. This fact means that you can clobber the rvalue as you wish, effectively treating it as an lvalue, and the rest of the program will not notice. For example:

int&& t = 3;
++t;

Ordinarily, a program can't change an rvalue. However, in this case, binding t to the rvalue 3 created a temporary int object holding the value 3 and gave the program lvalue access to that object. The program could clobber the temporary because no other part of the program is going to be able to access that particular instance of 3, so clobbering it is harmless. The knowledge that the rvalue will not be used again gets passed along as part of the "rvalue reference" property of variables, which, in turn, allows those variables to be moved instead of copied.
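The hybrid nature of an rvalue reference can be sketched in a few lines. The function name clobber_demo and the variable alias are hypothetical names chosen for this example.

```cpp
// Hypothetical demonstration function; the name is not from any library.
int clobber_demo() {
    int&& t = 3;           // t refers to a temporary int initialized from 3
    ++t;                   // the temporary now holds 4
    int& alias = t;        // allowed: the name t, used as an expression, is an lvalue
    alias += 10;           // alias and t name the same object, which now holds 14
    // int&& again = t;    // error: t is an lvalue, so an rvalue reference cannot bind to it
    return t;
}
```

The literal 3 itself is untouched, of course; only the temporary object that t refers to changes.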

Defining a function to take an rvalue reference parameter allows that function to move its argument rather than copying it. However, it does so at the cost of prohibiting lvalue arguments:

void foo(int&&);
foo(42);    // OK
int n;
foo(n);     // Error: n is an lvalue

We can solve this problem by overloading the function:

void bar(int&&);
void bar(const int&);
bar(42);    // calls bar(int&&)
int n;
bar(n);     // calls bar(const int&)

This technique is worth remembering: by overloading a function with a reference to const and an rvalue reference, you give the compiler the information it needs to tell your function when it is safe to modify its parameter and when it is not safe. Moreover, the compiler makes this safety decision during compilation, thereby avoiding any runtime overhead that might come along with it.
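Here is a sketch that makes the compiler's choice visible by having each overload report its own parameter type. The names which_overload and next_value are hypothetical; they exist only so that the selected overload can be observed.

```cpp
#include <string>

// Hypothetical overload pair: each version reports which one was selected.
std::string which_overload(const int&) { return "const int&"; }
std::string which_overload(int&&)      { return "int&&"; }

// Hypothetical helper that returns by value, so a call to it is an rvalue.
int next_value(int n) { return n + 1; }
```

Given a variable n of type int, which_overload(n) returns "const int&", whereas which_overload(42) and which_overload(next_value(n)) both return "int&&". The choice is made entirely by overload resolution, so nothing is tested at run time.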

Next week, we'll start looking at just what a function has to do in order to move a value instead of copying it.