How Objects Move

In our previous article, we said that the ability to move a value from one place to another is one of C++0x's most important features, but one that many programmers will rarely use directly. We shall now explain who does need to use this feature directly, and why it works as it does.

Fundamental to C++ is the idea that you can bind a non-const reference only to an lvalue. The reason for this rule is that people bind references to objects for one of two reasons: either they want to use the reference to change the object, or they just want to avoid copying the object. Programmers distinguish between these cases based on whether the reference in question is const.

C++0x caters to a third case: The programmer intends to change the object's value, but does not intend to use the object again. This case typically comes up when a programmer wants to move an object's value from one object to another, and then destroy or give a new value to the original object.

Let's look again at an example from our last article:



    string s = "I am a string!";
    string t = s;    // Make a copy of s

Copying s to t requires allocating memory for a copy of s's characters, then copying those characters into the new memory, and finally putting a pointer to the new memory into t. This work is needed so that changing the value of s does not change the value of t, or vice versa.

What if we know that s's value will never be used again? In that case, we'd like to copy the pointer to s's memory into t, and then clear the pointer in s. We need to clear s's pointer because that memory is now controlled by t, not by s. Clearing the pointer also ensures that destroying s does not free the corresponding memory.

In our last article, we saw that we could tell the compiler that we will not use s's value again by writing

    string t = std::move(s);    // "Steal" state from s

The std::move function somehow tells the compiler that the programmer can change the value of move's argument, and that it's OK to do so because the former value of the object in that argument will not be used again. The way std::move works is by using a new type called an rvalue reference, which corresponds to the third reason for binding a reference to a value:

A plain reference refers to an object that you intend to change and use again. A reference to const refers to an object that you will not change. An rvalue reference refers to an object that you intend to change and then discard.

If T is an object type (that is, not a reference or function type), we write "rvalue reference to T" as T&&. In general, we can bind an rvalue reference to an object only if that object is an rvalue; that is, only if we can't use that object again. For example:



    int&& n = 42;     // OK; 42 is an rvalue
    int k = n;
    int&& j = k;      // Error; k is an lvalue
    int&& m = k+1;    // OK, k+1 is an rvalue

When we use an rvalue reference, it acts just like any other reference: It is an lvalue (even though it is called an rvalue reference!) that is just another name for the object to which it refers. So, for example, we can write



    n = 24;    // OK, but it doesn't change the value of 42
    m = 24;    // OK, but it doesn't change the value of k

Changing n changes a temporary that started out with a copy of 42. Changing m changes a temporary that started with a copy of k+1. Neither the value of k nor the value of 42 changes.

If we overload a function with T&& and const T&, the compiler will pick the T&& version when the argument is an rvalue (that is, when the compiler can see that the argument will not be used again) and will call the const T& version otherwise. This kind of overloading allows us to write code such as the following:



    // This version copies a vector and sorts the copy
    template<class T>
    vector<T> vec_sort(const vector<T>& v)
    {
        vector<T> result = v;
        sort(result.begin(), result.end());
        return result;
    }

    // This version sorts a vector in place
    template<class T>
    vector<T> vec_sort(vector<T>&& v)
    {
        sort(v.begin(), v.end());
        // v has a name, so it is an lvalue here; std::move lets the
        // return value steal v's contents instead of copying them
        return std::move(v);
    }

This example is a useful way of sorting a vector without destroying the original, with the added wrinkle that it preserves the original vector only when it needs to do so. If we give this function a variable as its argument, overloading will select the first version; the effect will be to copy the variable, sort the copy, and return the sorted copy. If, on the other hand, we give this function a temporary, it will sort the temporary without copying it, and return the sorted temporary:



    vector<int> read_data();    // Read from a file, return values as a vector

    vector<int> v = read_data();
    vector<int> sv = vec_sort(v);    // Copy v, sort the copy, put it in sv
    vector<int> w = vec_sort(read_data());

In the last line of this example, read_data() is an rvalue. Therefore, overloading selects the second version of vec_sort, which, in turn, sorts the vector that read_data returned without copying it.

We can now understand a little more of the magic behind the example from our last article:



    string t = std::move(s);    // "Steal" state from s