When Is It Safe to Move an Object Instead of Copying It?

How the compiler can figure out during compilation when to move objects instead of copying them.

Last week, I introduced the idea of moving objects, and explained that moving an object is usually better than copying it if the original is not going to be used again. Now I'd like to explain how the compiler can figure out during compilation when to move objects instead of copying them.

As is so often the case with such problems, the obvious solution is wrong. Consider this code fragment:

Thing t;
work_on(t);

Suppose work_on is a function that takes a Thing parameter. Suppose further that t is not used again in this program. One would think that the compiler would be able to figure out that:

Calling work_on is supposed to copy t to work_on's parameter.

However, t is not used again.

Therefore, it is acceptable to move t to work_on's parameter instead of copying it.

Alas, the situation is not that simple. A Thing is a user-defined class type that presumably has one or more constructors. These constructors have access to the address of the Thing object that they are constructing. If their authors so desire, the constructors can save the address of the object being constructed in another data structure — perhaps one that keeps track of all the Things in existence.

In such a case, the call to work_on is not necessarily the last use of t. At any point between where the program calls work_on and when t is destroyed, the program might call another function that uses the "all Things" data structure to locate t and use it. This call might be invisible to the compiler; indeed, the function being called might even be separately compiled. Therefore, this seemingly simple compilation technique fails because, in general, a compiler cannot reliably determine the circumstances in which an object "is not used again."
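To make the problem concrete, here is a minimal sketch of such a self-registering class. The names all_things, touch_everything, registry_demo, and the value member are hypothetical, invented only for this illustration; the article does not say what Thing actually contains.

```cpp
#include <cassert>
#include <set>

class Thing;

// Hypothetical "all Things" data structure: the address of every live Thing.
std::set<Thing*>& all_things() {
    static std::set<Thing*> registry;
    return registry;
}

class Thing {
public:
    Thing() { all_things().insert(this); }                  // remember our own address
    Thing(const Thing& other) : value(other.value) { all_things().insert(this); }
    ~Thing() { all_things().erase(this); }
    int value = 0;                                          // assumed payload
};

// Takes its parameter by value, so calling it copies the argument.
void work_on(Thing param) { param.value = 1; }

// Never names t, and could just as well live in another source file.
void touch_everything() {
    for (Thing* p : all_things()) p->value += 10;
}

int registry_demo() {
    Thing t;
    work_on(t);          // looks like the last use of t ...
    touch_everything();  // ... but t is still reachable through the registry
    assert(all_things().count(&t) == 1);
    return t.value;      // the change made through the registry is visible
}
```

Calling registry_demo() from main shows that t is modified after the call to work_on, even though no later statement mentions t as an argument. Had the compiler quietly moved from t at the call, touch_everything would have operated on a moved-from object.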

However, there is one circumstance in which a compiler can determine that an object is not used again, namely when that object is the result of an expression that is passed directly as a function argument. For example, if we replace our definition of t and call to work_on with

work_on(Thing());

then the argument to work_on is a temporary object that will be destroyed at the end of the statement. Because it will be destroyed, the compiler can be confident in this example that the argument to work_on will not be used again.
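The difference in lifetime can be observed with a small sketch that counts live objects. The live counter and the lifetime_demo function are assumptions added for illustration; they are not part of the original example.

```cpp
#include <cassert>

struct Thing {
    static int live;              // number of Thing objects currently alive
    Thing() { ++live; }
    Thing(const Thing&) { ++live; }
    ~Thing() { --live; }
};
int Thing::live = 0;

void work_on(Thing) {}            // by-value parameter, as in the text

void lifetime_demo() {
    {
        Thing t;
        work_on(t);
        assert(Thing::live == 1); // t outlives the call and could be used again
    }
    work_on(Thing());
    assert(Thing::live == 0);     // the temporary is gone once the statement ends
}
```

After the statement containing the temporary has finished, no Thing remains for any other part of the program to find, which is what makes the temporary a safe candidate for moving.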

In C++, the idea of an object that is used only at the point in the program at which it is created is basically that of an rvalue. For example, a variable is not an rvalue, because it is generally possible to take the variable's address and remember it for future reference. On the other hand, a nontrivial expression that yields a nonreference type (such as Thing()) is an rvalue, as is a numeric literal such as 42. Because an rvalue is thrown away shortly after being used, the compiler can often change copy operations to move operations when the object being copied is an rvalue. If the object being copied is an lvalue, the compiler generally cannot make that substitution, because it cannot prove that the object will not be used again.
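As a rough sketch of this rule in action, the following class counts how often it is copied and moved. It assumes a C++11 move constructor, a feature this article has not yet described, and the counters and value_category_demo are hypothetical names used only here. Note that for a temporary argument the compiler is also allowed (and, since C++17, required) to construct the parameter in place, so the move count may be zero.

```cpp
#include <cassert>

struct Thing {
    static int copies;                     // how many copy constructions occurred
    static int moves;                      // how many move constructions occurred
    Thing() {}
    Thing(const Thing&) { ++copies; }
    Thing(Thing&&) noexcept { ++moves; }   // assumed C++11 move constructor
};
int Thing::copies = 0;
int Thing::moves = 0;

void work_on(Thing) {}                     // by-value parameter, as in the text

void value_category_demo() {
    Thing t;
    work_on(t);                            // lvalue argument: copied
    assert(Thing::copies == 1 && Thing::moves == 0);

    work_on(Thing());                      // rvalue argument: never copied
    assert(Thing::copies == 1);            // still only the one copy from above
    assert(Thing::moves <= 1);             // 0 if constructed in place, 1 if moved
}
```

The only copy in this sketch comes from the lvalue argument t; the rvalue argument is either moved or built directly into the parameter.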

Next week, we'll start exploring how C++ uses the type system to allow class authors to write code that takes the rvalue property of objects into account.