Moving Is Not Copying

Like many subtle ideas, the idea of moving data in C++ is built on a simple concept.



Last week, I noted that reallocating a vector involves moving its elements into newly allocated memory, and that moving these elements is different from copying them. This note begins to get at the subtleties behind that remark. The C++ standards committee first met at the end of 1989; the notion of moving as a distinct operation from copying did not enter the C++ Standard until 2011. Therefore, these subtleties have taken more than 20 years to nail down. Like many subtle ideas, the idea of moving data in C++ is built on a simple concept. This week, we'll look at the basic concept, after which we'll start exploring implications in more detail.

The fundamental idea comes from three straightforward statements:

Sometimes, a data structure A might refer to another data structure B.

In such a case, copying A without copying B leaves A and its copy referring to the same B; in that respect, the two are aliases.

If we copy A and also copy B, we have now given ourselves the new problem of copying B, which may be more difficult or expensive than copying A.
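The second statement can be made concrete with a minimal sketch. The types A and B below are hypothetical stand-ins invented for illustration; A holds only a pointer to a B, so the compiler-generated copy of A copies the pointer and not the B it points to.

```cpp
#include <cassert>

// Hypothetical types for illustration: an A refers to a B through a pointer.
struct B {
    int value;
};

struct A {
    B* b;  // A refers to a B; it does not contain one
};

// Copies an A without copying its B, then writes through the copy.
// Returns the value seen through the original.
int write_through_copy() {
    B shared{1};
    A original{&shared};
    A copy = original;          // copies the pointer, not the B
    assert(copy.b == original.b);  // both refer to the same B: they are aliases
    copy.b->value = 2;          // a write through the copy...
    return original.b->value;   // ...is visible through the original: returns 2
}
```

Avoiding the alias would mean giving the copy its own B, which is the extra copying cost that the third statement describes.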

As an example, consider a vector. Typically, a vector is implemented as a data structure that includes a pointer to dynamically allocated memory. This memory contains the elements of the vector. In general, vectors act as values: Copying a vector copies its elements. For example:

    vector<int> v;
    v.push_back(42);
    vector<int> w = v;
    w.push_back(24);

After executing these statements, v will have one element with value 42, and w will have two elements with values 42 and 24, respectively. Defining w as a copy of v copied v's element.

This behavior makes sense for value-like containers. Data structures that represent files are another matter entirely. For example:

    ifstream in(myfile);
    ifstream in2 = in;    // What should this do?

When we define in2 as a copy of in, how should the system behave? Even if we were willing to accept the overhead of copying the entire input file, what should happen if the file is on an interactive device on which all the input is not yet available? Are we expecting the operating system to cause in and in2 to refer to the same data stream somehow? If so, what do we do about closing the file automatically? For example, what if in goes away while in2 is still around? For these reasons, among others, the C++ library prohibits making copies of the standard-library I/O data structures such as istream. It does not want to have to deal with the complexity of defining aliases to such data structures.
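One way to see this prohibition is to ask the compiler directly. The sketch below assumes a C++11 (or later) compiler and standard library; the helper name can_copy is invented for this example. It uses the standard std::is_copy_constructible trait to compare a value-like container with a stream type.

```cpp
#include <fstream>
#include <type_traits>
#include <vector>

// Hypothetical helper: true when an object of type T can be copy-constructed.
template <typename T>
constexpr bool can_copy() {
    return std::is_copy_constructible<T>::value;
}

// Vectors act as values, so copying them is allowed.
static_assert(can_copy<std::vector<int>>(),
              "vector<int> is expected to be copyable");

// The library prohibits copying stream objects.
static_assert(!can_copy<std::ifstream>(),
              "ifstream is expected not to be copyable");
```

If either assumption were wrong for a given library, the translation unit would fail to compile with the corresponding message.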

We see, then, that there are some data structures that it is undesirable to copy. Now let's look at what I think is a key example:

    ifstream make_tempfile()
    {
        ifstream result;
        // …
        return result;
    }

This example is illegal in C++03, because returning an object type (as opposed to a pointer or reference type) from a function implies copying the result as part of the return statement. However, although technically speaking this function copies the local variable result, it does so in a way that doesn't cause aliasing problems. The reason is that as soon as it makes the copy, it destroys the original. Consequently, there is never any point in the program at which two objects represent the same file. In this case, the objections to allowing objects of types such as ifstream to be copied go away.

Thinking about many examples such as this one gave rise to a new rule in C++11:

Creating a copy of an object in a context in which the original will never be used again except to destroy it is called moving the object. It is possible for classes to allow their objects to be moved independently of whether those objects can be copied.
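As a rough illustration of the second sentence of this rule, here is a sketch of a class whose objects can be moved but not copied. The class Owner and the functions around it are hypothetical, written only for this example, and the sketch relies on C++11 syntax (deleted functions and rvalue references) that this note has not yet discussed.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical move-only class for illustration: it owns one heap-allocated int.
class Owner {
public:
    explicit Owner(int value) : data_(new int(value)) {}
    ~Owner() { delete data_; }

    // Copying is prohibited, so two Owners can never refer to the same int.
    Owner(const Owner&) = delete;
    Owner& operator=(const Owner&) = delete;

    // Moving is allowed: the source gives up its pointer, so no alias remains.
    Owner(Owner&& other) noexcept : data_(other.data_) { other.data_ = nullptr; }
    Owner& operator=(Owner&& other) noexcept {
        if (this != &other) {
            delete data_;
            data_ = other.data_;
            other.data_ = nullptr;
        }
        return *this;
    }

    bool empty() const { return data_ == nullptr; }
    int value() const { return *data_; }  // precondition: !empty()

private:
    int* data_;
};

static_assert(!std::is_copy_constructible<Owner>::value, "Owner cannot be copied");
static_assert(std::is_move_constructible<Owner>::value, "Owner can be moved");

// Returns a local object by value: the original is destroyed right after the
// return, so the "copy" is a move (or is elided altogether).
Owner make_owner(int value) {
    Owner result(value);
    return result;
}

// Moves one Owner into another and reports whether the source was emptied.
bool move_leaves_source_empty() {
    Owner a = make_owner(7);
    Owner b = std::move(a);  // after this, a is only inspected and destroyed
    assert(b.value() == 7);
    return a.empty();
}
```

The design choice here is that a moved-from Owner holds a null pointer, so destroying it is harmless and at no point do two objects own the same int.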

So, for example, C++11 allows ifstream objects to be moved; in consequence, the make_tempfile example above is permitted. As another example, C++11 (unlike C++03) permits vectors of ifstream objects.
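Both claims can be checked with a short sketch, assuming a standard library that implements the C++11 stream move operations. The function names make_stream and make_streams are invented for this example, and the elided body of make_tempfile is deliberately not filled in, so the streams here are not attached to any file.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <type_traits>
#include <vector>

static_assert(std::is_move_constructible<std::ifstream>::value,
              "C++11 allows ifstream objects to be moved");

// Same shape as make_tempfile in the text, minus its elided body.
std::ifstream make_stream() {
    std::ifstream result;
    assert(!result.is_open());  // no file is attached in this sketch
    return result;              // moved (or elided), never copied
}

// Builds a vector of streams, which is possible because ifstream is movable.
std::vector<std::ifstream> make_streams(std::size_t n) {
    std::vector<std::ifstream> streams;
    for (std::size_t i = 0; i < n; ++i) {
        streams.push_back(make_stream());  // moves the temporary into the vector
    }
    return streams;
}
```

A call such as make_streams(3) yields a vector of three unopened streams; reallocating that vector as it grows moves its elements, which is the operation this series began with.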

Next week, we'll start looking more deeply into how C++ makes it possible for the compiler to tell whether what looks like a copy operation is really a move operation.

[Koenig explores the remaining subtleties of copying data elements in his next two posts: "When Is It Safe To Move An Object Instead of Copying It?" and "More Thoughts About Moving Objects Safely." -Ed.]