Some Optimizations Are More Important Than Others

Whenever a human is waiting for a computer to do needless work, it is not just the computer's time that is being wasted.



Sometimes in this space I talk about how to write programs that avoid needless work. Other times I argue that computers should work for people, not the other way around. Isn't there a contradiction here? If you rewrite your program to avoid needless work for the computer, aren't you doing work to make things easier for the computer?

The way to resolve this apparent contradiction is to realize that whenever a human is waiting for a computer to do needless work, it is not just the computer's time that is being wasted. If the work is finished quickly enough that no one is inconvenienced by waiting, then it doesn't matter how much computer time is wasted. On the other hand, there are times when even a fraction of a second is too long to wait. For example, a program that displays video too slowly to keep up with the frame rate is as useless as a program that takes 48 hours to issue a 24-hour weather forecast.

In other words, every program optimization is a tradeoff between human effort and computer time. I believe that human effort is usually much more important than computer time, but not infinitely so and not under all circumstances. Doing a lot of rewriting in order to make a program slightly faster is much less important than doing a small amount of rewriting to make a program a lot faster.

Sometimes an optimization saves relatively more when the program has more work to do. A classic example of such an optimization is replacing a bubble sort with a better sorting algorithm. Bubble sorts are typically O(n²), where n is the number of elements being sorted. There are other algorithms that are O(n log n). The ratio between O(n²) and O(n log n) grows as n increases, to the point that O(n²) algorithms are usually impractical for anything larger than toy programs.

Another factor to keep in mind when you think about optimizing a program is whether your optimization will be dominated by another part of the program's execution time. For example, if a program uses a bubble sort on n elements in several places in its code, and you replace some of those bubble sorts by faster algorithms but leave others as they were, the remaining bubble sorts will take an ever greater share of the total program's execution time as n increases.

In short, the key to finding effective ways to speed up a program is to look at the parts of the program that dominate its execution time and find ways of speeding up those parts that require relatively little programmer effort to implement.

With this background in mind, let's think about moving data rather than copying it. Suppose, for example, that we have two functions, each of which takes a string argument:

void val(string s) { /* … */ }
void refc(const string& s) { /* … */ }

As their names imply, val's parameter is a simple value, and refc's parameter is a reference to const. Accordingly, when we give a variable to each of these functions as its argument:

string s;
val(s);
refc(s);

calling val copies s, and calling refc(s) does not do so. Much of the time, we would expect refc to be faster than val.

How much faster is refc? It depends. One extreme is that each of these functions looks at only a few characters in the string. In that case, copying the string's characters would dominate the functions' execution time, so avoiding copying the characters would make a relatively large difference. The other extreme is that the functions do a lot of work regardless of the string's length, and the amount of work grows at least as rapidly as the string's length. In that case, the time to copy the string never dominates, and there's not much to gain.

There's a third case to consider: Perhaps the functions use the string in a way that effectively copies it anyway. For example, consider a function that returns the reverse of its argument. We might write it this way:

string rev(string s) {
    reverse(s.begin(), s.end());
    return s;
}

Alternatively, we might write it this way:

string rev(const string& s) {
    string t = s;
    reverse(t.begin(), t.end());
    return t;
}

The first of these functions takes advantage of the knowledge that its parameter, s, is a copy of its argument. Because of this knowledge, the function is free to modify the parameter. The second of these functions does not have that freedom, so it has to copy its parameter in order to modify the copy. Both of these functions wind up doing pretty much the same thing.

Note the weasel words "pretty much" in the previous paragraph. They are there because C++11 standard-library strings have move constructors. Consider the expression rev(get_string()), where get_string is any function that returns a string. The subexpression get_string() is an rvalue, so the compiler knows that that subexpression is not going to be used again. As a result, if we are using the first version of rev, the result of get_string() is moved directly into rev's parameter s. No characters are copied. In contrast, if we are using the second version of rev, the function is explicitly making a copy of s when it creates the local variable t. In other words, in C++11 it is actually possible for the second version of rev to be slower than the first.

If you really care enough about execution time that the difference between the two versions of rev matters to you, you should probably be overloading it:

string rev(string&& s) {
    reverse(s.begin(), s.end());
    return s;
}

string rev(const string& s) {
    string t = s;
    reverse(t.begin(), t.end());
    return t;
}

Overloading on an rvalue reference and an ordinary reference to const ensures that a call with any kind of argument will match exactly one of these functions. If the argument is an rvalue, the compiler will choose the first function, which will reverse that rvalue in place. If the argument might be used again later, the compiler will call the second function, which will safely copy the string's characters.

So far, what we have seen is an assortment of small, mundane optimizations. They aren't terribly hard to code; they sometimes help and sometimes don't; and sometimes they even make things worse unless you use overloading to ensure that the compiler always does what you intend. Among the most useful of these optimizations is the one that you don't write at all, namely the one that happens when the compiler turns a copy into a move for you.

If this were all there were to it, you would be justified in wondering what the fuss was about. However, all of the examples so far have made an important, unstated assumption: Copying a string usually does not dominate the execution time of the program that does so. When we use more complicated containers, this assumption falls apart. We shall look more closely at that phenomenon next week.