At GoingNative back in September, Andrei Alexandrescu posed an interesting question about API design and C++11 that has had me scratching my head for a month. It was about the design of std::getline:

// Read a line from sin and fill in buf. Return sin.
std::istream& getline(std::istream& sin, std::string& buf)
{
    buf.clear();
    // ... fill in buf
    return sin;
}

Seasoned programmers recognize this pattern: The function takes the buffer by non-const reference and fills it in. They also know why the interface is designed this way: Because containers like std::string are too expensive to copy to consider returning one by value. APIs designed like this have traditionally had the benefit of being efficient, at the expense of some awkwardness at the call site:

std::string buf;
std::getline(std::cin, buf);
use_line(buf);

In C++11, standard containers like std::string are moveable, so returning one by value is darn near free. So, perhaps a better API design would look like this:

// Should getline look like this instead?
std::string getline(std::istream& sin)
{
    std::string buf;
    // ... fill in buf
    return buf; // This gets moved out efficiently
}

That allows a more concise, natural usage, and doesn’t force the user to create a named variable:

use_line(getline(std::cin));

That’s nice, right? I mean, aside from the obvious shortcoming that now you can’t tell whether getline succeeded or not. Oops. But even overlooking that, there’s an issue here.

Performance, Performance, Performance

You might think that because of move semantics, we don’t have to worry about the lousy performance of returning expensive collections by value, and you’d be right. Sort of. But consider this use of getline:

std::string buf;
while(std::getline(std::cin, buf))
    use_line(buf);

Now consider what this code would be doing if, instead of taking buf as an out parameter, getline created a new string each time and returned it by value. Well, it’s creating a new string each time, duh. But the code above doesn’t do that. After a few times through the loop, buf will probably be big enough to hold whatever lines will be read next, and that space can be reused with no further allocations. Much, much faster.
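To make the difference concrete, here is a minimal sketch of the two loops side by side. The names read_line_by_value, total_length_reusing and total_length_by_value are hypothetical, introduced only for illustration. Typical standard library implementations keep a string's capacity when std::getline refills it, which is what lets the first loop settle into allocation-free reads; that is common behavior rather than a guarantee of the standard.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical by-value variant: a fresh string is built on every call.
std::string read_line_by_value(std::istream& in)
{
    std::string buf;
    std::getline(in, buf);
    return buf; // moved (or elided) out, but its storage is new each time
}

// Out-parameter style: one buffer, refilled on each iteration.
std::size_t total_length_reusing(std::istream& in)
{
    std::string buf;
    std::size_t total = 0;
    while(std::getline(in, buf))
        total += buf.size();
    return total;
}

// By-value style: success has to be checked on the stream separately,
// and each line lives in a newly constructed string.
std::size_t total_length_by_value(std::istream& in)
{
    std::size_t total = 0;
    for(;;)
    {
        std::string line = read_line_by_value(in);
        if(!in)
            break;
        total += line.size();
    }
    return total;
}

// Usage: both loops see the same lines; only the buffer handling differs.
inline void check_line_loops()
{
    std::istringstream a("one\ntwo\nthree\n");
    std::istringstream b("one\ntwo\nthree\n");
    assert(total_length_reusing(a) == total_length_by_value(b));
}
```

The by-value loop also shows the earlier shortcoming in practice: the caller has to ask the stream whether the read succeeded.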

Back To The Drawing Board

During GoingNative, Andrei left getline there. (It turns out he prefers a different design, and we’ll be arriving at a similar conclusion.) I wanted to continue the discussion. Out parameters are ugly and awkward to use, they hurt API composability, they force you to declare objects and initialize them in separate steps, they cause acne, etc. Surely something could be done!

I studied the problematic code some more:

std::string buf;
while(std::getline(std::cin, buf))
    use_line(buf);

What is this code doing? It’s reading a bunch of lines and processing them one at a time, right? You might even say, it’s returning a range of lines. Then it hit me: std::getline is the wrong API! It should be called getlines (plural), and it should return a range of strings. Take a look:

for(std::string& buf : getlines(std::cin))
    use_line(buf);

This API feels right-er to me. Not only is it easier to use (look ma! one fewer line!), it doesn’t force a two-step initialization of any objects, and ranges and range operations compose. (More on that later.) It also doesn’t suffer from the performance problems of my first attempt, although it takes some work to see why.

Lazy Ranges

What does my getlines function return? Surely it doesn’t fill in a std::vector of strings and return that. That would be (a) dumb, (b) expensive, and (c) impossible in practice since a potentially infinite number of lines could be read from an istream. Instead, getlines does something smarter: it returns a lazy range.

A lazy range is something that generates elements on demand. The STL already has such a thing: std::istream_iterator. You can create a range out of istream_iterators that pulls characters (or ints or whatever) from an istream on demand. We need something like that, but for lines.
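As a small illustration of that on-demand behavior, here is a sketch using only the standard library. The function names sum_ints and count_tokens are hypothetical. Nothing is buffered up front: each increment of the iterator performs one formatted read from the stream.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <numeric>
#include <sstream>
#include <string>

// Sum whitespace-separated ints, pulling each one from the stream on demand.
int sum_ints(std::istream& in)
{
    std::istream_iterator<int> first(in), last; // default-constructed == end of stream
    return std::accumulate(first, last, 0);
}

// With std::string as the element type, each element is a
// whitespace-delimited token (operator>> semantics), not a line.
std::ptrdiff_t count_tokens(std::istream& in)
{
    std::istream_iterator<std::string> first(in), last;
    return std::distance(first, last);
}

// Usage: two lines of text, but four tokens.
inline void check_istream_iterator()
{
    std::istringstream numbers("1 2 3 4");
    assert(sum_ints(numbers) == 10);

    std::istringstream text("two words\nnext line\n");
    assert(count_tokens(text) == 4);
}
```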

Unfortunately, we can’t press istream_iterator into service for us: istream_iterator<std::string> reads with operator>>, which splits on whitespace rather than on line boundaries. Instead, we need to write our own iterator type, and build a valid range out of that. This is a painful and verbose programming exercise, but Boost.Iterator can help. It has some helpers that let you build iterators from a fairly minimal interface. Without further ado, here is the lines_iterator:

#include <istream>
#include <iterator>
#include <string>
#include <boost/iterator/iterator_facade.hpp>

struct lines_iterator
  : boost::iterator_facade<
        lines_iterator,
        std::string,            // value type
        std::input_iterator_tag // category
    >
{
    // Default-constructed iterator is the end-of-range sentinel
    lines_iterator() : psin_{}, pstr_{}, delim_{} {}
    lines_iterator(std::istream *psin, std::string *pstr, char delim)
      : psin_(psin), pstr_(pstr), delim_(delim)
    {
        increment(); // read the first line eagerly
    }
private:
    friend class boost::iterator_core_access;

    void increment()
    {
        // On failure, become the end-of-range sentinel
        if(!std::getline(*psin_, *pstr_, delim_))
            *this = lines_iterator{};
    }

    bool equal(lines_iterator const & that) const
    {
        return pstr_ == that.pstr_;
    }

    std::string & dereference() const
    {
        return *pstr_;
    }

    std::istream *psin_;
    std::string *pstr_;
    char delim_;
};

The magic happens when you increment a lines_iterator, which happens in lines_iterator::increment. std::getline is called, and it fills in a buffer referred to by pstr_. Note that it uses the same buffer every time. And when you dereference a lines_iterator, it returns a reference to that buffer. No copying, no unnecessary allocation.

Where does the buffer referred to by pstr_ live? In the lines_range object, which is returned by getlines.

#include <boost/range/iterator_range.hpp>

using lines_range_base = boost::iterator_range<lines_iterator>;

struct lines_range_data { std::string str_; };

// Note: the stored iterators point at this object's own str_, so
// iterate the range in place rather than through a copy of it.
struct lines_range
  : private lines_range_data, lines_range_base
{
    explicit lines_range(std::istream & sin, char delim = '\n')
      : lines_range_base{
            lines_iterator{&sin, &str_, delim},
            lines_iterator{}}
    {}
};

inline
lines_range getlines(std::istream& sin, char delim = '\n')
{
    return lines_range{sin, delim};
}

lines_range is really just a boost::iterator_range of lines_iterators. Some contortion was needed to initialize the str_ member before the iterator_range constructor was called (hence the need for lines_range_data), but that’s just an implementation artifact.

The long and short of it is this: when you call getlines, you get back a lines_range object, which is basically a free operation. Now you can call .begin() and .end() on it, or directly iterate over it using a range-based for loop, like I showed. No more memory allocations are done using this interface than with the original std::getline API. Nice, eh?
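For readers without Boost at hand, the same idea can be sketched with the standard library alone. This is not the implementation above: line_range, get_lines and collect_lines are hypothetical names, and the iterator is only as complete as a range-based for loop needs. One deliberate difference is that the iterator is created in begin() and points at the range object itself, so it never refers to a buffer inside an object that has since been copied or moved.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// A lazy range of lines that owns the single reusable buffer.
class line_range
{
public:
    class iterator
    {
    public:
        typedef std::input_iterator_tag iterator_category;
        typedef std::string value_type;
        typedef std::ptrdiff_t difference_type;
        typedef std::string* pointer;
        typedef std::string& reference;

        iterator() : owner_(nullptr) {}                       // end sentinel
        explicit iterator(line_range* owner) : owner_(owner) { ++*this; }

        reference operator*() const { return owner_->buf_; }  // no copy

        iterator& operator++()
        {
            // Refill the shared buffer; become the sentinel on failure.
            if(!std::getline(*owner_->in_, owner_->buf_, owner_->delim_))
                owner_ = nullptr;
            return *this;
        }

        friend bool operator==(iterator const& a, iterator const& b)
        { return a.owner_ == b.owner_; }
        friend bool operator!=(iterator const& a, iterator const& b)
        { return !(a == b); }

    private:
        line_range* owner_;
    };

    explicit line_range(std::istream& in, char delim = '\n')
      : in_(&in), delim_(delim) {}

    iterator begin() { return iterator(this); } // reads the first line
    iterator end() { return iterator(); }

private:
    std::istream* in_;
    std::string buf_;
    char delim_;
};

inline line_range get_lines(std::istream& in, char delim = '\n')
{
    return line_range(in, delim);
}

// Usage: copy each line out of the shared buffer into a vector.
inline std::vector<std::string> collect_lines(std::istream& in)
{
    std::vector<std::string> out;
    for(std::string& line : get_lines(in))
        out.push_back(line);
    return out;
}
```

collect_lines copies on purpose, to have something to inspect; a caller that only needs to look at each line can use the reference directly and reuse the one buffer throughout.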

Composability of Ranges and Range Algorithms

There are lots of reasons to prefer the range-based getlines API, and range-based interfaces in general. The most immediate benefit is that people can use range-based for loops, as I showed above. But the real power comes once you start using range algorithms and range adaptors. Both Boost and Adobe’s ASL provide powerful utilities for working with ranges, and the C++ Standardization Committee has a working group dedicated to ranges for some future version of the standard. And for good reason! Range operations compose, so for instance you could do something like this:

// Read some lines, select the ones that satisfy
// some predicate, transform them in some way and
// echo them back out
boost::copy(
    getlines(std::cin)
        | boost::adaptors::filtered(some_pred)
        | boost::adaptors::transformed(some_func),
    std::ostream_iterator<std::string>(std::cout, "\n"));

That’s strong stuff. I shudder to think what the equivalent code would look like with straight iterators and STL algorithms.
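To pin down what that pipeline computes, here is a hand-rolled sketch of the same read, filter, transform and echo steps as one plain loop. The predicate is_not_blank and the transformation to_upper are hypothetical stand-ins for some_pred and some_func, and filter_transform_lines is an illustrative name. The loop does the job, but the three stages are fused together; none of them can be reused or recombined the way the adaptors can.

```cpp
#include <cassert>
#include <cctype>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical predicate: true if the line has any non-whitespace character.
inline bool is_not_blank(std::string const& s)
{
    for(char c : s)
        if(!std::isspace(static_cast<unsigned char>(c)))
            return true;
    return false;
}

// Hypothetical transformation: upper-case copy of the line.
inline std::string to_upper(std::string s)
{
    for(char& c : s)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return s;
}

// Read lines, keep those satisfying pred, write func(line) for each.
template<typename Pred, typename Func>
void filter_transform_lines(std::istream& in, std::ostream& out,
                            Pred pred, Func func)
{
    std::string buf;
    while(std::getline(in, buf))
        if(pred(buf))
            out << func(buf) << '\n';
}

// Usage: blank lines are dropped, the rest are upper-cased.
inline void check_filter_transform()
{
    std::istringstream in("abc\n\n   \nxyz\n");
    std::ostringstream out;
    filter_transform_lines(in, out, is_not_blank, to_upper);
    assert(out.str() == "ABC\nXYZ\n");
}
```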

But what if you just want to read a single line? Doesn’t the new getlines hurt you for this simple usage scenario? Nope! All we need is one perfectly general function that returns the first element of a range. Let’s call it front:

using std::begin;

// return the front of any range
template<typename Range>
auto front(Range && rng)
    -> decltype(boost::make_optional(*begin(rng)))
{
    for(auto x : rng)
        return x;
    return boost::none;
}

Since a range might be empty, we need to return an optional. Now you can read a single line from an istream like this:

if(auto s = front(getlines(std::cin)))
    use_line(*s);

Compare this to the original and I think you’ll see it’s no worse:

std::string str;
if(std::getline(std::cin, str))
    use_line(str);

Stateful Algorithms

So have we completely addressed all of Andrei’s concerns with getline? Yes and no. Certainly we’ve fixed getline, but Andrei’s point was bigger. He was showing that you can’t just blindly pass and return by value, hoping that move semantics will magically make your programs faster. And that’s a valid point. I can’t say anything that changes that fact.

I think getline is a curious example because what looks at first blush like a pure out parameter is, in fact, an in/out parameter; on the way in, getline uses the passed-in buffer’s capacity to make it more efficient. This puts getline into a large class of algorithms that work better when they have a chance to cache or precompute something. And I can say something about that.

If your algorithm needs a cache or a precomputed data structure, then your algorithm is inherently stateful. One option is to pass the state in every time, as getline does. A better option is to encapsulate the state in some object that implements the algorithm. In our case, the state was the buffer and the object was the range. To take another case, Boyer-Moore search is faster than strstr because it precomputes stuff. In the Boost implementation, boyer_moore is a stateful function object that keeps its precomputed part private.
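The shape of such an object can be sketched in a few lines. The class below is not Boost's boyer_moore; it is a hypothetical horspool_searcher implementing the simpler Boyer-Moore-Horspool variant, chosen only because its precomputed state (a single shift table) is small enough to show in full. The table is built once in the constructor, kept private, and reused by every call.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stateful search object (Boyer-Moore-Horspool).
class horspool_searcher
{
public:
    explicit horspool_searcher(std::string pattern)
      : pattern_(std::move(pattern))
    {
        // Precompute: how far the window may slide when a given
        // character is aligned with the last pattern position.
        shift_.fill(pattern_.size());
        for(std::size_t i = 0; i + 1 < pattern_.size(); ++i)
            shift_[static_cast<unsigned char>(pattern_[i])] =
                pattern_.size() - 1 - i;
    }

    // Index of the first match in text, or std::string::npos.
    std::size_t operator()(std::string const& text) const
    {
        std::size_t const m = pattern_.size();
        if(m == 0)
            return 0;
        std::size_t pos = 0;
        while(pos + m <= text.size())
        {
            if(text.compare(pos, m, pattern_) == 0)
                return pos;
            pos += shift_[static_cast<unsigned char>(text[pos + m - 1])];
        }
        return std::string::npos;
    }

private:
    std::string pattern_;
    std::array<std::size_t, 256> shift_; // the precomputed, private state
};

// Usage: build once, search many texts without recomputing the table.
inline void check_searcher()
{
    horspool_searcher find_needle("needle");
    std::string const text = "haystack with needle inside";
    assert(find_needle(text) == text.find("needle"));
    assert(find_needle("no match here") == std::string::npos);
}
```

Callers never see or pass the table; they hold the object for as long as they want the precomputation to pay off, which is the same arrangement as the range holding the line buffer.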

Summary

Here are the key take-aways:

If your algorithm runs faster with a cache or a precomputed data structure, encapsulate the state in an object that implements the algorithm, rather than forcing your users to pass the state in.

API design must be guided by the expected usage scenarios of the API, and also the common idioms of modern C++11.

Ranges are a powerful abstraction because operations on them compose.

Boost.Iterator and Boost.Range greatly simplify the job of implementing custom ranges.

Thanks for reading!