In the last post, I tried to make delimited ranges fit into the STL and found the result unsatisfying. This time around I’ll be trying the same thing with infinite ranges and will sadly be reaching the same conclusion. But the exercise will point the way toward an uber-Range concept that will subsume delimited ranges, infinite ranges, and STL-ish pair-o’-iterator ranges.

Infinite Ranges

Building motivation for delimited ranges was fairly simple; we’re all familiar with the idea from null-terminated strings. The case for infinite ranges is a bit harder to make. As C++ programmers, we don’t regularly bump into infinity. In other languages, infinity is all in a day’s work. Haskell programmers can create an infinite list of integers as simply as typing [1..]. Does that break your brain? It shouldn’t. It’s a lazy list: the elements are generated on demand. All infinite ranges are necessarily lazy.

What’s the use of that? Consider the take algorithm which constructs a new list from the first N elements of another list. It handles infinite lists with aplomb. Or consider what should happen when you zip an infinite list with a finite one. You end up with a finite list of element pairs. That’s a perfectly sensible thing to do.
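To make that concrete in C++ terms, here is a minimal sketch of a lazy, unbounded sequence together with take and zip. The names ints_from, take, and zip are hypothetical helpers invented for this illustration; they are not part of the STL or of any range library.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical lazy sequence n, n+1, n+2, ... It stores only the next value
// and produces elements on demand, so it never has to "finish".
struct ints_from {
    explicit ints_from(int start) : next_(start) {}
    int operator()() { return next_++; }   // hand out one element per call
private:
    int next_;
};

// take: copy the first n elements of a generator into a finite vector.
// It pulls exactly n values, so an unbounded source is not a problem.
template <class Generator>
std::vector<int> take(Generator gen, std::size_t n) {
    std::vector<int> out;
    out.reserve(n);
    for (std::size_t i = 0; i < n; ++i)
        out.push_back(gen());
    return out;
}

// zip: pair elements of a generator with those of a finite vector.
// The result is exactly as long as the finite input.
template <class Generator>
std::vector<std::pair<int, char>>
zip(Generator gen, std::vector<char> const& finite) {
    std::vector<std::pair<int, char>> out;
    out.reserve(finite.size());
    for (char c : finite)
        out.emplace_back(gen(), c);
    return out;
}
```

Here take(ints_from(1), 3) yields 1, 2, 3, and zipping ints_from(1) with a two-element vector yields two pairs. In both cases the infinite side is only consumed as far as the finite side demands.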

Supporting infinite ranges in a generic range library would be a boon, so it’s worth looking at what it does to the concepts.

Infinite Ranges in the STL

We might think of infinite ranges as a kind of degenerate delimited range where the delimiting predicate always returns false. When we’re trying to reach infinity, our work is never done. With that in mind, let’s implement an infinite range of integers starting at some value and ending never. It’s described below.

    #include <boost/iterator/iterator_facade.hpp>
    #include <iterator>

    struct iota_range
    {
    private:
        int i_;
    public:
        using const_iterator = struct iterator
          : boost::iterator_facade<
                iterator, int const, std::forward_iterator_tag
            >
        {
        private:
            bool sentinel_;   // true only for the end iterator
            int i_;
            friend class boost::iterator_core_access;
            friend struct iota_range;
            iterator(int i) : sentinel_(false), i_(i) {}
            bool equal(iterator that) const
            {
                return sentinel_ == that.sentinel_
                    && i_ == that.i_;
            }
            void increment() { ++i_; }
            int const & dereference() const { return i_; }
        public:
            iterator() : sentinel_(true), i_(0) {}
        };
        constexpr explicit iota_range(int i = 0) : i_(i) {}
        iterator begin() const { return iterator{i_}; }
        iterator end() const { return iterator{}; }
        constexpr explicit operator bool() const { return true; }
    };

With this range, we can do this:

    // Spew all the ints. WARNING: THIS NEVER ENDS!
    for( int i : iota_range() )
        std::cout << i << '\n';

iota_range is a forward range; that is, its iterators model the ForwardIterator concept [1]. They store both an integer and a Boolean signifying whether the iterator is a sentinel or not. The range’s begin iterator is not a sentinel; the end iterator is. Therefore, they will never compare equal, and we’ll count integers … forever!

A Funny Thing Happened on the Way to Infinity

What you’ll find when you use this range in your code is that some things will work as you expect and other things will spin off into hyperspace and never come back. Take a very simple example: std::distance. Presumably, you won’t be foolish enough to do this:

    iota_range iota;
    // Oops!
    auto dist = std::distance(iota.begin(), iota.end());

What’s less clear is that you should never, ever, under any circumstance, pass this range directly or indirectly to any algorithm that does binary searching, including binary_search, lower_bound, upper_bound, and equal_range, despite the fact that iota_range is, in fact, a sorted forward range. Think about it: binary searching is a divide-and-conquer algorithm. Dividing an infinite range yields (surprise!) an infinite range. If you pass an iota_range to any of these algorithms, go get yourself a cup of coffee. You could be waiting a while.
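The mechanics can be observed on a finite range. With forward iterators the binary-search algorithms cannot jump to the middle, so the implementations I’m aware of first walk the whole range to measure it. The sketch below uses a hypothetical counting_iterator, invented for this illustration, that records every increment. Like iota_range’s iterator, it bends the forward-iterator rules by returning a reference to a member.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>

// Hypothetical forward iterator over ints that counts how many times any
// copy of it is incremented.
struct counting_iterator {
    using iterator_category = std::forward_iterator_tag;
    using value_type        = int;
    using difference_type   = std::ptrdiff_t;
    using pointer           = int const*;
    using reference         = int const&;

    explicit counting_iterator(int start = 0) : i(start) {}

    int const& operator*() const { return i; }
    counting_iterator& operator++() { ++i; ++increments; return *this; }
    counting_iterator operator++(int) { auto tmp = *this; ++*this; return tmp; }
    friend bool operator==(counting_iterator a, counting_iterator b) { return a.i == b.i; }
    friend bool operator!=(counting_iterator a, counting_iterator b) { return !(a == b); }

    int i;
    static long increments;   // shared tally across all copies
};
long counting_iterator::increments = 0;

// Run lower_bound over the sorted range [0, n) and report how many
// iterator increments it cost.
long increments_for_lower_bound(int n, int value) {
    counting_iterator::increments = 0;
    counting_iterator first(0), last(n);
    (void)std::lower_bound(first, last, value);
    return counting_iterator::increments;
}
```

Even when the value sought is near the front, the increment count is at least n, because the end of the range has to be reached before the first comparison happens. When the end is unreachable, that first step never completes.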

Performance Problems

If you read the last blog post about delimited ranges, maybe you cringed a bit when you saw the implementation of iota_range::iterator::equal. It is our intention that an iota_range’s iterator will never, ever finish iterating, so the termination condition should be a constant expression. Instead, we have this:

    bool equal(iterator that) const
    {
        return sentinel_ == that.sentinel_
            && i_ == that.i_;
    }

That’s two runtime checks when it should be zero! As I showed last time, this can have a disastrous effect on the quality of the generated code.
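For contrast, here is a rough sketch of what a constant termination condition could look like if the end of the range were allowed to have its own type. The names int_iterator, never, and sum_first are hypothetical and exist only for this illustration. The STL’s algorithms require begin and end to have the same type, so this does not plug into them as-is.

```cpp
// Hypothetical iterator for an endless sequence of ints: it stores only the
// current value, with no "am I the end?" flag.
struct int_iterator {
    int i;
    int operator*() const { return i; }
    int_iterator& operator++() { ++i; return *this; }
};

// Hypothetical end marker of a distinct type. Comparing against it never
// reads the iterator's state, so the answer is known at compile time.
struct never {};
constexpr bool operator==(int_iterator const&, never) { return false; }
constexpr bool operator!=(int_iterator const&, never) { return true; }

// The comparison is usable in a constant expression: zero runtime checks.
static_assert(!(int_iterator{0} == never{}), "the end is never reached");

// Sum n consecutive ints beginning at start. The `it != never{}` test is
// always true; only the explicit counter bounds the loop.
inline int sum_first(int start, int n) {
    int total = 0;
    int_iterator it{start};
    for (int k = 0; k < n && it != never{}; ++k, ++it)
        total += *it;
    return total;
}
```

Because the comparison is constexpr and ignores its arguments, an optimizer has nothing left to test, whereas the two-field equal above has to be evaluated on every iteration.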

Possibly Infinite Ranges

Infinite loops are one problem with infinite ranges, but there’s another more subtle problem, and unfortunately it already exists in the Standard Library. Take our old friend (and my favorite punching bag) std::istream_iterator. It is an input iterator, so it’s required to have an associated difference_type. In “Elements of Programming,” Alexander Stepanov (the father of the STL and of Generic Programming) says this about an Iterator’s difference type:

DistanceType returns an integer type large enough to measure any sequence of applications of successor allowable for the type. [2]

For istream_iterator, the difference_type is std::ptrdiff_t. Now, consider the following code:

    std::istream& sin = ...;
    std::istream_iterator<char> it{sin}, end;
    std::ptrdiff_t dis = std::distance(it, end);

This is perfectly reasonable and valid code. It pulls characters out of the istream, counts them, and discards them. Now, imagine sin is pulling characters from the network, and that this code runs for days, pulling billions and billions of characters off the net. What happens when a ptrdiff_t isn’t big enough to hold the result? Answer: undefined behavior. In practice, you’ll get garbage, but in principle, anything could happen.

To me, that’s a little disconcerting. An iterator’s difference_type should be big enough to hold the distance between any two iterators. Since input streams are unbounded in principle, there is no scalar signed integer type that’s big enough. Huh. We’re forced to conclude that the validity of istream_iterator’s increment operation is limited by the size of its difference_type, or that istream_iterator’s difference_type is wrong. Again: Huh.

Summary, For Now…

Infinite ranges are useful, but they have real problems given the current definition of the STL. You might think that disallowing infinite ranges avoids the problem, but it’s more fundamental than that. In fact, some problems exist today. It’s hard to fix the difference_type overflow issue in the STL today (apart from telling people to be careful), but it’s worth considering whether a new range-based interface can help. (So as not to raise expectations, I’ll say now that this is a vexing problem that I don’t yet have a great solution to.)

Summing up, here are the issues I’ve identified so far with STL-ish pair-o’-iterators-style ranges:

Delimited and infinite ranges generate poor code

They are forced to model weaker concepts than they might otherwise

Also, they’re awkward to implement

It’s too easy to pass an infinite range to an algorithm that can’t handle it

Possibly-infinite ranges can overflow their difference_type

In the next installment, I’ll describe the conceptual foundations of my new range library that strikes at the root of these problems. Stay tuned.

1. Actually, this is a bit of a lie. Forward iterators aren’t supposed to return references to objects inside them. Please ignore this for the sake of discussion.

2. Stepanov, A; McJones, P. Elements of Programming. Addison-Wesley. 2009.