The Pearly Gates club never closes. Its public entrance, a revolving door, just keeps on spinning. With each rotation some punters enter and others leave. The club’s owners would like to track this traffic. Specifically, they’d like to know the maximum increase in people entering the club over a given period.

The starting point is to track the people who enter/leave with each spin of the door. Here’s a 5 minute sample of that information. Negative numbers mean more people left than entered during a particular cycle.

Entries Log 0 1 2 -3 3 -1 0 -4 0 -1 -4 2 4 1 1 3 1 0 -2 -3 -3 -2 3 1 1 4 5 -3 -2 -1 ...

Here’s the same information plotted on a graph.

The archetypal problem we’d like to solve can be stated:

Given a sequence of numbers, find the maximum sum of a contiguous subsequence of those numbers.

As an example, the maximum sum contiguous subsequence of 0, -1, 2, -1, 3, -1, 0 would be 4 (= 2 + -1 + 3).

This problem is generally known as the maximum sum contiguous subsequence problem and if you haven’t encountered it before, I’d recommend trying to solve it before reading on. Even if you have encountered it before, I’ll invite you to read on anyway — it’s well worth another look.

The maximum sum contiguous subsequence problem appears in Jon Bentley’s “Programming Pearls”. He first presents a brute force solution which examines all possible contiguous subsequences of the initial sequence and returns the maximum sum of these subsequences.

A Python implementation might read:

def generate_pairs(n): "Generate all pairs (i, j) such that 0 <= i <= j < n" for i in range(n): for j in range(i, n): yield i, j def max_sum_subsequence(seq): "Return the max-sum contiguous subsequence of the input sequence." return max(sum(seq[i:j]) for i, j in generate_pairs(len(seq) + 1))

It’s a straightforward piece of code, though note the + 1 which ensures that we slice to the end of seq , and also that we include empty slices, which sum to 0 , handling the case when every item in the sequence is negative. The trouble is, the algorithm is of cubic complexity: to process just 6 hours of logged activity takes over 2 minutes on a 2GHz Intel Core Duo MacBook, and the cubic nature of the algorithm means we’d quickly fail to process more substantial log files in real time.

A simple optimisation eliminates the repeated calls to sum by accumulating the input sequence — the red line in the graph above. Subtracting element i-1 from element j of this cumulative sequence gives us the sum of elements in the range i, j of the original sequence. We won’t study the code for this quadratic solution — it doesn’t add much to our analysis. Again, some care is needed to avoid fencepost problems.

We won’t look at the divide-and-conquer NlogN solution either. It’s hard to understand, and we can do far better.

There is a linear solution. The idea is to scan the sequence from start to finish keeping track of maxsofar , the maximum sum of a contiguous subsequence seen so far, and maxendinghere , the maximum sum of a contiguous subsequence which ends at the current position. Bentley’s pseudo-code reads:

maxsofar = 0 maxendinghere = 0 for i = [0, n) /* invariant: maxendinghere and maxsofar are accurate are accurate for x[0..i-1] */ maxendinghere = max(maxendinghere + x[i], 0) maxsofar = max(maxsofar, maxendinghere)

This translates directly into Python.

def max_sum_subsequence(seq): maxsofar = 0 maxendinghere = 0 for s in seq: # invariant: maxendinghere and maxsofar are accurate # are accurate up to s maxendinghere = max(maxendinghere + s, 0) maxsofar = max(maxsofar, maxendinghere) return maxsofar

Now, this is a fabulous solution. Bentley describes it as subtle. Such a succinct code snippet hardly looks subtle, but I agree, the loop body does take a bit of understanding:

maxendinghere = max(maxendinghere + s, 0) maxsofar = max(maxsofar, maxendinghere)

Why does this work?

Well, essentially maxendinghere is what’s accumulating the subsequences — it keeps rolling the next element into itself. Should this accumulated sum ever become negative we know that the subsequence-which-ends-here we’re currently tracking is worse than the empty subsequence-which-restarts-here; so we can reset our subsequence accumulator, and the first clause of the loop invariant still holds. Combine this with the observation that maxsofar tracks peaks in maxendinghere and we’re done.

The loop-invariant comment provides a good example of how comments can help us understand an algorithm, even though the code is minimal and the variable names are well-chosen.

I prefer to think of this problem in terms of streams — lazily evaluated sequences. Think of our log file as generating a stream of numbers:

... 0 1 2 -3 3 -1 0 -4 0 -1 -4 2 4 1 1 3 1 0 -2 -3 -3 -2 3 1 1 4 5 -3 -2 -1 ...

The first thing we do is transform this stream to generate another stream, the cumulative sum of numbers seen so far. It’s an integration of sorts. You’ll remember we already used this stream, or an in-memory version of it, in our quadratic solution to the problem: the difference between points on it yields subsequence-sums.

We generate the accumulated stream from our original stream like this:

def stream_accumulate(stream): total = 0 for s in stream: total += s yield total

The graph below samples the first five minutes of this stream. The red line accumulates values from the pale grey line.

These accumulated numbers represent the number of members who have entered the club since we started tracking them. On our graph, the maximum sum contiguous subsequence is simply the greatest Y-increase between any two points on this graph. X’s mark these points on the graph above. (Note: it’s not the Y-range of the graph we want since our X-values are time-ordered, and we require X1 <= X2).

A second transformation yields the floor of the accumulated stream.

import sys def stream_floor(stream): m = 0 for s in stream: m = min(m, s) yield m

(Note that, for our purposes, the floor of the stream isn’t exactly the stream of minimum values taken by the stream — we enforce a baseline at zero. It would be better to allow clients of this function to supply an optional baseline value, but I wanted the simplest possible code that shows the idea.)

Here’s a graph plotting the accumulated entries alongside the floor of these entries.

We’re very close to what we want now. We can track Y-increases on the graph just by generating the difference between the accumulated stream and its floor — the shading on the graph.

Here’s an implementation of stream_diff . We can’t just plug a minus sign “-” into the mapping function, so we have to use the less wieldy operator.sub .

import itertools import operator def stream_diff(s, t): return itertools.imap(operator.sub, s, t)

Alternatively, we could generate the new stream with an explicit loop:

import itertools def stream_diff(s, t): for ss, tt in itertools.izip(s, t): yield ss - tt

The final graph shows us the difference between the accumulated entry count and its floor. I’ve also added the ceiling of this stream as a thick red line (I’m sure you can figure out how to implement stream_ceiling ), and this ceiling represents the stream of maximum sum contiguous subsequences.

We’ve re-labelled the lines Max-so-far and Max-ending-here because they’re the stream of values taken by the variables maxsofar and maxendinghere during Bentley’s clever solution to the maximum sum contiguous subsequence problem. I think we’re in a better position to understand how this solution works now.

Please don’t imagine these streams are bloated. They may be infinite (remember the Pearly Gates club never closes!) but that doesn’t mean they take up much space. The graphs shown represent snapshots of their activity, and at no point do our presented algorithms actually store a five minute buffer of entries.

A final solution to the maximum sum contiguous subsequence problem reads like this. We’ve pushed the general purpose stream transformation functions into a separate module, stream.py .

import itertools import stream def max_sum_subsequence_stream(ss): "Return the stream of max sum contiguous subsequences of the input iterable." accu1, accu2 = itertools.tee(stream.accumulate(ss)) return stream.ceil(stream.diff(accu1, stream.floor(accu2, baseline=0))) def max_sum_subsequence(ss): "Return the max sum of a contiguous subsequence of the input iterable." return stream.last(max_sum_subsequence_stream(ss))

The iterable supplied to max_sum_subsequence has its last value read, and should therefore be bounded if we want the function to return. We haven’t supplied arguments to extract a portion of this iterable (to generate maximum subsequences for the club on a particular day, for example) because that’s what itertools.islice is for.

Note that max_sum_subsequence_stream() may be more useful to clients than max_sum_subsequence() . Suppose, for example, we’re only interested when the maximum sum subsequence exceeds 100. We can do this directly by connecting itertools.dropwhile() to our function.

def max_subseq_exceeds(seq, limit=100): max_sub_s = max_sum_subsequence_stream(seq) return itertools.dropwhile(lambda s: s <= limit, max_sub_s)

Perhaps we’d like to know if the maximum sum subsequence reaches a plateau; that is, it stays on a level for a while.

Here’s the stream module.