Two recent comments on different topics got me thinking about averages, and why people like to talk about them more than they like hearing about them.

Toronto transit expert Steve Munro made this comment on the familiar perils of transit operations in that city:

In Toronto, the TTC reports that routes have average loads on vehicles, and that these fit within standards, without disclosing the range of values, or even attempting any estimate of the latent demand the route is not handling because of undependable service. Service actually has been cut on routes where the “averages” look just fine, but the quality of service on the street is terrible. Some of the planning staff understand that extra capacity can be provided by running properly spaced and managed service, but a cultural divide between planning and operations gets in the way.

When it comes to on-time performance, everyone understands that the average isn’t a useful concept. If your buses are ten minutes early half the time and ten minutes late the other half, then you could say that on average they’re right on time. We’re all smart enough not to fall for that.
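To make that arithmetic concrete, here is a minimal sketch. The deviations are invented for illustration, and the 5-minute tolerance (applied to early as well as late trips) is an assumption rather than any agency’s actual standard.

```python
from statistics import mean

# Hypothetical schedule deviations in minutes: negative = early, positive = late.
deviations = [-10, 10, -10, 10, -10, 10]


def average_deviation(devs):
    """Mean deviation from schedule; early and late trips cancel each other out."""
    return mean(devs)


def share_off_schedule(devs, tolerance=5):
    """Fraction of trips more than `tolerance` minutes early or late (assumed threshold)."""
    off = [d for d in devs if abs(d) > tolerance]
    return len(off) / len(devs)


print("average deviation (min):", average_deviation(deviations))
print("share of trips off schedule:", share_off_schedule(deviations))
```

The average comes out to zero while every single trip falls outside the tolerance, which is the whole problem in two numbers.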

But it’s still common to hear average load reported, as Steve mentions. Average load is the total number of passengers passing a point divided by the number of buses or trains that carried them. It has some uses in transit planning as a way of talking about ridership patterns, but it’s not a good way of describing the customer experience. To do that, you’d have to look at how often you have incidents that the customer will hate, such as crush-loading and, worse yet, passing up customers at stops for lack of room. That’s how we talk about on-time performance: the percentage of trips that are more than 5 minutes late. So you should also be reporting the percentage of trips that are crush-loaded or that pass up customers. (How do you count passed-up customers? A good question for another post.)
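A rough sketch of the difference between the two measures follows. The passenger counts and the crush-load threshold are made up for illustration; they are not TTC figures or any agency’s loading standard.

```python
# Hypothetical passenger counts on six consecutive buses past one point.
# Poorly spaced service tends to look like this: a packed bus, then a nearly empty one.
loads = [75, 5, 70, 10, 80, 0]

CRUSH_LOAD = 65  # assumed threshold for illustration, not a real standard


def average_load(loads):
    """Total passengers through the point divided by the number of vehicles."""
    return sum(loads) / len(loads)


def share_crush_loaded(loads, crush_load=CRUSH_LOAD):
    """Fraction of trips carrying at least `crush_load` passengers."""
    return sum(1 for n in loads if n >= crush_load) / len(loads)


print("average load:", average_load(loads))
print("share of trips crush-loaded:", share_crush_loaded(loads))
```

With these numbers the average load is 40, comfortably under the assumed threshold, yet half the trips are crush-loaded.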

Now obviously, if you want to describe how your system looks to your customers, you should also weight those measurements of overloading by the number of people who experience them, just as you would for lateness. If your statistics aren’t weighted that way, then, again, you may be describing your operations but you’re not describing your customer experience.
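Here is one way that weighting might be sketched, again with invented loads and an assumed crush-load threshold. The idea is simply to count passengers on overloaded trips rather than counting the trips themselves.

```python
# Hypothetical passenger counts on six consecutive buses past one point.
loads = [75, 5, 70, 10, 80, 0]

CRUSH_LOAD = 65  # assumed threshold for illustration, not a real standard


def trip_weighted_share(loads, crush_load=CRUSH_LOAD):
    """Share of trips that are crush-loaded: the operations view."""
    return sum(1 for n in loads if n >= crush_load) / len(loads)


def passenger_weighted_share(loads, crush_load=CRUSH_LOAD):
    """Share of passengers riding on a crush-loaded trip: the customer view."""
    crowded_riders = sum(n for n in loads if n >= crush_load)
    return crowded_riders / sum(loads)


print("share of trips crush-loaded:", trip_weighted_share(loads))
print("share of passengers on crush-loaded trips:", passenger_weighted_share(loads))
```

Half the trips are crush-loaded, but because those trips carry most of the riders, more than nine passengers in ten experience one.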

Recently, commenter Calwatch mentioned a similar scourge, the bizarre but commonplace notion of “average frequency,” which sometimes undermines the urgent work of frequent network mapping:

I’m not sure Calwatch is right about this. Here’s the exact disclaimer from the Los Angeles MTA map:

The bus routes on this map run at least every 12 minutes on weekdays throughout the day. Where Metro Rapid and Metro Local lines run together, service is available every 12 minutes or better at Metro Rapid stops. At intermediate local stops, service may operate less frequently.

I can understand that as meaning “The Metro Rapid really does run every 12 minutes, and if there are underlying locals, those locals may be less frequent.” That would be fair, though it raises the question: So why are the less-frequent locals shown on the map?

But Calwatch claims that when they say “every 12 minutes” they merely mean “five buses per hour.” If that’s true, and I haven’t verified it, then they would be guilty of implicit averaging. Five buses an hour could mean five buses bunched at the same time each hour, with hour-long gaps between them. And yes, you could say that on average, that’s a bus every 12 minutes!
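The gap between “five buses per hour” and “every 12 minutes” is easy to check. In this minimal sketch, both timetables are hypothetical and are assumed to repeat every hour. The function names are mine, not anything from an agency’s tools.

```python
def average_headway(trips_per_hour):
    """The implicit average: 60 minutes divided by trips per hour."""
    return 60 / trips_per_hour


def max_gap(departures, period=60):
    """Longest gap in minutes between consecutive departures,
    assuming the timetable repeats every `period` minutes."""
    times = sorted(departures)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    # Gap from the last departure to the first departure of the next cycle.
    gaps.append(times[0] + period - times[-1])
    return max(gaps)


# Departure times in minutes past the hour (hypothetical timetables).
evenly_spaced = [0, 12, 24, 36, 48]
bunched = [0, 0, 0, 0, 0]  # five buses leaving together once an hour

print("average headway (min):", average_headway(5))
print("max gap, evenly spaced (min):", max_gap(evenly_spaced))
print("max gap, bunched (min):", max_gap(bunched))
```

Both timetables have five trips per hour and so the same 12-minute average, but the worst-case gap is 12 minutes in one and 60 in the other.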

So remember:

Nobody cares about average frequency, any more than they care about average lateness or average crowding!

The question customers ask is “What is the worst case I’ll typically experience?” When an agency says the buses come “every 12 minutes,” the customer is going to hear “OK. I’ll never wait longer than 12 minutes.” Is that customer going to get burned by trusting the 12-minute map? I’m sure Los Angeles commenters will fill us in. A comment from someone at the agency would be even more helpful.

I’ve had this conversation with transit agency staffs when doing jobs that required me to map the frequency of the existing system. More than once, I’ve asked for frequency data and been given data on “scheduled trips per hour.” I’ve had to go back and say: Frequency is about maximum waiting time. By its very nature, it’s a maximum, not an average! So tell me the maximum, worst-case scheduled gap between consecutive trips!

So when you hear the word “average” in an agency’s statements, or even if you see an implicit average like “five trips per hour,” ask yourself: Is the average what I care about as a customer?