On September 30, 2014, the TTC’s Bloor-Danforth subway suffered a shutdown from just before 8:00 am until about 3:00 pm on the segment between Ossington and Keele Stations. The problem, as reported elsewhere, was that Metrolinx construction at Bloor Station on the Georgetown corridor had punctured the subway tunnel. While the weather was dry, this was not much of a problem because, fortunately, the intruding beam did not foul the path of trains. However, rain washed mud into the tunnel to the point where the line was no longer operable.

In the wake of the shutdown, there were many complaints about chaotic arrangements for alternate service, although any time a line carrying over 20,000 passengers per hour closes, that’s going to be a huge challenge. The point of this article is not to talk about that incident, but about something that showed up the next day.

According to the TTC’s internal measure of service quality, the BD line managed a 92% rating for “punctual service”. This is lower than the target of 97%, but that it is anywhere near this high shows just how meaningless the measurement really is.

The basic problem lies in what is being measured and reported. Actual headways at various points on the line and various times of day are compared to a target of the scheduled headway plus 3 minutes. This may look simple and meaningful, but the scheme is laden with misleading results:

On the subway during peak periods, service is “punctual” even if it is operating only every 5’20”, or less than half the scheduled level. Off-peak service, depending on the time and day, could have trains almost 8 minutes apart without hurting the score.

There is no measurement of the actual number of trips operated versus the scheduled level (in effect, capacity provided versus capacity advertised). Complete absence of service has little effect because there is only one “gap” (albeit a very large one) after which normal service resumes.

There is no weighting based on the number of riders affected, period of service or location. A “punctual” trip at 1 am with a nearly empty train at Wilson Station counts the same as a train at Bloor-Yonge in the middle of the rush hour. There are more off-peak trips than peak trips, and so their “punctuality” dominates the score.

An added wrinkle is that the TTC only includes in its measurements periods of operation when the headway is unchanged. With the service being so often off-schedule, it would be difficult to say just what the value of “scheduled headway plus 3” actually is at specific points along the route during transitional periods.
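The effect described above can be illustrated with a minimal sketch. It assumes the score is simply the share of observed headways that do not exceed the scheduled headway plus 3 minutes; the TTC’s exact computation is not given here, and the function name, the 2’20” scheduled headway and the sample days are all hypothetical.

```python
def punctuality_score(headways_min, scheduled_min, tolerance_min=3.0):
    """Share of observed headways no longer than scheduled + tolerance.

    Assumed form of the metric, for illustration only.
    """
    if not headways_min:
        raise ValueError("need at least one observed headway")
    limit = scheduled_min + tolerance_min
    ok = sum(1 for h in headways_min if h <= limit)
    return ok / len(headways_min)


# Hypothetical scheduled headway of 2'20", expressed in minutes.
SCHEDULED = 2 + 20 / 60

# Hypothetical day 1: normal service, one seven-hour (420 minute) gap,
# then normal service again. The shutdown counts as a single bad headway.
shutdown_day = [SCHEDULED] * 200 + [420.0] + [SCHEDULED] * 100

# Hypothetical day 2: every second train missing all day, so each
# headway is double the scheduled value but still under the limit.
half_service = [2 * SCHEDULED] * 150

print(f"shutdown day: {punctuality_score(shutdown_day, SCHEDULED):.1%}")
print(f"half service: {punctuality_score(half_service, SCHEDULED):.1%}")
```

Under these assumptions both days score above 99%, because the metric counts headways rather than riders, trips or minutes of lost service.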

All the same, we have a measurement that has been used for years in Toronto and it gives a superficially wonderful score. Sadly, the formula is such that falling below 90% would require a catastrophic event, and some silt in the tunnel does not qualify.

I have talked with the TTC’s Deputy CEO Chris Upfold about this problem before, and do know that the TTC is working on a better way to measure service quality. Upfold is quite open about the challenges:

The numbers are credible in the way they are measured. They are the true reflection of the measure. It’s the measure that is . It’s a standard measure across all properties in N.A. But that doesn’t make it better. The easy way to interpret 92% (and I know you know this) is that 92% of the trains we ran were within 6 minutes of each other. That is credible as a result for a very bad measure. If we ran 1/2 our trains from 7-10 (or so) we could still get 99%. But we know how bad it would be. We’ve talked about this many times and, trust me, no one at the TTC slaps themselves on the back when we even hit our 96 or 97. [Private email, October 2, 2014, used with permission]

That may be the case, but the real issue is that this is what the TTC publishes for public consumption, and the numbers are taken as gospel by politicians including members of the Commission. A more subtle issue will be that any new measurement scheme will almost certainly result in lower (even if more meaningful) scores, and that will set off a round of questions about “why are we so bad now”.

It’s rather like reporting on the weather by scoring for days when the sun shone, somewhere, for at least five minutes, and then wondering why people complained about the rain.

Meanwhile Up On The Street

Things are rather different with surface routes where the scores for bus and streetcar operations were 60% and 66% respectively versus targets of 65% and 70%. Without question the bus network would have been affected on September 30th by the number of vehicles poached from various routes to provide subway shuttle service.

All the same, the idea that “reliable” service consists of being within three minutes (either way) of the scheduled headway roughly 2/3 of the time is quite laughable. Put another way, if the service can be outside of the target range 1/3 of the time, then over a week’s commuting (10 trips), the likelihood that at least one trip won’t be “reliable” is over 98%. (This does not allow for treating each link of a trip as a separate opportunity for unexpected gaps.) Indeed, the probability that every trip is “reliable” falls below 50% for just one day’s round trip.

[The math here is straightforward, akin to flipping a coin, but with an uneven probability of heads or tails. Suppose that “heads” (acceptable service) comes up 2/3 of the time. The chance that this will happen for one trip each way is 2/3 times 2/3 or 4/9 (44%). For ten trips to all come up heads (2/3 to the tenth power), the result is under 2%. In other words, the reliability of every trip must be quite high; otherwise the compound effect of a substantial probability of “tails” all but guarantees at least one below-standard trip in a week or more of travel.]
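The arithmetic can be checked with a short sketch. It assumes each trip is an independent event with the same chance of being “reliable”, which is a simplification; the function names are made up for this example.

```python
def prob_all_reliable(p_reliable, trips):
    """Chance that every one of `trips` independent trips is reliable."""
    return p_reliable ** trips


def prob_at_least_one_bad(p_reliable, trips):
    """Chance that at least one of `trips` independent trips is not reliable."""
    return 1.0 - prob_all_reliable(p_reliable, trips)


def required_reliability(target_all_reliable, trips):
    """Per-trip reliability needed so that all trips are reliable
    with probability `target_all_reliable`."""
    return target_all_reliable ** (1.0 / trips)


p = 2 / 3  # "reliable" roughly two thirds of the time
for trips in (1, 2, 10):
    print(
        trips,
        round(prob_all_reliable(p, trips), 3),
        round(prob_at_least_one_bad(p, trips), 3),
    )

# Per-trip reliability needed for an even chance of a clean ten-trip week.
print(round(required_reliability(0.5, 10), 3))
```

On these assumptions, an even chance of a week without a single below-standard trip needs per-trip reliability of about 93%, well above the 65% and 70% surface targets.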

Quarterly reports on route performance have been discussed and tracked on this site, and some of the numbers are very uncomplimentary to the TTC’s ability to manage service, even in the middle of the night.

There are numerous problems in the measurement of service quality that have been discussed on this site before, but the overwhelming point is that the “standard” was chosen to match the numbers the service was achieving historically, not on the basis of carefully considered policy decisions about what the TTC’s goal should be or an analysis of what might be achieved.

Surface routes operate on a wide variety of headways, and they are also subject to short turns, something that is comparatively rare on the subway.

Riders are subjected to a variety of problems:

As with the subway, bus and streetcar routes with scheduled frequent service can be severely bunched or be missing large numbers of vehicles without violating the ±3 minute headway target.

Reported values are averaged over various time periods and locations on a route with the result that very poor service at certain times and locations can be masked by better service elsewhere.

A span of six minutes in acceptable headways can produce very wide ranges in vehicle crowding, with a good chance that some capacity will be wasted on vehicles running close behind their leaders.

Irregular vehicle spacing leads to slow, heavily loaded “gap” cars and buses, followed by one or more lightly filled vehicles. Riders always try to pack on the first car lest the second one be short-turned and leave them waiting in yet another gap.

Where headways are not so frequent, riders are much more susceptible to problems with on-time performance. However, the TTC does not measure this, nor does it make much effort to ensure that vehicles on infrequent services actually operate “on time”.

Scheduled departure times and vehicle spacing are commonly ignored, as are intermediate time points (something one might reasonably depend on for less frequent service).

The TTC route map advertises many routes as having 10 minute or better service (almost all streetcar lines and many major bus routes). This is not qualified by any indication that much service may not ever reach parts of these routes, and what does can be quite erratic.
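The bunching and crowding problems in this list can be sketched numerically. The example assumes riders arrive at a stop at a steady rate without consulting a schedule, in which case the average wait is the sum of squared headways divided by twice the sum of headways. That assumption, the 5 minute scheduled headway and the function names are illustrative and not taken from TTC data.

```python
def within_target(headways, scheduled, tolerance=3.0):
    """Share of headways within +/- tolerance of the scheduled headway."""
    ok = sum(1 for h in headways if abs(h - scheduled) <= tolerance)
    return ok / len(headways)


def average_wait(headways):
    """Average wait for riders arriving at a steady rate:
    sum(h^2) / (2 * sum(h))."""
    return sum(h * h for h in headways) / (2 * sum(headways))


def share_on_gap_vehicles(headways, scheduled):
    """Share of riders who arrive during longer-than-scheduled headways
    and therefore board the vehicle carrying the gap."""
    return sum(h for h in headways if h > scheduled) / sum(headways)


SCHEDULED = 5.0                    # hypothetical scheduled headway, minutes
even = [5.0, 5.0, 5.0, 5.0]        # evenly spaced service
bunched = [8.0, 2.0, 8.0, 2.0]     # pairs of vehicles, same vehicle count

for name, headways in (("even", even), ("bunched", bunched)):
    print(
        name,
        within_target(headways, SCHEDULED),
        average_wait(headways),
        share_on_gap_vehicles(headways, SCHEDULED),
    )
```

Both patterns score 100% against a ±3 minute target. In the bunched case the average wait rises from 2.5 to 3.4 minutes, and 80% of riders end up on the half of the vehicles that follow a gap.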

The reasons for this situation are complex. Some arise from a long-standing abdication of the need to actively manage service in the face of “traffic congestion”, the TTC’s blanket excuse for all that ails humankind. Some are a direct result of a “TTC culture” that does not monitor or penalize operators for providing erratic service. Some come from bad scheduling with running times that bear no relationship to typical conditions. And some stem from the basic fact that the primary “job” of route supervisors is to keep vehicles or, more importantly, operators “on time” to minimize premium overtime pay.

In such an environment, any measurement system that does not produce information directly related to the quality of service seen by the riders is, at a minimum, worthless if not outright deceptive.

As with measures of subway service, the TTC is working on a system to provide more representative information about surface routes. However, the exact details have not yet been announced. One scheme the TTC is looking to adapt from London (UK) is a “Journey Time Metric” which would track the length of various sample trips on the network as an overall index of network behaviour.

That’s a start, but we need more. At a minimum:

Statistics should be available at a route by route level with breakdowns by route segment and time of day. Telling people that the “average” service scored 80% is meaningless to someone who just waited for over 20 minutes on a route with a scheduled 5 minute headway.

Statistics should remain online for a period so that more than “yesterday” is available for review.

Weekends, now omitted from the process, should be included. Some of the route analyses I have published (as well as much unpublished material) show that weekend evenings have some of the worst service quality.

The statistics should include a comparison of the number of trips actually operated at various points against the scheduled count. The points should be chosen so that the figures are not distorted by trips that short turn just beyond the reference points.

On-time performance should be measured for any service with a headway over 10 minutes, and this should be done not just at terminals but at major timepoints en route.

This sounds like a vast amount of data, and if this were a daily printed report, it would be a lot to compile. However, with computer systems, all of this number crunching can be automated once the basic parameters are in place, and the information can be placed online for review by anyone who is interested.
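As a rough sketch of how such a report could be automated, the example below counts trips scheduled, trips actually operated and trips on time for each route, timepoint and period. The record layout, the on-time window (1 minute early to 5 minutes late) and the sample data are all hypothetical. It also assumes each scheduled trip has already been matched to an observed time, which in practice is the hard part.

```python
from collections import defaultdict


def summarize(observations, early_allow=1.0, late_allow=5.0):
    """Count scheduled, operated and on-time trips per (route, timepoint, period).

    Each observation is (route, timepoint, period, scheduled_min, actual_min),
    with times in minutes after midnight. actual_min is None when the
    scheduled trip never reached the timepoint (missed trip or short turn).
    The on-time window is an illustrative assumption, not a TTC standard.
    """
    totals = defaultdict(lambda: {"scheduled": 0, "operated": 0, "on_time": 0})
    for route, timepoint, period, scheduled, actual in observations:
        cell = totals[(route, timepoint, period)]
        cell["scheduled"] += 1
        if actual is None:
            continue  # scheduled but never operated at this point
        cell["operated"] += 1
        if -early_allow <= actual - scheduled <= late_allow:
            cell["on_time"] += 1
    return dict(totals)


# Hypothetical observations for a made-up route.
sample = [
    ("Route X", "Terminal", "evening", 1200, 1201),  # 1 minute late
    ("Route X", "Terminal", "evening", 1220, 1229),  # 9 minutes late
    ("Route X", "Terminal", "evening", 1240, None),  # trip never appeared
    ("Route X", "Midpoint", "evening", 1215, 1214),  # 1 minute early
]

report = summarize(sample)
for key, counts in sorted(report.items()):
    print(key, counts)
```

Keeping the counts separate for each route, timepoint and period is what prevents poor service in one place from being averaged away by better service elsewhere.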

The TTC owes its customers, the riders, much better information about the quality of service they receive, if only so that management’s feet are held to the fire to explain and correct poor performance.