The Mets’ greatest comparative advantage is down a run with one out and a runner on third (+16.5%), while their weakest scenario is down two with nobody out and a runner on third (-44.6%).

You may have noticed that a lot of the larger discrepancies for both teams take place when there is only a runner on third base, often with nobody out. Is there something inherent to these situations that makes our hometown teams particularly good or bad when compared to generic figures?

Yes and no. Are the Yankees or Mets, by some complex function of their franchises’ make-up and strategy, fundamentally unique teams when faced with a single runner on third base? Most likely not. But these scenarios do have an aspect that explains their extreme percentages: rarity.

Getting a runner to third base, especially without recording an out, is a difficult task. Either a batter must smack a triple to lead off an inning, far and away the most infrequent type of basehit, or he must get on base and manage his way to third through a combination of speed and guile, all without the aid of an out to push him there. The relative rarity of these scenarios allows unusual outcomes to swing a heavier stick, no pun intended, when influencing the final numbers. An unlikely win for the Mets moves the percentages more in a sparsely populated event pool.

Getting to third base with zero outs has been difficult for over a century.

Moreover, infrequent events put a greater emphasis on the factors that are otherwise sanded away by the immense volume associated with more commonplace developments. Because of the large sample of, let’s say, being down one run with nobody out and the bases empty, we don’t really have to consider a whole host of more granular details — the quality of hitter in each instance, how tired the pitcher is, the temperature, the angle of the sun as it reflects off a fan’s glasses in the third row, etc. — because they all come out in the wash during the endless spin cycle of baseball’s centuries-long history.

But if we only have a handful of events to work with, suddenly these situational factors become much more significant. If Babe Ruth was facing a drunken, one-eyed pitcher in one of those rare runner-on-third situations, then of course the Yankees’ odds should be higher.
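To get a feel for how much a thin sample can mislead, here is a rough simulation sketch. It assumes a made-up team whose true chance of winning a given scenario is exactly 50%, and asks how far the observed win percentage typically lands from that truth when the scenario comes up 5, 50, or 500 times. The function name `average_miss` and all of the numbers are illustrative assumptions, not anything drawn from the Yankees or Mets data.

```python
import random


def average_miss(true_win_pct, n_games, n_trials=500, seed=0):
    """Average gap between observed and true win percentage.

    Simulates n_trials hypothetical histories, each with n_games
    coin-flip-style games won with probability true_win_pct.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(n_trials):
        wins = sum(rng.random() < true_win_pct for _ in range(n_games))
        total += abs(wins / n_games - true_win_pct)
    return total / n_trials


# Hypothetical scenario sizes: rare, uncommon, commonplace.
for n_games in (5, 50, 500):
    print(n_games, round(average_miss(0.5, n_games), 3))
```

The gap should shrink steadily as the number of games grows, which is the same effect that lets a single odd result loom so large in a rare scenario.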

We can extrapolate this idea beyond just runner-on-third scenarios. We have data for the frequency of each scenario — we needed it to compute the probabilities — so we can examine the relationship between the number of times certain scenarios occur for each New York team, and the degree to which that team’s performance deviates from what’s expected (defined in absolute terms, since we don’t care if the Yankees or Mets are better or worse than average, just if they’re different).

Specifically, we can graph our independent variable, the frequency of each situation, against how much the team’s win percentage in that situation deviates from the league-wide average. Here is the idea applied to the Yankees’ data:

Without further manipulating the datapoints or imposing a trendline, the pattern we suspected is readily apparent: the Yankees escape convention only in low-frequency scenarios. As we move along the horizontal toward more commonplace situations, the amount by which the Yankees’ performance strays from the expected is magnetized toward zero.
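For anyone who wants to rebuild the points on a chart like this, the calculation is simple enough to sketch. The snippet below assumes each scenario is stored as a (wins, games) pair for the team alongside a league-wide win probability. The function name, the dictionary layout, and the two sample rows are all placeholders invented for illustration, not the real figures.

```python
def deviation_points(team_records, league_win_pct):
    """Return (frequency, absolute deviation) pairs, one per scenario.

    team_records: scenario -> (wins, games) for one team
    league_win_pct: scenario -> league-wide win probability (0 to 1)
    """
    points = []
    for scenario, (wins, games) in team_records.items():
        if games == 0:
            continue  # no appearances, so no team win percentage to compare
        team_pct = wins / games
        points.append((games, abs(team_pct - league_win_pct[scenario])))
    return sorted(points)


# Placeholder numbers for illustration only, not real Yankees or league data.
team_records = {
    "down 1, 0 out, bases empty": (480, 1000),
    "down 2, 0 out, runner on 3rd": (1, 4),
}
league_win_pct = {
    "down 1, 0 out, bases empty": 0.45,
    "down 2, 0 out, runner on 3rd": 0.50,
}

for games, deviation in deviation_points(team_records, league_win_pct):
    print(f"{games:>5} games -> deviation {deviation:.3f}")
```

Taking the absolute value is what makes "better than average" and "worse than average" count the same, since all we care about here is how different the team is.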

If we redefine our horizontal axis logarithmically so that equal real estate exists between powers of 10, we can more easily see where the datapoints live:

You may be accustomed to seeing lines cut through clumps of data in an attempt to predict future values. The nature of the information above, however, makes it so that such a model would have poor foresight, compromised by the points sprinkled in the lower left quadrant of the plane that ruin any semblance of a consistent pattern. Instead, something more like a “line in the sand” has been superimposed that the data (save one brave soul) don’t dare cross.

Here lies a valuable statistical lesson: information can be inferred from where data doesn’t exist in addition to where it does. In this case, even though our variables would be difficult to replicate in a model, we can still provide a function that says, “tell me how often a scenario occurs and I’ll give you the maximum amount it can deviate from the expected.” (If you’re wondering, that function is y = 0.45 - 0.145 log(x).)
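As a quick sketch of what that "line in the sand" implies, the function can be evaluated directly. One assumption to flag: the base of the logarithm isn't spelled out, so the code below takes it to be base 10, which seems the natural reading given the powers-of-10 axis. The name `max_deviation` and the floor at zero are my own additions for illustration.

```python
import math


def max_deviation(frequency, intercept=0.45, slope=0.145):
    """Ceiling on a scenario's absolute deviation from the league average.

    Assumes a base-10 logarithm (not stated in the text). The result is
    floored at 0 because an absolute deviation cannot be negative.
    """
    if frequency <= 0:
        raise ValueError("frequency must be positive")
    return max(0.0, intercept - slope * math.log10(frequency))


# How the ceiling drops as a scenario becomes more common.
for n in (1, 10, 100, 1000):
    print(n, round(max_deviation(n), 3))
```

Under that base-10 assumption, a scenario seen only once could miss by as much as 45 percentage points, while one seen a thousand times is pinned to within a point or two of the league figure.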

The equivalent graph can be drawn up for the Mets:

Here’s the kicker: the Mets’ red line isn’t based on the Mets data. It’s grafted straight from the Yankees chart, giving credence to the idea that its equation, or something very similar to it, might have universal applicability.

The takeaway evident in all of these charts has been worn out by statisticians for years: add to your sample size and your data will start to mimic established patterns. However, the majority of people don’t go to enough baseball games to benefit from a lesson like that. The casual fan might watch their favorite team a handful of times per season; many more will see just a single game.

So don’t be the guy who breaks out a creased win-percentage chart from his back pocket in the ninth inning and declares it highly logical to go home, because a sample size of one is an invitation for surprise. Case in point: for my 12th birthday I went to a Yankees-Padres game at the old stadium. The Yankees were down 2–0 with nobody on and two outs in the bottom of the ninth. According to our original table, they would win a game like this about once every fifty tries.

But they did, and in spectacular fashion, using back-to-back home runs against a legendary pitcher to tie the game. I nearly fell out of the bleachers in excitement. And more than a decade later I’m still thinking and writing about it, while I’ve forgotten about every crappy subway ride I took home after Yankee losses.

So stay through the ninth, folks. You never know…