Shanshan Ding is a data scientist at Gilt Groupe and was an Insight Data Science Fellow from the September 2013 cohort. Shanshan has also served as an adjunct instructor in NYU’s data science program. As a huge baseball fan, she provides her thoughts after recently contemplating the losing streak of the Chicago Cubs. This content originally appeared on her personal website with the accompanying code.

The Chicago Cubs have not won a World Series Championship since 1908 and have just crashed out of the 2015 playoffs. As baseball lore goes, the team is cursed, and a World Series victory would be an event of cosmic proportions, possibly signaling the end of times. The curse seems unbreakable. The last time they were this close, the curse struck again in the infamous Steve Bartman incident. It is the longest championship drought of any major North American professional sports team. This raises the question: just how unlikely is a 108-year drought?

For simplicity, let us first assume that in any given year, each team has an equal chance of winning the World Series (WS). From the Cubs’ point of view, the probability of their not winning the WS since 1907 is about 0.37%, i.e. pretty unlikely. However, what is the probability that some (at least one) team hasn’t won the WS since 1907? By my simulation (see note 1), it is about 5.8%, i.e. not probable, but not impossible. This is similar to the Birthday Paradox: if you walk into a room of 30 people, the probability that someone has the same birthday as you is low (~7.6%), but the probability that some pair of people share the same birthday is quite high (~70%).
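The equal-odds model above is easy to re-create. What follows is a minimal sketch, not the original code behind the numbers in this post. It assumes a league of 16 teams through 1960 that grows to 18, 20, 24, 26, 28 and 30 teams in 1961, 1962, 1969, 1977, 1993 and 1998; it counts every year from 1908 to 2015 as a season (including 1994, when no World Series was played); and it tracks only the 16 original franchises, since no expansion team has existed long enough for a drought of this length. The function names are made up for illustration.

```python
import random

# Assumed league size by season: 16 teams through 1960, then expansion.
EXPANSION = [(1998, 30), (1993, 28), (1977, 26), (1969, 24), (1962, 20), (1961, 18)]


def num_teams(year):
    """Number of teams assumed to be competing in a given season."""
    for first_year, n in EXPANSION:
        if year >= first_year:
            return n
    return 16


def single_team_drought(start, end):
    """Exact probability that one given team wins no title in start..end."""
    p = 1.0
    for year in range(start, end + 1):
        p *= 1 - 1 / num_teams(year)
    return p


def simulate_any_drought(start, end, trials, seed=0):
    """Monte Carlo estimate of P(at least one original team wins no title)."""
    rng = random.Random(seed)
    sizes = [num_teams(y) for y in range(start, end + 1)]
    hits = 0
    for _ in range(trials):
        winners = set()
        for n in sizes:
            w = rng.randrange(n)  # teams 0..15 stand for the original franchises
            if w < 16:
                winners.add(w)
        if len(winners) < 16:  # some original team never won
            hits += 1
    return hits / trials
```

Under these assumptions, `single_team_drought(1908, 2015)` comes out at roughly 0.0037, and `simulate_any_drought(1908, 2015, trials=20000)` should land near 0.058, give or take sampling noise.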

While we are on the topic, the probability of at least 2 teams not winning the WS from 1919 to 2003, as was the case with the Cubs and the Red Sox, is a little over 1%. Not something anyone would have predicted in 1918, but had the two met in the 2003 WS (both teams advanced to their respective league’s Championship Series that year), the world likely would not have ended either.

We should remember, though, that while the Cubs and the Red Sox received all the attention for their droughts, there was a third team that did not win a championship between 1919 and 2003: the Chicago White Sox, who, before winning the WS in 2005, last won in 1917. Together, this trio of futility does push the limits of credulity, as simulation shows that the probability of at least three teams being concurrently trapped in 85-year droughts is less than 0.04%.
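Probabilities this small take a great many simulated histories to pin down, so an exact calculation makes a useful cross-check. The sketch below swaps the simulation for the generalized inclusion-exclusion formula for "at least m events occur". It is an illustration under assumptions, not the method used for the figures above: league sizes are taken to be 16 teams through 1960, then 18, 20, 24, 26, 28 and 30 from 1961, 1962, 1969, 1977, 1993 and 1998; every year in the window counts as a season; and only the 16 original franchises are eligible for the drought. All function names are hypothetical.

```python
from math import comb


def num_teams(year):
    """Assumed league size by season (16 teams through 1960, then expansion)."""
    schedule = [(1998, 30), (1993, 28), (1977, 26), (1969, 24), (1962, 20), (1961, 18)]
    for first_year, n in schedule:
        if year >= first_year:
            return n
    return 16


def all_winless(k, start, end):
    """P(k specific teams all win nothing in start..end) under equal odds."""
    p = 1.0
    for year in range(start, end + 1):
        p *= 1 - k / num_teams(year)
    return p


def at_least(m, start, end, originals=16):
    """P(at least m of the original teams win nothing in start..end)."""
    total = 0.0
    for k in range(m, originals + 1):
        # S_k: sum over all k-subsets of teams; equal by symmetry.
        s_k = comb(originals, k) * all_winless(k, start, end)
        total += (-1) ** (k - m) * comb(k - 1, m - 1) * s_k
    return total
```

With these assumptions, `at_least(2, 1919, 2003)` is roughly 0.011 and `at_least(3, 1919, 2003)` is roughly 0.0003, in line with the simulated figures quoted above.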

By the way, as a die-hard Cleveland Indians fan, I have to ask: what is the probability that, conditional on the Cubs’ drought, some other team hasn’t won the WS since 1948, as is the case with the Indians? It is actually about 52%! So sure, the Indians have been very unlucky, but it’s almost as if a flip of a coin decided that some team has to be that unlucky, and that team just happens to be the Indians (well, that, and decades of mismanagement).
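Conditioning on the Cubs’ drought is simple in the equal-odds model: because seasons are independent, it amounts to drawing each year’s champion uniformly from the other teams. The sketch below is one way to estimate the figure, under assumptions that are mine rather than stated above: "since 1948" is read as the seasons 1949 through 2015, league sizes follow the usual expansion years (1961, 1962, 1969, 1977, 1993, 1998), and only the 15 other original franchises are candidates for the second drought. The function names are illustrative.

```python
import random


def num_teams(year):
    """Assumed league size by season (16 teams through 1960, then expansion)."""
    schedule = [(1998, 30), (1993, 28), (1977, 26), (1969, 24), (1962, 20), (1961, 18)]
    for first_year, n in schedule:
        if year >= first_year:
            return n
    return 16


def simulate_conditional_drought(start, end, trials, seed=0):
    """Estimate P(some other original team wins nothing | the Cubs never win)."""
    rng = random.Random(seed)
    # One fewer candidate each year: the Cubs are ruled out by the conditioning.
    sizes = [num_teams(y) - 1 for y in range(start, end + 1)]
    hits = 0
    for _ in range(trials):
        winners = set()
        for n in sizes:
            w = rng.randrange(n)  # teams 0..14 stand for the other original franchises
            if w < 15:
                winners.add(w)
        if len(winners) < 15:  # at least one of them never won
            hits += 1
    return hits / trials
```

A call such as `simulate_conditional_drought(1949, 2015, trials=20000)` should give an estimate close to one half, consistent with the coin-flip reading above.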

Before we leave this discussion, I want to address what happens if, as is the case in reality, not all teams have an equal chance of winning the WS. Major League Baseball originally consisted of 16 teams, and more teams were gradually added starting in 1961 to make up today’s 30. It is fairly clear that the probability of a 108-year drought decreases if the expansion teams have had less than uniform odds of winning the WS. (This is a reasonable assumption, as a new franchise needs time to build a contending team and a fanbase; I counted only 9 of the 54 WS since 1960 being won by expansion teams.) On the other hand, if we fix the winning odds of the expansion teams, then I believe that the probability of a drought is minimized if the 16 original teams are always equally likely to win the WS. I do not have a rigorous proof for this, but the extreme example where one team always has zero chance of winning suggests that the probability of a drought increases as the distribution of winning odds shifts away from uniformity (over the original teams). Thus assigning zero chance of winning to the expansion teams and a 1/16 chance of winning to each original team every year yields a lower bound on the probability in question. Simulation puts the probability of at least one 108-year drought at about 1.5% in this model.
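In this 16-team model the simulated figure can also be checked in closed form, by inclusion-exclusion over the set of teams that never win. A sketch of that calculation, assuming 108 independent seasons in which each of the 16 original teams wins with probability 1/16 (terms beyond the second are negligible):

```latex
P(\text{at least one 108-year drought})
  = \sum_{k=1}^{15} (-1)^{k+1} \binom{16}{k} \left(1 - \frac{k}{16}\right)^{108}
  \approx 16 \left(\frac{15}{16}\right)^{108} - 120 \left(\frac{14}{16}\right)^{108}
  \approx 0.0150 - 0.0001
  \approx 0.015
```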

While the painful losing streak continues at least another year for Cubs fans, they should take some solace in the fact that the years of futility are not really all that mathematically remarkable in and of themselves. As the saying goes, there is always next year, and don’t worry, the world will not end if the Cubs win the 2016 World Series.

1 One might ask why, if we know p, one team’s probability of not winning the WS since 1907, we cannot simply compute the probability of some team not winning the WS since 1907 with the formula 1−(1−p)^(numTeams). The reason is that teams’ records are not independent. To be fair, the dependence is very weak and the aforementioned formula does yield about 6% as well.
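As a quick check of that approximation, take p ≈ 0.0037 from the main text and let numTeams be the 16 original franchises (an assumption; the note does not say which team count is meant):

```latex
1 - (1 - p)^{\text{numTeams}} \approx 1 - (1 - 0.0037)^{16} \approx 0.058
```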