The magic formula requires some delegates on top of that

Prior to the South Carolina primary, I explored Bernie Sanders' potential path to victory. The scenarios I examined, along with state polls of South Carolina and the rest of the South, were based on the polling data that showed Sanders doing better with African American voters and doing better in the South than he actually did.

Then after South Carolina, I also looked into what it would take for Sanders to remain competitive if Clinton dominated across the rest of the Deep South by as much as she did in South Carolina. The actual results, at least outside of the South, have not been too far off of that. But in the South, Clinton generally exceeded even that.

So to be frank, after Super Tuesday I had pretty much given up any realistic hope that Bernie Sanders could still win the Democratic nomination. Clinton's victories across the South were sweeping and devastating to Sanders’ chances, in terms of delegates. The delegate hole Sanders found himself in seemed like too much for Sanders to realistically overcome.

I was upset by Kos’ declaration that the Democratic primary would be over on March 15 — but not so much by the declaration itself as by his choice of March 15 as the particular cutoff date. Kos is neither stupid nor naive. He knew and still knows full well that by March 15, the entire South will have voted, and this would be Clinton’s high point in any scenario in which Sanders might win the nomination. In any scenario in which Sanders were to win, he would do so by coming back as the more demographically favorable states for him outside of the South voted after March 15.

But even so, I could understand the logic. The delegate math was (and remains) very difficult for Sanders. And given the magnitude of Clinton’s wins in the South on Super Tuesday, and given polls that showed Clinton with large leads in other states like Michigan, it seemed like it would take a miracle for Sanders to be able to come back. Not the sort of thing worth wasting much time over.

So I stopped even bothering to update my model, and reallocated my time to other things. But over the next few days after Super Tuesday, Kansas, Nebraska, and Maine voted — all giving Sanders large victories. That didn't entirely surprise me, although I also wouldn't have been surprised if they were much closer.

Still, those were only caucuses, and by themselves victories like that in caucus states would not be enough for Sanders to come back. I could see the polls out of primary states like Michigan, Illinois, and Ohio, and I figured those would be the nail in the coffin.

And then… Michigan actually voted.

And totally upending all expectations, Sanders won. The polls were not just a little wrong. They were way wrong. Again. Much as they were in South Carolina — but in the opposite direction.

Two things have changed since South Carolina. On the one hand, Clinton has built up a massive lead thanks to her southern landslides. On the other hand, we do now know for sure that Sanders can win strong victories in caucus states, and that he can also compete and win in large industrial states (like Michigan). And we have seen that Sanders can continue to raise the money he needs to compete and win from his small donor army. So while the possibility that Sanders would need a big comeback to win has now become a certainty, he has also shown that he can win the sorts of states he will need to win (Kansas, Maine, Michigan) in order to make a comeback, by margins that at least begin to approach what he would need to make a comeback.

So now it is worthwhile to give the race a second look. Sanders once again has a legitimate shot at winning — even if it is just a very difficult and uphill one. So what, now, would Bernie Sanders' path to victory have to look like?

Democratic primary polls have been consistently wrong in systematic ways

Before we can gauge what sort of path to victory Sanders may still have, we need to figure out what is so much the matter with polling this cycle.

So far, state polls have been consistently wrong in the Democratic primary. But they have not been wrong in an unbiased way (in the statistical sense of the word “unbiased”). It is not just that there has been a large variation between the actual outcome and the outcome predicted by polls. It is that this variation has been systematically predictable and non-random.

The ways in which polls have been wrong in different states can largely be explained by three variables:

South vs. Non-South

Primary vs. Caucus

Turnout

In the charts below, we will compare 2008 turnout and 2016 turnout, and will also compare the actual 2016 results with what the Real Clear Politics polling average predicted. Let's take a look.

Early States

Early states are different from later states because the campaigns put so much effort into them. Campaigns set up shop for months, and candidates campaign extensively in them. Because of this, and because there are so few early states, I won’t try to make generalizations about them. They are what they are.

Pollsters are also more used to polling primaries and caucuses in these states, so they may be more accurate. And because there are more previous competitive primary/caucus elections to look at, likely voter models are more likely to be accurate. But even with that, polls were not all that accurate in the early states — especially in New Hampshire and South Carolina.

To get a better idea of what the race will look like going forward, we need to concentrate primarily on states that voted on Super Tuesday or later.

Southern States

In southern states, turnout was way down and Clinton’s margins systematically outperformed what polls were predicting.

Why did this happen? In general, it is because polls were assuming substantially higher turnout in the South than there actually was. Turnout was way down across the South. African American turnout was generally down because Obama was not on the ballot. But White turnout was down by even more, for several reasons. As a result, low turnout across the South was a perfect storm for Clinton. It meant that African Americans made up a higher share of the vote, and it meant that the African Americans who actually did vote were disproportionately older and much more demographically favorable for Clinton than polls had assumed. So not only was African American vote share higher than predicted because White turnout had dropped by so much, but the African Americans who did vote also backed Clinton at even higher rates than predicted.

I explain this in greater detail here.

There are a few states that have special circumstances which deserve special mention:

Louisiana — In Louisiana, turnout was only down by 19%, which is much less than in Deep South states. Why? Because the Republican primary was closed. There remains high Democratic registration of Conservadems, and those voters could only vote in the Democratic primary. A precinct results map shows that, as in Oklahoma, Bernie Sanders won those voters. That’s why Clinton didn't win by as much as in states like Mississippi. The difference between Oklahoma and Louisiana is that Louisiana has a much higher African American population.

Oklahoma — As in Louisiana, the Republican primary was closed. That meant that registered Democrats and Independents could not go vote for (or in some cases against) Trump. In addition, the Democratic primary was open. That meant that the only primary registered Independents could vote in was the Democratic primary. And they voted for Sanders. As a result of these factors, Sanders outperformed the polls in Oklahoma and won by a solid margin.

Virginia — Virginia is a bit of a special case because of Northern Virginia, which is not so "southern.” Although turnout was down and Clinton did overperform the polls, this did not happen to the same extent as in the Deep South. North Carolina may turn out to be similar.

On average in southern states, turnout has been down 32%, and Clinton's margin exceeded what polls predicted by 7.8 points.

If you exclude Oklahoma, Louisiana, and Virginia, turnout was down 37% in the South, and Clinton’s margin exceeded what polls predicted by 11.0 points.

In the Deep South (South Carolina, Georgia, Alabama, Mississippi, and Louisiana), turnout has been down 31% and Clinton's margin exceeded what polls predicted by 15.5 points.

Non-Southern Caucuses

But as soon as we move outside of the South, the factors that created that perfect storm for Clinton are absent. Outside of the South, White voters are not abandoning the Democratic party. And it also seems that African American voters outside the South may be more receptive to Bernie Sanders than were African Americans in the South, but it remains uncertain to what extent that is true.

And unlike in southern states, turnout has not gone down in non-southern caucus states. On average, it has been about the same. It is not entirely coincidental that turnout has held up in these caucus states and that Sanders has done very well in all of them.

I suspected before that polls in caucus states (except for in early states, which are a different animal) were absolute junk. It is now very much clear that they were in fact precisely that. So unless a poll comes along, from a very high quality pollster with a methodology that is clearly designed to effectively gauge the caucus electorate — something like the Selzer Iowa poll — we would all be well advised to entirely ignore any and all polls that come out for caucus states. Based on previous results, any such polls seem likely to be way off.

The only thing we can say with confidence based on the results in these five states is that Bernie Sanders is likely to do very well in Non-Southern caucus states. Just how well he will do in them remains to be seen.

On average in non-southern caucus states, turnout has been down by only 2%, and Sanders’ margin has exceeded what polls predicted by 47.0 points.

Non-Southern Primaries

Now, you may think that the difference between North and South is just a difference between caucus states and primary states. But it hasn’t just been in non-Southern caucus states that polls have underestimated Bernie Sanders. It has been in non-Southern states in general, including states with primaries.

There are not many states to work with here. In Massachusetts, turnout was barely down from 2008, and Sanders outperformed the polls by about 5.3 points.

Vermont is a special case because it is Bernie Sanders' home state. It is a bit surprising that turnout was down by more than in Massachusetts — but maybe this is because people thought that it would not be competitive and didn't understand the proportional delegate allocation rules.

In Michigan, there was a very large increase in turnout because there wasn't a real primary in 2008 (it was Clinton vs. “Uncommitted”). That most likely is at least some of the reason why Sanders outperformed the polls. But we don't yet know how much of the polling error is attributable to that, and how much may be repeated in other large industrial rust belt states like Ohio and Pennsylvania.

On average, including Michigan, turnout was up by 28% in these states, and Sanders overperformed the polls by an average of 10.2 points. If you exclude Michigan, turnout dropped by 9% and Sanders overperformed the polls by 3.9 points. However, if you also include New Hampshire, that rises to 7.2 points.

In order to have any chance of winning, Sanders will also need to outperform polls in other non-southern primary states. Not necessarily by the same amount as he did in Michigan. But by the sorts of amounts that he did in New Hampshire and Massachusetts.

Static benchmarks for a narrow Sanders win

My presumption is that Clinton will win landslide victories (by about 35 points) in both North Carolina and Florida. That is roughly what we should expect based on the demographics of those states and the results in other southern states. There are some reasons to think that Sanders could be more competitive in those states than in the rest of the south:

Firstly, he is actually campaigning in those states.

Secondly, areas like Raleigh-Durham, Charlotte, Miami, and the I-4 Corridor are not as "southern” as other parts of the south.

Thirdly, Florida didn't have a fully recognized primary in 2008, which could mess up likely voter models to the same degree as in Michigan.

Fourthly, in Florida the GOP primary is closed. That means that registered Conservadems in North Florida can only vote in the Democratic primary. Under similar circumstances in Oklahoma and Louisiana, those sorts of voters voted for Sanders.

So if Clinton wins by less than 35 points in Florida and North Carolina, that would make things easier on Sanders in the rest of the remaining states than shown here.

Polls show Clinton way up in Ohio and especially Illinois. Is it really plausible that Clinton has such large leads in states like Illinois and Ohio if Sanders won Michigan? It’s possible. But how much should we trust these polls after Michigan?

The extreme example is the Chicago Tribune poll. Supposedly this was conducted by “Research America.” But if you actually look at their website, it says that they are an intermediary that commissions polls from other firms: “Research!America has been commissioning public opinion polls with leading firms since 1992.” What are the "leading firms” from which they commission polls?

Zogby, Zogby, Zogby, Zogby.

After the polling debacles we have already seen… your guess is as good as mine.

This is roughly what Sanders would need to win the nomination if all remaining states voted today:

On March 15, winning Ohio by a 9 point margin (as he does here) doesn't help Sanders. Winning it narrowly or losing it narrowly would get him pretty much the same number of delegates. That is because the delegates are distributed in a particularly arbitrary way in Ohio. Most Ohio Congressional districts have exactly 4 delegates, and it would be very difficult for either candidate to win 3-1 in any of those districts. So the exact scale of the wins or losses matters more in Missouri, Illinois, Florida, and North Carolina.
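To see why a 4-delegate district is so insensitive to the margin, here is a minimal sketch of a two-candidate proportional split. It assumes a simplified rule (whole-number shares first, any leftover delegate to the larger remainder) and ignores viability thresholds, so it is an illustration rather than the actual party rules. The function name `split_district` and the 5-delegate comparison are hypothetical.

```python
import math

def split_district(delegates, share_a):
    """Split one district's delegates between two candidates.

    Simplified proportional rule (an assumption, not the full party rules):
    each candidate gets the whole-number part of delegates * share, and any
    leftover delegate goes to the larger fractional remainder.
    """
    quota_a = delegates * share_a
    quota_b = delegates * (1.0 - share_a)
    seats_a, seats_b = math.floor(quota_a), math.floor(quota_b)
    if seats_a + seats_b < delegates:  # at most one delegate is left over
        if quota_a - seats_a > quota_b - seats_b:
            seats_a += 1
        else:
            seats_b += 1
    return seats_a, seats_b

# Compare a 4-delegate district with a hypothetical 5-delegate one.
for share in (0.45, 0.55, 0.65):
    print(share, split_district(4, share), split_district(5, share))
```

Under these assumptions a 4-delegate district splits 2-2 for any two-way share between 37.5% and 62.5%, while a 5-delegate district flips from 2-3 to 3-2 right at 50%.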

The chart in general makes two points very clear:

It is not enough for Sanders to just win the western caucus states. He needs to win them big — by the sorts of margins that he won in Kansas and Maine (less so Nebraska). But one can't say that is implausible precisely because he did so well in the caucuses that have already happened, and because there is still time left. If Sanders can get 68% in Kansas on March 5, he can get 70% (or perhaps significantly more) in Washington state on March 26.

Secondly, Sanders needs to win most of the large Midwestern industrial states. While he could afford losing Illinois as long as it is not a blowout, Sanders would need to win later states like Pennsylvania by reasonably large margins. But again, because of the results in Michigan, and because there is still time, that possibility cannot simply be outright dismissed. If Sanders can win by 2 points in Michigan on March 8, he can win by 6 points in Pennsylvania on April 26.

The final thing we should note is that this does require some very big wins for Sanders in a lot of states. In many of these states, there is no real indication that Sanders can achieve those sorts of margins right now, if the elections were held there today. But the elections are not going to be held today. And if there is one thing we should all have learned by now from the repetitive polling errors from New Hampshire to South Carolina to Michigan, it is that the unexpected has frequently been happening. Here is how the numbers would add up cumulatively:

The bar is high enough so that it is clear that Clinton holds a big advantage. But it is not yet out of the realm of possibility that Sanders could clear that bar.

A dynamic scenario in which Sanders wins

But if Sanders wins, he is more likely to do so by continuing to increase his support over time, rather than by maintaining a constant level of support through June. So he could do worse than the static scenario indicates in the earlier states (March 15 in particular) provided that he gained additional support over time and did better than shown in the static scenario in later states (June 7 in particular). What might that look like?

Something like this:

In this scenario, Sanders starts off by doing worse than in the static scenario on March 15. Although Sanders wins Missouri, Clinton wins Illinois by about 10 points and narrowly wins Ohio. Given the results in Michigan, it seems to me that there is a good chance that the March 15 results will indeed be in this general range. But then after every election day, Sanders gains 1 point and Clinton loses 1 point up through West Virginia on May 10.

But if you look at the later states, you can see that this would require Sanders to win some very strong victories in the states that vote in May and June, for example winning California 62.5% to 37.5%. In an election today, that would not be the result in California. So whether that is really plausible can certainly be questioned, and indeed I would question it myself. But the April, May, and June primaries are not held now. They are held in April, May, and June. And what seems implausible today could seem quite plausible then. Much as it didn’t seem plausible that Sanders would win Michigan. Until he won it.

This would require Sanders to do dramatically better than would currently be expected in most of these later states, so it would require a lot of momentum to build up, and probably would require some unexpected events to break in Sanders’ favor.
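The drift rule in this scenario (Sanders gains 1 point after every election day, up through West Virginia on May 10) can be sketched roughly as below. This is an illustration under stated assumptions, not the model behind the charts. The list of election days contains only the dates named in this article (the real calendar has more), the 50-point baseline is hypothetical, and I am assuming the last gain is the one that shows up in the May 10 results.

```python
from datetime import date

# Election days named in this article only; the full calendar has more dates.
ELECTION_DAYS = [
    date(2016, 3, 15), date(2016, 3, 22), date(2016, 3, 26),
    date(2016, 4, 5), date(2016, 4, 9), date(2016, 4, 19),
    date(2016, 4, 26), date(2016, 5, 10), date(2016, 6, 7),
]
DRIFT_ENDS = date(2016, 5, 10)  # West Virginia; assumed last day that reflects a new gain

def drifted_points(baseline_points, vote_day, step=1.0):
    """Sanders' vote share (in points) on vote_day after the per-election-day drift.

    baseline_points is the share he would get if the state voted on March 15.
    Each earlier election day adds `step` points, and the drift stops at DRIFT_ENDS.
    """
    cutoff = min(vote_day, DRIFT_ENDS)
    steps = sum(1 for day in ELECTION_DAYS if day < cutoff)
    return baseline_points + step * steps

# Hypothetical state that would split 50-50 if it voted on March 15.
print(drifted_points(50.0, date(2016, 4, 26)))  # 56.0
print(drifted_points(50.0, date(2016, 6, 7)))   # 57.0
```

Because Clinton loses a point each time Sanders gains one, each step moves the margin by 2 points, which is why a state that looks out of reach in March can look very different by June in this scenario.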

But if Sanders wants to have a better chance of winning, he should not just rely on the possibility of momentum and unexpected events. It is important that he should limit his losses on March 15. If he could manage a result more like the static scenario projects in Illinois, Ohio and Missouri, along with doing better than either of the scenarios project in Florida and North Carolina, that would go a long way towards helping his chances of staging a plausible comeback. In that case, there would be many fewer delegates Sanders would have to make up in the later states — which means he wouldn’t require such strong victories.

In this scenario, Clinton jumps all the way up to a 351 delegate lead after March 15, but then Sanders would gradually eat away at it up until June 7:

The vital role of Puerto Rico and US territories

From all these scenarios, we can see that if Sanders were to win, it would almost certainly be a narrow win at best. That means that delegates from US territories and Puerto Rico would become important in any plausible scenario in which Sanders might win.

Sanders is currently doing very well in the Democrats Abroad primary, where he is on track to win by about 9-4 or 8-5. But Clinton won the American Samoa caucuses 4 delegates to 2, from only 237 votes cast. That's an extraordinarily large number of delegates per person, compared to US states. In order to win, it would help a lot if Sanders could avoid letting that happen again, and could maybe even win some of the other territories where you only need a small number of votes to get a delegate advantage.

But above all else, Puerto Rico has 60 delegates. In 2008, Clinton won Puerto Rico by 38 delegates to 17. Sanders will very much need to avoid a repeat of Obama's performance. But Puerto Rico's debt crisis is just the sort of economic issue that Sanders has focused on for a while, even before the current campaign began, and that could potentially help him in Puerto Rico.

Conclusion

Clinton has a very large delegate lead thanks to her enormous southern landslides, and that makes it most likely that Clinton will win the nomination. Sanders supporters should be realistic about what it would take for him to come back and win. It would take a lot, and Sanders’ path to victory allows no real room for error or missteps. Sanders would need to do very well in the states that vote after March 15 and through June in order to win. That would not be at all easy for him to do. But after Michigan, Kansas, Nebraska, and Maine, the plausibility of Sanders accomplishing this has shot way up. Although he still faces a very steep uphill climb, things look much better for him than they did after Super Tuesday.

One thing is clear. And that is the same thing that has been clear for some time now. That is that if Bernie Sanders were to win, it wouldn’t be by cutting Clinton’s delegate lead on March 15. It remains the case that Clinton is highly likely to expand her delegate lead on March 15, even if Sanders is on course to win the nomination. So Sanders’ task on March 15 is to try to stop Clinton from expanding her delegate lead much further.

If Sanders were to win, he would have to do so by cutting into Clinton’s delegate lead beginning on March 22 (Arizona, Idaho, and Utah), and then on March 26 (Alaska, Hawaii, and Washington State), April 5 (Wisconsin), and April 9 (Wyoming). It is possible Sanders could win all or most of those states, by large margins, which could conceivably change the media narrative heading into the New York primary on April 19.

So if you want fair benchmarks that don’t prejudge the game in favor of Clinton, you shouldn’t be looking just at March 15. You should be looking to March 22, March 26, and beyond, to see by how much Sanders can start cutting Clinton’s delegate lead then.

Even if that happens, it would finally have to be California that puts Sanders over the top on June 7, with some help from Montana, New Jersey, New Mexico, North Dakota, and South Dakota. California alone has enough delegates so that it is possible Sanders could erase a 100 delegate Clinton lead just from a big win in that single state alone.
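As a back-of-the-envelope check on the California claim, here is a minimal sketch that treats the state's pledged delegates as split purely in proportion to the statewide two-way vote. The figure of 475 pledged delegates is my assumption and is not taken from the scenarios above, and the real allocation happens partly by district with rounding, so treat the numbers as approximate. Both function names are hypothetical.

```python
CA_PLEDGED = 475  # assumed pledged delegate count for California

def net_delegates(pledged, winner_share):
    """Approximate net delegate gain for the winner under a pure proportional split."""
    return pledged * (2 * winner_share - 1)

def share_needed(pledged, net_gain):
    """Two-way vote share needed to net `net_gain` delegates from one state."""
    return 0.5 + net_gain / (2 * pledged)

print(net_delegates(CA_PLEDGED, 0.625))       # net gain from a 62.5% to 37.5% win
print(round(share_needed(CA_PLEDGED, 100), 3))  # share needed to erase a 100 delegate lead
```

With those inputs, a 62.5% to 37.5% win nets about 119 delegates, and erasing a 100 delegate lead takes roughly 60.5% of the two-way vote.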

That may not seem like the most likely scenario now. But there are factors that people may not be considering, such as the fact that the California GOP primary is closed, while the Democratic primary is open. That means that there could be an awful lot of independents voting in the California Democratic primary. Especially if the GOP primary has already been wrapped up thanks to the winner-take-all states, and if it is clear that California has the opportunity to decide the Democratic nomination, then turnout in the California Democratic primary could conceivably reach unprecedented heights.

Additionally, June 7 is 3 months away. And in politics, a lot can change in 3 months.

In sum, given Clinton's delegate advantage she is very clearly the favorite, and is obviously most likely to win. But it no longer looks like quite the slam dunk for Clinton that it did on March 2. Sanders’ strong performances in Kansas, Nebraska, and Maine, along with his victory in Michigan, have shown that he still does have a chance. They also exposed weakness in Clinton. It remains to be seen just how far that weakness radiates out from Kansas and Nebraska into the rest of the West, and just how far it extends from Michigan to the rest of the rust belt. Sanders does not have an easy path by any stretch of the imagination, but he does once again have a conceivable path to the Democratic nomination.

This is a continuing part of an ongoing series using polling data, past exit poll data, census data, and other data sources to analyze the 2016 Democratic Primary.


For more detail on how delegates are allocated across different states, check out this excellent resource from Torilahure.