For the past week or so there has been non-stop talk in the press about the “Warren surge.” There’s a pervasive narrative that Elizabeth Warren’s star is rising while Bernie Sanders is all but done for. With more than a little schadenfreude, a Fox News headline announced “Warren surges to tie Biden for 2020 Dem nomination lead, as Sanders sinks to distant 3rd in latest polls.” A Sept. 30 Politico story proclaimed “Bernie Sanders Is in Trouble.”

It’s pretty clear that Warren has gained some ground in the last month, but most polls show Biden maintaining a double-digit lead while Warren and Sanders are within the margin of error of one another. There are outliers, however, in which Warren appears to “surge” ahead of both Biden and Sanders.

What are we to make of it when one poll has Warren “surging” into the high 20s, and another comes out the next day showing her at 14 points? Did she get really popular overnight and then blow it?

Probably not.

Elizabeth Warren (Gage Skidmore / Flickr / CC-BY-SA)

A much more plausible explanation can be found in the way each poll is conducted. If one looks at polls where Warren “surges,” a pattern starts to emerge. The polls that show the biggest gap between Warren and Sanders either (1) have an extremely small sample or (2) are polls of “likely voters.” Often it’s both.

Pollsters construct a sample using either likely voters, registered voters or all adults. The first has become increasingly common. There are lots of ways to determine if a person is a “likely voter.” Sometimes it’s as simple as asking the question “Do you plan to vote?”

However, pollsters argue that this isn’t a reliable predictor of who will ultimately go to the polls since, according to Gallup, 90 percent of registered voters will respond in the affirmative. Sometimes a battery of questions will be used to determine how engaged or excited a person is, but if survey-takers are already subjecting strangers to a lengthy, comprehensive questionnaire, this approach isn’t always practical.

More often than not, pollsters rely on predictive demographics, which is to say they target specific groups based on how reliably those groups have voted in the past.

A poll of “likely voters” will oversample white people, the college educated, wealthy people and people over the age of 50. In other words, the kind of people who support Elizabeth Warren.

According to Pew, 70 percent of Warren’s supporters are white, 58 percent are college grads, and 80 percent are over the age of 30.

By contrast, Sanders’s voters are more diverse (49 percent white), less likely to have a college degree (33 percent are degree-holders) and younger (66 percent are over 30).

Bernie speaks at a high school in Iowa (Phil Roeder / Flickr / CC-BY)

These are the people who “don’t vote,” at least according to the conventional wisdom of the pollsters.

Two September polls from HarrisX—one of likely voters and the other of registered voters—produced radically different results though they were released within days of each other. The former had Warren leading Sanders 19 points to 13. The latter had Sanders narrowly leading Warren 16 to 15, with Biden at 30 points instead of 35.

The difference? In the poll of registered voters, only 47 percent were over the age of 50, whereas 54 percent of “likely voters” were over 50. It might seem insignificant, but undercounting Sanders’ strongest demographic translates to roughly a 5-point disadvantage.

And this is predictably magnified in smaller samples. The polls that produced much of the talk about the “Warren surge” included questions about the Democratic primary as a subset of a larger questionnaire about a range of issues presented to voters of all political affiliations, so the sample was half that of a standard poll.

For example, the Sept. 23 Monmouth University Poll, which showed Warren at 28 percent—leading both Sanders (+13) and Biden (+3)—had a sample size of only 430.

Its sample was also slightly older and considerably whiter (66 percent vs. 57 percent) than those of the HarrisX polls.

The reduced precision that comes with polling significantly fewer people, combined with a sample skewed to more closely match Warren’s base, likely accounts for this illusory “surge.”
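To put a rough number on that loss of precision, here is a minimal sketch of the textbook margin-of-error formula for a proportion. It assumes a simple random sample, a 95 percent confidence level and the worst case of a 50 percent share. The function name `margin_of_error` and the 1,000-person comparison sample are illustrative assumptions, not figures published by any pollster.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error, in percentage points, for a share p
    estimated from a simple random sample of n respondents (assumed model)."""
    return 100 * z * math.sqrt(p * (1 - p) / n)

# 430 is the Monmouth subsample cited above; 1,000 is an assumed comparison size.
moe_small = margin_of_error(430)
moe_large = margin_of_error(1000)

print(f"n=430:  +/- {moe_small:.1f} points")
print(f"n=1000: +/- {moe_large:.1f} points")
```

Under these assumptions, a 430-person sample carries a margin of roughly plus or minus 4.7 points, against about 3.1 points for 1,000 respondents. Real polls are weighted, which typically widens the margin further, so these figures are best read as a floor rather than as the pollster's reported error.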

Of course, none of this is to argue that the polls are somehow intentionally “rigged” in Warren’s favor. There’s a certain logic in trying to make polls a better, more scientific instrument by trying to anticipate who actually will vote. But the current method rests on the faulty assumption that the electorate will remain relatively fixed from one election to another.

Just because youth voter turnout has tended to be low historically doesn’t mean young folks won’t come out for the right candidate. They came out for Obama in the 2008 primary and even more of them came out for Bernie in 2016.

And even if the political prognosticators are sincere in their efforts to provide an accurate picture of what the American people want, polling is an inexact science and they can get it spectacularly wrong—as they did in 2016. Some had Clinton’s chances of winning at between 70 and 99 percent. One of several possible explanations for why they missed the mark so badly was the flawed way pollsters identify “likely voters.”

Bad polling can create a false sense of security that an awful candidate won’t win or it can unfairly handicap an otherwise worthy candidate. Polls can turn into self-fulfilling prophecies—especially in the hands of bad actors in the media.

Pro-Sanders writer Doug J. Hatlem quantitatively analyzed coverage of polls and found that when Bernie did poorly, it was covered nearly six times as much by major media as when he performed well. For instance, when Sanders was at 14 points in a Quinnipiac poll, it garnered 47 news stories. By contrast, a poll showing him at 23 points got no coverage at all.

A selective reading of polls can be used in an attempt to effectively write Sanders out of the race by making it seem like his campaign is on the ropes.

To quote the Politico story referenced earlier:

A problematic narrative [is] hardening around him: His campaign is in disarray and Elizabeth Warren has eclipsed him as the progressive standard-bearer of the primary. He’s sunk to third place nationally, behind Warren and Joe Biden, and some polls of early nomination states show him barely clinging to double digits.

The author describes this “problematic narrative” as if it’s taking shape on its own instead of being constructed in matter-of-fact editorializing statements like this.

Writing that “some” state polls have Sanders “barely clinging to double digits” is misleading to the point of journalistic malpractice. True, he’s polling poorly in Southern states like South Carolina, where Warren (12 pts.) could also be described as “barely clinging to double digits.” However, he’s polling strongly in key swing states like Ohio, North Carolina and Nevada.

Sanders is described as having “sunk” to third nationally, but the polls don’t clearly show this.

The hypothesis of a “Warren surge” coupled with a Sanders slump suffers from what social scientists call a replication crisis, which occurs when a phenomenon shows up in low-power studies (i.e., those with small samples) but disappears when the power is raised.

If we only look at polls with sample sizes greater than 1,000, Biden still leads the field by 14 points, while Warren and Sanders are tied for second, with 16 points each.

And given that standard methods for identifying “likely voters” have the potential to bias samples in terms of age and other demographics, limiting the scope to only samples of adults and registered voters changes the picture further, significantly reducing Biden’s lead while putting Sanders in second place.

The narrative about Sanders’ decline is contradicted by data from Morning Consult, which has consistently had him north of 19 points since January. This is likely due to the fact that the pollster shuns conventional methods for predicting “likely voters” in favor of a simple yes or no question. Furthermore, raising $25.3 million in three months from extremely small donors is usually not considered a sign that a candidate is “struggling” or in “big trouble.”

Since Aug. 2, Sanders’ donor numbers have grown from 746,000 to 1.05 million, an increase of 40 percent, while Warren’s grew from 421,000 to 509,000 in the same time span. She raised $1 million less than Sanders from half as many people.
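The growth figures above can be checked with a few lines of arithmetic. This sketch uses only the donor counts quoted in the preceding paragraph; the helper name `percent_growth` is a hypothetical one chosen for illustration.

```python
def percent_growth(start, end):
    """Percentage increase from start to end."""
    return 100 * (end - start) / start

# Donor counts as quoted in the article (Aug. 2 vs. the latest figure).
sanders_growth = percent_growth(746_000, 1_050_000)
warren_growth = percent_growth(421_000, 509_000)

# Warren's donor count as a fraction of Sanders' latest count.
donor_ratio = 509_000 / 1_050_000

print(f"Sanders donor growth: {sanders_growth:.1f}%")
print(f"Warren donor growth:  {warren_growth:.1f}%")
print(f"Warren donors / Sanders donors: {donor_ratio:.2f}")
```

On these numbers, Sanders' donor base grew by about 41 percent and Warren's by about 21 percent, and Warren's donor count comes to a little under half of Sanders'.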

When taken as a whole, the polling data show Warren and Sanders in a statistical tie for second. At most, Warren may lead Sanders by a point or two, but other indicators show Sanders has more staying power. His younger base is more apt to pound the pavement and phone bank. His generally less wealthy donors are willing to give what little money they have.

But most importantly, he has loyalty and enthusiasm. According to Pew, a mere 19 percent of Warren supporters said that they were “only enthusiastic about their first choice” (by far the lowest among the top contenders in the Democratic field), whereas 51 percent of Sanders supporters said they were ride or die for Bernie.

2016 should have been a lesson to voters and the press not to put too much stock in polls. Pollsters discovered that the way they determine “likely voters” is flawed, and in more ways than one, Bernie Sanders is redefining what’s “likely.”